noether.modeling.models.transformer
Classes
TransformerConfig | Configuration for a Transformer model.
Transformer | Implementation of a Transformer model.
Module Contents
- class noether.modeling.models.transformer.TransformerConfig(/, **data)
Bases: noether.core.models.base.ModelBaseConfig, noether.core.schemas.mixins.InjectSharedFieldFromParentMixin
Configuration for a Transformer model.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- model_config
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
Hidden dimension of the model. Used for all transformer blocks.
- transformer_block_config: Annotated[noether.modeling.modules.blocks.transformer.TransformerBlockConfig, noether.core.schemas.mixins.Shared]
- class noether.modeling.models.transformer.Transformer(config)
Bases: torch.nn.Module
Implementation of a Transformer model.
- Parameters:
config (TransformerConfig) – Configuration of the Transformer model.
- blocks
- forward(x, attn_kwargs, condition=None)
Forward pass of the Transformer model.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, seq_len, hidden_dim).
attn_kwargs (dict[str, torch.Tensor]) – Additional arguments for the attention mechanism.
condition (torch.Tensor | None) – Optional conditioning vector of shape (batch_size, condition_dim) consumed by each block’s AdaLN-Zero modulation. Pass None (the default) for unconditioned models.
- Returns:
Output tensor after processing through the Transformer model.
- Return type:
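Example (illustrative sketch, not the noether implementation): the shape contract of forward and the role of condition can be pictured with a dependency-free stand-in that uses nested lists in place of tensors. It assumes the usual AdaLN-Zero formulation, in which one projection of the condition yields a per-sample shift, scale and gate, and the projection starts at zero so that each block initially acts as the identity. The names adaln_zero_block, layer_norm, linear and double are hypothetical helpers introduced for this sketch only; how the library's blocks actually apply the modulation may differ.

```python
# Hypothetical stand-in for AdaLN-Zero conditioning on plain nested lists.
# None of these helpers are part of noether; they only illustrate the shapes.

def layer_norm(vec, eps=1e-6):
    """Normalize one token vector to zero mean and unit variance (no learned affine)."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / (var + eps) ** 0.5 for v in vec]


def linear(vec, weight, bias):
    """Compute W @ vec + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def adaln_zero_block(x, condition, weight, bias, sublayer):
    """x: (batch_size, seq_len, hidden_dim); condition: (batch_size, condition_dim)."""
    out = []
    for tokens, cond in zip(x, condition):
        hidden_dim = len(tokens[0])
        # One projection of the condition gives shift, scale and gate for the sample.
        mod = linear(cond, weight, bias)
        shift = mod[:hidden_dim]
        scale = mod[hidden_dim:2 * hidden_dim]
        gate = mod[2 * hidden_dim:]
        sample = []
        for tok in tokens:
            # The same modulation is reused for every position in the sequence.
            normed = layer_norm(tok)
            modulated = [n * (1.0 + s) + sh for n, s, sh in zip(normed, scale, shift)]
            update = sublayer(modulated)
            # Gated residual connection.
            sample.append([t + g * u for t, g, u in zip(tok, gate, update)])
        out.append(sample)
    return out


def double(vec):
    """Placeholder for the attention or MLP sublayer."""
    return [2.0 * v for v in vec]


batch_size, seq_len, hidden_dim, condition_dim = 2, 3, 4, 5
x = [[[float(b + s + h) for h in range(hidden_dim)] for s in range(seq_len)]
     for b in range(batch_size)]
condition = [[1.0] * condition_dim for _ in range(batch_size)]

# "Zero" initialization: weights and biases of the projection start at zero,
# so the gate is zero and the block returns its input unchanged.
zero_weight = [[0.0] * condition_dim for _ in range(3 * hidden_dim)]
zero_bias = [0.0] * (3 * hidden_dim)
y = adaln_zero_block(x, condition, zero_weight, zero_bias, sublayer=double)

# Opening the gate (bias of 1.0 on the gate slice) lets the sublayer contribute.
open_bias = [0.0] * (2 * hidden_dim) + [1.0] * hidden_dim
y_open = adaln_zero_block(x, condition, zero_weight, open_bias, sublayer=double)
```

In both calls the output keeps the (batch_size, seq_len, hidden_dim) layout of x, and the condition vector is applied once per sample across all sequence positions.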