noether.core.schemas.modules.attention¶
Base attention configs and back-compat re-exports for moved attention configs.
The base configs (AttentionConfig, TokenSpec, AttentionPattern) have no
matching class and stay here. The concrete attention configs have moved next
to their matching classes in noether.modeling.modules.attention; they
are re-exported here for backward compatibility.
Classes¶
AttentionConfig | Configuration for an attention module.
TokenSpec | Specification for a token type in the attention mechanism.
AttentionPattern | Defines which tokens attend to which other tokens.
Module Contents¶
- class noether.core.schemas.modules.attention.AttentionConfig(/, **data)¶
Bases: pydantic.BaseModel
Configuration for an attention module. Because many different attention implementations exist, extra fields are allowed so that the same schema can be used for all attention modules.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- model_config¶
Configuration for an attention module.
Dimensionality of the hidden features.
- init_weights: noether.core.types.InitWeightsMode = None¶
Weight initialization strategy.
- qk_norm: bool = None¶
Whether to apply layer normalization to the query and key features before computing attention scores.
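The "extra fields are allowed" behaviour described for AttentionConfig can be sketched with a small stand-in model. This is a minimal illustration, assuming pydantic v2 and assuming the schema relies on pydantic's extra="allow" setting; SketchAttentionConfig, the hidden_dim field name, and the num_heads extra are hypothetical and not taken from noether.

```python
from pydantic import BaseModel, ConfigDict


class SketchAttentionConfig(BaseModel):
    """Hypothetical stand-in; not the real noether AttentionConfig."""

    # extra="allow" keeps undeclared keyword arguments instead of rejecting them.
    model_config = ConfigDict(extra="allow")

    hidden_dim: int  # assumed field name, for illustration only
    qk_norm: bool = False


# num_heads is not declared on the model, yet it is accepted and stored.
cfg = SketchAttentionConfig(hidden_dim=256, qk_norm=True, num_heads=8)
extra_fields = cfg.model_extra  # dict holding only the undeclared fields
```

Declared fields are still validated as usual, so an implementation-specific option can ride along without weakening the checks on the shared fields.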
- class noether.core.schemas.modules.attention.TokenSpec(/, **data)¶
Bases: pydantic.BaseModel
Specification for a token type in the attention mechanism.
When size is None, the token group is not present in the input tensor and its key/value representations are loaded from a KV cache instead.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- classmethod from_dict(token_dict)¶
Create TokenSpec from dictionary with single key-value pair.
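One plausible reading of from_dict is that the single key is the token name and the value is its size. The sketch below illustrates that reading with a plain dataclass stand-in; SketchTokenSpec, the name field, and the uses_kv_cache helper are assumptions made for illustration, and the real TokenSpec is a pydantic model whose fields may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchTokenSpec:
    """Hypothetical stand-in; not the real noether TokenSpec."""

    name: str
    size: Optional[int] = None  # None: group absent from the input tensor

    @classmethod
    def from_dict(cls, token_dict):
        # Accept exactly one {name: size} pair, as the docstring describes.
        if len(token_dict) != 1:
            raise ValueError("expected a dictionary with a single key-value pair")
        ((name, size),) = token_dict.items()
        return cls(name=name, size=size)

    @property
    def uses_kv_cache(self):
        # With no size, keys/values would come from a KV cache instead.
        return self.size is None


surface = SketchTokenSpec.from_dict({"surface": 128})
cached = SketchTokenSpec.from_dict({"anchor": None})
```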
- class noether.core.schemas.modules.attention.AttentionPattern(/, **data)¶
Bases: pydantic.BaseModel
Defines which tokens attend to which other tokens.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- query_tokens: collections.abc.Sequence[str]¶
- key_value_tokens: collections.abc.Sequence[str]¶
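To make query_tokens and key_value_tokens concrete, the sketch below expands a list of patterns into a boolean attention mask over a concatenated token sequence. How noether consumes AttentionPattern is not stated here, so this is only one possible interpretation; SketchPattern, build_mask, and the group names are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SketchPattern:
    """Hypothetical stand-in; not the real noether AttentionPattern."""

    query_tokens: Sequence[str]
    key_value_tokens: Sequence[str]


def build_mask(token_sizes, patterns):
    """Return mask[i][j] == True where query position i may attend to key j."""
    # Map each group name to its index range in the concatenated sequence.
    spans, start = {}, 0
    for name, size in token_sizes.items():
        spans[name] = range(start, start + size)
        start += size

    mask = [[False] * start for _ in range(start)]
    for pattern in patterns:
        for q_name in pattern.query_tokens:
            for kv_name in pattern.key_value_tokens:
                for i in spans[q_name]:
                    for j in spans[kv_name]:
                        mask[i][j] = True
    return mask


# Group "a" (2 tokens) attends to "a" and "b"; group "b" (1 token) only to itself.
mask = build_mask(
    {"a": 2, "b": 1},
    [SketchPattern(["a"], ["a", "b"]), SketchPattern(["b"], ["b"])],
)
```

Positions not covered by any pattern stay False, so a group that appears in no query_tokens list attends to nothing under this reading.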