noether.core.schemas.modules¶

Back-compat re-exports for noether.core.schemas.modules.

Module configs have moved next to their matching classes in noether.modeling.modules. Base configs without a matching class (AttentionConfig, AttentionPattern, TokenSpec) stay in the `.attention` submodule.

Concrete attention configs are loaded lazily via PEP 562 to avoid circular imports between the schema package and the modeling modules that depend on AttentionConfig.
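The lazy-loading mechanism mentioned above can be illustrated with a minimal sketch. It is not the package's actual code: `demo_schemas`, `DemoAttentionConfig`, and `_load_concrete_config` are hypothetical names, and the module is built in memory so the example is self-contained. It only shows how a PEP 562 module-level `__getattr__` defers work until a name is first accessed.

```python
import sys
import types

# Hypothetical stand-in for a schema package; the real noether layout may differ.
pkg = types.ModuleType("demo_schemas")

load_log = []  # records how many times the loader actually runs


def _load_concrete_config():
    # A real package would import the defining module here (for example with
    # importlib.import_module); this sketch just builds a placeholder class.
    load_log.append("loaded")
    return type("DemoAttentionConfig", (), {})


# Public name -> loader; nothing is loaded until the name is first accessed.
_LAZY = {"DemoAttentionConfig": _load_concrete_config}


def _module_getattr(name):
    # PEP 562: called only when normal attribute lookup on the module fails.
    if name in _LAZY:
        value = _LAZY[name]()
        setattr(pkg, name, value)  # cache so later lookups bypass this hook
        return value
    raise AttributeError(f"module {pkg.__name__!r} has no attribute {name!r}")


pkg.__getattr__ = _module_getattr
sys.modules[pkg.__name__] = pkg

import demo_schemas

calls_before = len(load_log)
first = demo_schemas.DemoAttentionConfig   # triggers the loader
second = demo_schemas.DemoAttentionConfig  # served from the module namespace
```

In an on-disk package the hook would normally be a plain `def __getattr__(name)` in `__init__.py`. Because the import happens inside the hook rather than at package import time, the schema package and the modules that depend on it can be imported in either order.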

Submodules¶

Classes¶

`AttentionConfig`	Configuration for an attention module.
`AttentionPattern`	Defines which tokens attend to which other tokens.
`TokenSpec`	Specification for a token type in the attention mechanism.

Package Contents¶

class noether.core.schemas.modules.AttentionConfig(/, **data)¶

Bases: pydantic.BaseModel

Configuration for an attention module. Because there can be many different attention implementations, extra fields are allowed so that the same schema can be used for all attention modules.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters: data (Any)

model_config¶: Configuration for an attention module.

hidden_dim: int = None¶: Dimensionality of the hidden features.

num_heads: int = None¶: Number of attention heads.

use_rope: bool = None¶: Whether to use Rotary Positional Embeddings (RoPE).

dropout: float = None¶: Dropout rate for the attention weights and output projection.

init_weights: noether.core.types.InitWeightsMode = None¶: Weight initialization strategy.

bias: bool = None¶: Whether to use bias terms in linear layers.

head_dim: int | None = None¶: Dimensionality of each attention head.

qk_norm: bool = None¶: Whether to apply layer normalization to the query and key features before computing attention scores.

validate_hidden_dim_and_num_heads()¶
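As a rough illustration of the "extra fields allowed" behaviour described for this class, the sketch below defines a hypothetical stand-in model, `AttentionConfigSketch`, with only two of the documented fields. It assumes pydantic v2 and that extra fields are enabled through `ConfigDict(extra="allow")`; the real class may be configured differently, and `window_size` is an invented implementation-specific option.

```python
from pydantic import BaseModel, ConfigDict, ValidationError


class AttentionConfigSketch(BaseModel):
    # Assumption: extras are enabled this way; not confirmed by the source.
    model_config = ConfigDict(extra="allow")

    hidden_dim: int
    num_heads: int


# "window_size" is not a declared field, but it is kept rather than rejected.
cfg = AttentionConfigSketch(hidden_dim=256, num_heads=8, window_size=64)
extras = cfg.model_extra

# Declared fields are still validated against their annotated types.
try:
    AttentionConfigSketch(hidden_dim="not a number", num_heads=8)
    rejected = False
except ValidationError:
    rejected = True
```

Undeclared keys end up in `model_extra` and are also readable as attributes, which is what lets one schema carry options for several attention implementations.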

class noether.core.schemas.modules.AttentionPattern(/, **data)¶

Bases: pydantic.BaseModel

Defines which tokens attend to which other tokens.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters: data (Any)

query_tokens: collections.abc.Sequence[str]¶

key_value_tokens: collections.abc.Sequence[str]¶
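One way to read a pattern of this shape is as a block mask over token groups. The sketch below is purely illustrative and makes several assumptions not stated in the source: the helper `build_mask` is hypothetical, token groups are laid out contiguously in the order given by `token_sizes`, and a position may attend to another only when its group is listed in `query_tokens` and the other's group is listed in `key_value_tokens`.

```python
from typing import Dict, List, Sequence


def build_mask(
    token_sizes: Dict[str, int],
    query_tokens: Sequence[str],
    key_value_tokens: Sequence[str],
) -> List[List[bool]]:
    """Return mask[row][col] == True where position `row` may attend to `col`."""
    # Expand group sizes into one group label per token position.
    labels = [name for name, size in token_sizes.items() for _ in range(size)]
    queries = set(query_tokens)
    keys_values = set(key_value_tokens)
    return [
        [(row in queries) and (col in keys_values) for col in labels]
        for row in labels
    ]


# Two anchor tokens followed by one query token; queries attend to anchors only.
mask = build_mask(
    {"surface_anchors": 2, "surface_queries": 1},
    query_tokens=["surface_queries"],
    key_value_tokens=["surface_anchors"],
)
```

Here only the last row (the single `surface_queries` position) contains `True` entries, and only in the two `surface_anchors` columns.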

class noether.core.schemas.modules.TokenSpec(/, **data)¶

Bases: pydantic.BaseModel

Specification for a token type in the attention mechanism.

When size is None, the token group is not present in the input tensor and its key/value representations will be loaded from a KV cache instead.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters: data (Any)

name: str¶

size: int | None = None¶

classmethod from_dict(token_dict)¶

Create TokenSpec from dictionary with single key-value pair.

Parameters: token_dict (dict[str, int | None])
Return type: TokenSpec

to_dict()¶

Convert TokenSpec to dictionary.

Return type: dict[str, int | None]
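The single-pair dictionary round trip can be sketched with a small stand-in. `TokenSpecSketch` is a hypothetical dataclass, not the pydantic model itself, and the error raised for dictionaries with more than one entry is an assumption about how such input might be handled.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TokenSpecSketch:
    name: str
    size: Optional[int] = None  # None: key/values come from a KV cache

    @classmethod
    def from_dict(cls, token_dict: Dict[str, Optional[int]]) -> "TokenSpecSketch":
        # Assumption: anything other than exactly one pair is an error.
        if len(token_dict) != 1:
            raise ValueError("expected a dictionary with exactly one key-value pair")
        ((name, size),) = token_dict.items()
        return cls(name=name, size=size)

    def to_dict(self) -> Dict[str, Optional[int]]:
        return {self.name: self.size}


spec = TokenSpecSketch.from_dict({"surface_anchors": 128})
cached = TokenSpecSketch.from_dict({"volume_queries": None})
round_trip = spec.to_dict()
```

The dictionary form keeps configs compact: the key carries the token name, and the value carries its size, or `None` for a group served from the cache.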

property domain: str¶

Extract token domain from the name (e.g., “surface” from “surface_anchors”).

Return type: str

property attn_type: str¶

Extract attention type from the name (e.g., “anchors” from “surface_anchors”).

Return type: str
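The naming convention behind `domain` and `attn_type` can be sketched as a simple split. The helper `split_token_name` is hypothetical, and splitting at the first underscore is an assumption; the source only gives the `surface_anchors` example, which does not say how names with several underscores are handled.

```python
def split_token_name(name: str) -> tuple[str, str]:
    # Assumption: the domain is everything before the first underscore and
    # the attention type is everything after it.
    domain, _, attn_type = name.partition("_")
    return domain, attn_type


domain, attn_type = split_token_name("surface_anchors")
```

With this reading, `domain` is `"surface"` and `attn_type` is `"anchors"`, matching the examples given for the two properties.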