noether.core.schemas.modules

Back-compat re-exports for noether.core.schemas.modules.

Module configs have been moved next to their matching classes in noether.modeling.modules. Base configs without a matching class (AttentionConfig, AttentionPattern, TokenSpec) stay in the .attention submodule.

Concrete attention configs are loaded lazily via PEP 562 to avoid circular imports between the schema package and the modeling modules that depend on AttentionConfig.
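The lazy loading described above relies on the module-level `__getattr__` hook from PEP 562. The following is a minimal sketch of that mechanism, not the package's actual code: the `make_lazy_module` helper, the `demo_schemas` module name, and the export table are hypothetical, and standard-library classes stand in for the concrete attention configs so the example runs on its own.

```python
import importlib
import types

# Hypothetical export table: public name -> (module path, attribute name).
# Standard-library targets stand in for the real modeling modules.
_LAZY_EXPORTS = {
    "OrderedDict": ("collections", "OrderedDict"),
    "Fraction": ("fractions", "Fraction"),
}


def make_lazy_module(name, lazy_exports):
    """Build a module whose exports are imported on first attribute access."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # PEP 562: called only when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = lazy_exports[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        # Cache the resolved object so later lookups bypass __getattr__.
        setattr(module, attr, value)
        return value

    module.__getattr__ = __getattr__
    return module


schemas = make_lazy_module("demo_schemas", _LAZY_EXPORTS)
```

In a real package the `__getattr__` function would be defined directly at the top level of the package's `__init__.py`. Because the target module is imported only on first access, importing the schema package does not pull in the modeling modules up front, which is what breaks the import cycle.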

Submodules

Classes

AttentionConfig

Configuration for an attention module.

AttentionPattern

Defines which tokens attend to which other tokens.

TokenSpec

Specification for a token type in the attention mechanism.

Package Contents

class noether.core.schemas.modules.AttentionConfig(/, **data)

Bases: pydantic.BaseModel

Configuration for an attention module. Because many different attention implementations exist, extra fields are allowed so that the same schema can be used for all attention modules.
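As an illustration of the extra-fields behaviour, here is a minimal sketch using a stand-in pydantic v2 model. `AttentionConfigSketch`, the `window_size` field, and the divisibility rule in the validator are assumptions made for this example; the real AttentionConfig may declare its settings and validation differently.

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, ValidationError, model_validator


class AttentionConfigSketch(BaseModel):
    """Stand-in for AttentionConfig with only a few of its fields."""

    # extra="allow" keeps unknown keyword arguments instead of rejecting them.
    model_config = ConfigDict(extra="allow")

    hidden_dim: int
    num_heads: int
    head_dim: Optional[int] = None

    @model_validator(mode="after")
    def check_hidden_dim_and_num_heads(self):
        # Assumed rule: without an explicit head_dim, hidden_dim must split
        # evenly across the heads.
        if self.head_dim is None and self.hidden_dim % self.num_heads != 0:
            raise ValueError("hidden_dim must be divisible by num_heads")
        return self


# window_size is not a declared field; it is kept as an extra.
config = AttentionConfigSketch(hidden_dim=256, num_heads=8, window_size=64)
extras = config.model_extra

try:
    AttentionConfigSketch(hidden_dim=250, num_heads=8)
    rejected = False
except ValidationError:
    rejected = True
```

Allowing extras lets implementation-specific options travel through one shared schema, at the cost that a misspelled field name is accepted silently rather than raising an error.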

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

model_config

Configuration for an attention module.

hidden_dim: int = None

Dimensionality of the hidden features.

num_heads: int = None

Number of attention heads.

use_rope: bool = None

Whether to use Rotary Positional Embeddings (RoPE).

dropout: float = None

Dropout rate for the attention weights and output projection.

init_weights: noether.core.types.InitWeightsMode = None

Weight initialization strategy.

bias: bool = None

Whether to use bias terms in linear layers.

head_dim: int | None = None

Dimensionality of each attention head.

qk_norm: bool = None

Whether to apply layer normalization to the query and key features before computing attention scores.

validate_hidden_dim_and_num_heads()

class noether.core.schemas.modules.AttentionPattern(/, **data)

Bases: pydantic.BaseModel

Defines which tokens attend to which other tokens.
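One plausible way to read such a pattern is as a block structure over the token sequence: every token in a query group may attend to every token in the listed key/value groups. The sketch below builds a boolean mask under that reading. `PatternSketch`, `build_mask`, the token name `surface_queries`, and the group sizes are hypothetical and are not taken from the library.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class PatternSketch:
    """Stand-in for AttentionPattern with the same two fields."""

    query_tokens: Sequence[str]
    key_value_tokens: Sequence[str]


def build_mask(token_sizes, patterns):
    """Return mask[q][k] == True where query position q may attend to key k."""
    # Lay the token groups out contiguously, in dictionary order.
    positions = {}
    start = 0
    for name, size in token_sizes.items():
        positions[name] = range(start, start + size)
        start += size

    mask = [[False] * start for _ in range(start)]
    for pattern in patterns:
        for q_name in pattern.query_tokens:
            for kv_name in pattern.key_value_tokens:
                for q in positions[q_name]:
                    for k in positions[kv_name]:
                        mask[q][k] = True
    return mask


sizes = {"surface_anchors": 2, "surface_queries": 1}
patterns = [
    # Anchors attend to each other.
    PatternSketch(["surface_anchors"], ["surface_anchors"]),
    # Queries read from the anchors but are not attended to by anything.
    PatternSketch(["surface_queries"], ["surface_anchors"]),
]
mask = build_mask(sizes, patterns)
```

With these inputs every row of the 3x3 mask allows the two anchor columns and blocks the query column.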

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

query_tokens: collections.abc.Sequence[str]

key_value_tokens: collections.abc.Sequence[str]

class noether.core.schemas.modules.TokenSpec(/, **data)

Bases: pydantic.BaseModel

Specification for a token type in the attention mechanism.

When size is None, the token group is not present in the input tensor and its key/value representations will be loaded from a KV cache instead.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

name: str

size: int | None = None

classmethod from_dict(token_dict)

Create TokenSpec from dictionary with single key-value pair.

Parameters:

token_dict (dict[str, int | None])

Return type:

TokenSpec

to_dict()

Convert TokenSpec to dictionary.

Return type:

dict[str, int | None]

property domain: str

Extract token domain from the name (e.g., “surface” from “surface_anchors”).

Return type:

str

property attn_type: str

Extract attention type from the name (e.g., “anchors” from “surface_anchors”).

Return type:

str
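The TokenSpec helpers documented above can be sketched with a plain dataclass stand-in, which keeps the example free of dependencies. `TokenSpecSketch` is hypothetical, and splitting the name at the first underscore is an assumption inferred from the "surface_anchors" example; the library may parse names differently.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TokenSpecSketch:
    """Stand-in for TokenSpec: a token group name plus an optional size."""

    name: str
    size: int | None = None

    @classmethod
    def from_dict(cls, token_dict: dict[str, int | None]) -> TokenSpecSketch:
        # Expect exactly one {name: size} pair.
        if len(token_dict) != 1:
            raise ValueError("expected a dictionary with a single key-value pair")
        ((name, size),) = token_dict.items()
        return cls(name=name, size=size)

    def to_dict(self) -> dict[str, int | None]:
        return {self.name: self.size}

    @property
    def domain(self) -> str:
        # Assumption: the domain is the part before the first underscore.
        return self.name.partition("_")[0]

    @property
    def attn_type(self) -> str:
        # Assumption: the attention type is the part after the first underscore.
        return self.name.partition("_")[2]


spec = TokenSpecSketch.from_dict({"surface_anchors": 128})
# size=None marks a group whose keys/values come from a KV cache.
cached = TokenSpecSketch.from_dict({"surface_anchors": None})
```

Here `spec.domain` is "surface" and `spec.attn_type` is "anchors", and `spec.to_dict()` returns the dictionary that was passed to `from_dict`.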