noether.core.schemas.modules.untied¶
Classes¶
- UntiedLinearConfig: Configuration for a linear layer with per-type (untied) weight banks.
- UntiedMixedAttentionConfig: Configuration for multi-head attention with per-type (untied) QKV and output projections.
- UntiedMLPConfig: Configuration for an MLP with per-type (untied) weights.
- UntiedTransformerBlockConfig: Configuration for a transformer block with per-type (untied) attention and MLP weights.
- UntiedPerceiverBlockConfig: Configuration for a perceiver block with per-type (untied) Q/output projections and MLP weights.
Module Contents¶
- class noether.core.schemas.modules.untied.UntiedLinearConfig(/, **data)¶
Bases: pydantic.BaseModel
Configuration for a linear layer with per-type (untied) weight banks.
Composes a LinearProjectionConfig (shared across types) with a num_types field: each token type gets its own independent weight matrix with the geometry described by the linear projection config.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- linear_projection: noether.core.schemas.modules.layers.LinearProjectionConfig¶
Shared geometry (input/output dims, bias, init) for every per-type weight bank.
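The per-type weight-bank idea can be sketched in plain Python. Everything below (the `UntiedLinear` class, its `forward` signature, the init scale) is a hypothetical stand-in for illustration, not the real noether module: each type owns an independent weight matrix, while the geometry (input/output dims, bias) is shared.

```python
import random

class UntiedLinear:
    """Illustrative stand-in: num_types independent banks, one shared geometry."""

    def __init__(self, num_types, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        # One independent (out_dim x in_dim) weight matrix per token type.
        self.weights = [
            [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]
            for _ in range(num_types)
        ]
        # Biases follow the same per-type layout.
        self.biases = [[0.0] * out_dim for _ in range(num_types)]

    def forward(self, x, token_type):
        # Select the weight bank belonging to this token's type.
        w, b = self.weights[token_type], self.biases[token_type]
        return [sum(wi * xi for wi, xi in zip(row, x)) + bi
                for row, bi in zip(w, b)]

layer = UntiedLinear(num_types=3, in_dim=4, out_dim=2)
x = [1.0, 0.0, 0.0, 0.0]
# Same input, different type -> different bank -> different output.
print(layer.forward(x, token_type=0) != layer.forward(x, token_type=1))
```

The same dispatch-by-type pattern underlies all of the untied modules in this page.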
- class noether.core.schemas.modules.untied.UntiedMixedAttentionConfig(/, **data)¶
Bases: noether.core.schemas.modules.attention.MixedAttentionConfig
Configuration for multi-head attention with per-type (untied) QKV and output projections.
Extends MixedAttentionConfig with a num_types field: the QKV and output projections are UntiedLinear layers, so each token type gets its own projection weights. Attention itself is still computed across all tokens via MixedAttention._process_pattern_batched().
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- projection_config()¶
Configuration for the per-type QKV and output projections.
- Return type:
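The split the docstring describes (per-type projections, joint attention) can be sketched as follows. This is a single-head, pure-Python illustration under assumed names (`untied_attention`, the fused bank helpers); the real module batches this through MixedAttention internals.

```python
import math
import random

def _bank(num_types, dim, rng):
    # One dim x dim projection matrix per token type (illustrative init).
    return [[[rng.gauss(0.0, 0.1) for _ in range(dim)] for _ in range(dim)]
            for _ in range(num_types)]

def _apply(mat, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]

def untied_attention(tokens, types, num_types, dim, seed=0):
    rng = random.Random(seed)
    wq = _bank(num_types, dim, rng)
    wk = _bank(num_types, dim, rng)
    wv = _bank(num_types, dim, rng)
    # Per-type projections: token i uses the bank indexed by its type.
    q = [_apply(wq[t], x) for x, t in zip(tokens, types)]
    k = [_apply(wk[t], x) for x, t in zip(tokens, types)]
    v = [_apply(wv[t], x) for x, t in zip(tokens, types)]
    out = []
    for qi in q:
        # Attention mixes ALL tokens, regardless of their type.
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(dim) for kj in k]
        mx = max(scores)
        exps = [math.exp(s - mx) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * vj[d] for w, vj in zip(weights, v))
                    for d in range(dim)])
    return out

mixed = untied_attention([[1.0, 0.0], [0.0, 1.0]], types=[0, 1],
                         num_types=2, dim=2)
```

Only the projections are type-specific; the softmax runs over every token, which is the "mixed" part of the name.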
- class noether.core.schemas.modules.untied.UntiedMLPConfig(/, **data)¶
Bases: pydantic.BaseModel
Configuration for an MLP with per-type (untied) weights.
Composes an MLPConfig (architecture: dims, activation, init) with a num_types field. The untied MLP mirrors MLP's topology (input -> [hidden]*(num_layers+1) -> output, with activations between layers) but uses UntiedLinear for every linear layer.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- mlp: noether.core.schemas.modules.mlp.MLPConfig¶
Underlying MLP architecture (dims, activation, init).
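The topology in the docstring can be made concrete with a small helper. The function name and parameter names are assumptions for illustration; the shapes follow directly from input -> [hidden]*(num_layers+1) -> output, with each linear layer duplicated num_types times.

```python
def untied_mlp_shapes(in_dim, hidden_dim, out_dim, num_layers, num_types):
    # input -> [hidden]*(num_layers+1) -> output, as in the docstring.
    dims = [in_dim] + [hidden_dim] * (num_layers + 1) + [out_dim]
    # Each linear layer carries num_types independent (fan_in, fan_out) banks.
    return [(num_types, dims[i], dims[i + 1]) for i in range(len(dims) - 1)]

print(untied_mlp_shapes(8, 16, 8, num_layers=1, num_types=3))
# -> [(3, 8, 16), (3, 16, 16), (3, 16, 8)]
```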
- class noether.core.schemas.modules.untied.UntiedTransformerBlockConfig(/, **data)¶
Bases: pydantic.BaseModel
Configuration for a transformer block with per-type (untied) attention and MLP weights.
Composes a TransformerBlockConfig (shared layout: dims, heads, layer scale, drop path, etc.) with a num_types field. Both sub-layers have per-type weights: UntiedMultiHeadAttention for attention and UntiedMLP for the feed-forward.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- transformer_block: noether.core.schemas.modules.blocks.TransformerBlockConfig¶
Shared transformer-block layout (dims, heads, layer scale, drop path, etc.).
- attention_config()¶
Configuration for the UntiedMultiHeadAttention sub-layer.
- Return type:
- untied_mlp_config()¶
Configuration for the UntiedMLP sub-layer.
- Return type:
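The composition pattern shared by the configs on this page (a shared-layout config plus a num_types field) can be sketched with stdlib dataclasses. Both classes below are stand-ins, not noether's pydantic models, and the layout fields shown are a deliberately reduced subset.

```python
from dataclasses import dataclass

@dataclass
class TransformerBlockLayout:
    """Stand-in for TransformerBlockConfig: layout shared by all types."""
    dim: int
    num_heads: int

@dataclass
class UntiedTransformerBlockSketch:
    """Stand-in for UntiedTransformerBlockConfig: shared layout + num_types."""
    transformer_block: TransformerBlockLayout
    num_types: int

cfg = UntiedTransformerBlockSketch(
    transformer_block=TransformerBlockLayout(dim=256, num_heads=8),
    num_types=4,
)
print(cfg.transformer_block.dim, cfg.num_types)  # -> 256 4
```

The layout is defined once and reused, so changing num_types scales the weight count without touching any geometry.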
- class noether.core.schemas.modules.untied.UntiedPerceiverBlockConfig(/, **data)¶
Bases: pydantic.BaseModel
Configuration for a perceiver block with per-type (untied) Q/output projections and MLP weights.
Composes a PerceiverBlockConfig (shared layout: dims, heads, layer scale, drop path, etc.) with a num_types field. The Q and output projections in PerceiverAttention become per-type via UntiedLinear, while the KV projection stays shared (it operates on a single geometry encoding). The MLP is also replaced with UntiedMLP.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- perceiver_block_config: noether.core.schemas.modules.blocks.PerceiverBlockConfig¶
Shared perceiver-block layout (dims, heads, kv_dim, layer scale, drop path, etc.).
- untied_mlp_config()¶
Configuration for the UntiedMLP sub-layer.
- Return type:
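The untied-versus-shared split described above can be sketched at the level of projection banks. All names here are hypothetical, and the fused 2*q_dim-row KV matrix is an assumption made for illustration; the point is only which projections are duplicated per type and which are not.

```python
import random

def perceiver_projection_banks(num_types, q_dim, kv_dim, seed=0):
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        # Untied: one Q and one output projection per token type.
        "q": [mat(q_dim, q_dim) for _ in range(num_types)],
        "out": [mat(q_dim, q_dim) for _ in range(num_types)],
        # Shared: a single KV projection over the geometry encoding
        # (illustrated here as one fused keys-and-values matrix).
        "kv": mat(2 * q_dim, kv_dim),
    }

banks = perceiver_projection_banks(num_types=3, q_dim=4, kv_dim=6)
print(len(banks["q"]), len(banks["out"]), len(banks["kv"][0]))  # -> 3 3 6
```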