noether.modeling.models.ab_upt¶
Attributes¶
KVPair
LayerCache
Classes¶
AnchorBranchedUPTConfig | Configuration for the Anchored Branched UPT (AB-UPT) model.
ModelKVCache | Caches that let a forward call skip work it has already done.
ReadoutLayer | The final readout layer for AB-UPT, which applies an AdaLN-style modulation followed by a linear projection to the desired output dimension.
AnchoredBranchedUPT | Implementation of the Anchored Branched UPT (AB-UPT) model.
Module Contents¶
- class noether.modeling.models.ab_upt.AnchorBranchedUPTConfig(/, **data)¶
Bases:
noether.core.models.base.ModelBaseConfig, noether.core.schemas.mixins.InjectSharedFieldFromParentMixin
Configuration for the Anchored Branched UPT (AB-UPT) model.
AB-UPT is built from three configurable stages:
- Geometry encoder (optional): a SupernodePooling encoder followed by geometry_depth standard transformer blocks. Only instantiated when at least one perceiver / perceiver_untied block is present in physics_blocks and supernode_pooling_config is provided.
- Physics trunk: a stack of blocks listed in physics_blocks, operating on per-domain anchor (and optionally query) tokens. The block string controls the attention pattern and weight sharing; see physics_blocks below.
- Per-domain decoder (optional): num_domain_decoder_blocks[name] self-attention blocks with untied weights per domain, followed by a linear projection to that domain’s output fields.
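The instantiation rule for the geometry encoder described above can be sketched in plain Python. The helper `needs_geometry_encoder` and the placeholder pooling dict are hypothetical and only restate the documented condition; they are not part of noether.

```python
# Hypothetical helper, not part of noether: mirrors the documented rule that
# the geometry encoder exists only when a perceiver-type block is present
# AND a supernode pooling config is provided.
PERCEIVER_BLOCKS = {"perceiver", "perceiver_untied"}


def needs_geometry_encoder(physics_blocks, supernode_pooling_config):
    has_perceiver = any(block in PERCEIVER_BLOCKS for block in physics_blocks)
    return has_perceiver and supernode_pooling_config is not None


# A placeholder dict stands in for a real SupernodePoolingConfig here.
pooling = {"placeholder": True}

with_geometry = needs_geometry_encoder(["perceiver", "self", "cross"], pooling)
no_perceiver = needs_geometry_encoder(["self", "cross", "joint"], pooling)
no_pooling = needs_geometry_encoder(["perceiver", "self"], None)
```

Only the first call satisfies both conditions, so only that configuration would get a geometry branch.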
hidden_dim is a shared field: it is auto-injected into transformer_block_config and supernode_pooling_config via InjectSharedFieldFromParentMixin, so it only needs to be set once at the top level. See Configuration Inheritance.
Configuration guide¶
See Configuring AB-UPT for a step-by-step walkthrough of how to compose physics blocks, choose between tied and _untied variants, and wire up the per-domain decoder.
Concrete examples (YAML):
Aerodynamics (multi-domain, surface + volume): recipes/aero_cfd/configs/model/ab_upt.yaml
Heat transfer (single-domain, volume only with parameter conditioning): recipes/heat_transfer/configs/model/ab_upt.yaml
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- kind: str | None = 'noether.core.schemas.models.AnchorBranchedUPTConfig'¶
Kind of model to use, i.e. the class path.
- model_config¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- supernode_pooling_config: Annotated[noether.modeling.modules.encoders.supernode_pooling.SupernodePoolingConfig, noether.core.schemas.mixins.Shared] | None = None¶
- transformer_block_config: Annotated[noether.modeling.modules.blocks.transformer.TransformerBlockConfig, noether.core.schemas.mixins.Shared]¶
Hidden dimension of the model.
- physics_blocks: list[Literal['self', 'shared', 'cross', 'joint', 'perceiver', 'self_untied', 'cross_untied', 'joint_untied', 'perceiver_untied']]¶
Types of physics blocks to use in the model.
- self / shared: Self-attention within a branch/domain. Weights are shared between all domains.
- cross: Cross-attention between domains. Each domain attends to all other domains’ anchors; weights are shared.
- joint: Joint attention over all domain points. Full self-attention over all points; weights are shared.
- perceiver: Perceiver-style cross-attention to the geometry encoding.
- self_untied: Self-attention within a branch with untied weights for each domain.
- cross_untied: Cross-attention between domains with untied weights for each domain.
- joint_untied: Joint attention over all domain points with untied weights for each domain.
- perceiver_untied: Perceiver cross-attention to the geometry encoding with untied weights per domain.
Note: “shared” is a deprecated alias for “self” and will be removed in a future release.
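The block-type strings above follow a simple naming scheme: an attention pattern, optionally suffixed with `_untied`. A minimal sketch of how such a string could be decoded is shown here; `normalize_block` and `describe_block` are hypothetical helpers written for illustration, not noether functions.

```python
# Hypothetical helpers, not part of noether: they restate the block-type
# descriptions above in code form.
ATTENTION_PATTERNS = {"self", "cross", "joint", "perceiver"}


def normalize_block(name):
    """Map the deprecated alias 'shared' to 'self'; leave other names as-is."""
    return "self" if name == "shared" else name


def describe_block(name):
    """Return (attention_pattern, weights_untied) for a block-type string."""
    name = normalize_block(name)
    untied = name.endswith("_untied")
    pattern = name[: -len("_untied")] if untied else name
    if pattern not in ATTENTION_PATTERNS:
        raise ValueError(f"unknown physics block type: {name!r}")
    return pattern, untied


blocks = ["perceiver", "shared", "cross", "self_untied", "joint_untied"]
described = [describe_block(b) for b in blocks]
```

In this sketch the deprecated `shared` entry is reported as tied self-attention, matching the migration note.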
- num_domain_decoder_blocks: dict[str, int] = None¶
Number of final domain-specific decoder blocks with self-attention and no weight sharing, e.g. {"surface": 2, "volume": 2}.
- init_weights: noether.core.types.InitWeightsMode = None¶
Weight initialization of linear layers. Defaults to “truncnormal002”.
- geometry_conditioning_dims: noether.data.schemas.FieldDimSpec | None = None¶
Per-named-field conditioning spec for geometry transformer blocks. When left unset, defaults to data_specs.conditioning_dims so the geometry branch sees the same conditioning as the rest of the model. An explicit empty FieldDimSpec (total_dim == 0) opts out. This is useful for diffusion, where timestep modulation should touch the physics blocks and per-domain decoders but not the geometry branch (geometry is invariant to noise level).
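The defaulting behaviour described above can be sketched as follows. Plain dicts (field name to dimension) stand in for FieldDimSpec, whose real API is not reproduced here; `resolve_geometry_conditioning` and the field name `timestep` are hypothetical.

```python
# Plain dicts stand in for FieldDimSpec (field name -> dimension); the real
# class and its API are not reproduced here.
def resolve_geometry_conditioning(geometry_conditioning_dims, conditioning_dims):
    if geometry_conditioning_dims is None:
        # Unset: the geometry branch sees the same conditioning as the rest.
        return dict(conditioning_dims)
    # An explicit value, including an empty spec (total dim 0), is kept as given.
    return dict(geometry_conditioning_dims)


model_dims = {"timestep": 1, "geometry_design_parameters": 4}

inherited = resolve_geometry_conditioning(None, model_dims)
opted_out = resolve_geometry_conditioning({}, model_dims)
geometry_is_conditioned = sum(opted_out.values()) > 0
```

The distinction that matters is between `None` (inherit) and an empty spec (opt out); a truthiness check would conflate the two.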
- data_specs: noether.data.schemas.ModelDataSpecs¶
Data specifications for the model.
Migrate deprecated ‘shared’ block type to ‘self’.
- Return type:
- rope_frequency_config()¶
- pos_embed_config()¶
- bias_mlp_config()¶
- Return type:
- perceiver_block_config()¶
- domain_decoder_configs()¶
Per-domain decoder projection configs, keyed by domain name.
- conditioner_config()¶
Configuration for the scalar conditioner module.
- set_condition_dim()¶
Set condition_dim in transformer_block_config based on data_specs.
- Return type:
- geometry_transformer_block_config()¶
Transformer block config for geometry encoder, with condition_dim set to geometry_conditioning_dims.
- geometry_conditioner_config()¶
Configuration for the scalar conditioner module.
- validate_parameters()¶
Validate parameters across the model and its submodules.
Ensures that hidden_dim is consistent across parent and all submodules. Note: transformer_block_config validates hidden_dim % num_heads == 0 in its own validator.
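As a rough illustration of the two checks mentioned above, the sketch below uses plain dicts in place of the real config objects. The function names and dict keys are assumptions made for this example, not the actual validator code.

```python
# Hypothetical stand-in for the cross-module check; plain dicts replace the
# real config objects, and the key names are assumptions for illustration.
def check_hidden_dim(parent_hidden_dim, submodule_configs):
    for name, cfg in submodule_configs.items():
        if cfg is None:
            continue  # optional submodule not configured
        if cfg["hidden_dim"] != parent_hidden_dim:
            raise ValueError(
                f"{name}.hidden_dim={cfg['hidden_dim']} does not match "
                f"parent hidden_dim={parent_hidden_dim}"
            )


def check_heads(hidden_dim, num_heads):
    # Per the note above, this check belongs to transformer_block_config.
    if hidden_dim % num_heads != 0:
        raise ValueError("hidden_dim must be divisible by num_heads")


configs = {
    "transformer_block_config": {"hidden_dim": 192, "num_heads": 3},
    "supernode_pooling_config": None,
}
check_hidden_dim(192, configs)  # consistent: no error
check_heads(192, 3)             # 192 % 3 == 0: no error

try:
    check_hidden_dim(256, configs)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```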
- Return type:
- Parameters:
data (Any)
- noether.modeling.models.ab_upt.KVPair¶
- noether.modeling.models.ab_upt.LayerCache¶
- class noether.modeling.models.ab_upt.ModelKVCache¶
Bases:
TypedDict
Caches that let a forward call skip work it has already done.
All keys are independent: pass any subset. geometry_encoding and geometry_rope travel together (both depend only on the geometry mesh) and are what diffusion sampling reuses across Euler steps.
- geometry_encoding: output of AnchoredBranchedUPT.geometry_branch_forward(). When present, the geometry encoder and transformer stack are skipped.
- geometry_rope: RoPE of the geometry supernode positions, used by perceiver blocks as k_freqs. Cached alongside geometry_encoding so subsequent calls don’t need geometry_position.
- physics_blocks: per-block anchor self-attention K/V (None for perceiver blocks, which always re-project from geometry_encoding). When present, anchors are not re-encoded; the call is decode-only.
- decoders: per-domain anchor K/V for each decoder block. Symmetric with physics_blocks for the per-domain decoder stage.
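The reuse pattern for the geometry keys can be sketched with a toy forward function. `SketchKVCache` and `toy_forward` are hypothetical stand-ins: `Any` replaces the tensor and cache types, and strings replace real tensors. Only the geometry keys are passed back between steps, since those depend only on the geometry mesh.

```python
from typing import Any, TypedDict


class SketchKVCache(TypedDict, total=False):
    """Stand-in for ModelKVCache; Any replaces the tensor and cache types."""

    geometry_encoding: Any
    geometry_rope: Any
    physics_blocks: list
    decoders: dict


calls = {"geometry_encoder": 0}


def toy_forward(x, kv_cache=None):
    """Toy forward: encodes geometry only when the cache lacks it."""
    cache = SketchKVCache(**(kv_cache or {}))
    if "geometry_encoding" not in cache:
        calls["geometry_encoder"] += 1
        cache["geometry_encoding"] = "encoded-geometry"
        cache["geometry_rope"] = "geometry-rope"
    return x + 1, cache


x, cache = 0, None
for _ in range(4):  # e.g. four Euler steps of a sampler
    geometry_only = None if cache is None else {
        k: cache[k] for k in ("geometry_encoding", "geometry_rope")
    }
    x, cache = toy_forward(x, kv_cache=geometry_only)
```

With `total=False`, every key is optional, which matches the "pass any subset" contract.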
- geometry_encoding: torch.Tensor¶
- geometry_rope: torch.Tensor¶
- class noether.modeling.models.ab_upt.ReadoutLayer(decoder_config, hidden_dim, condition_dim=None)¶
Bases:
torch.nn.Module
The final readout layer for AB-UPT, which applies an AdaLN-style modulation followed by a linear projection to the desired output dimension.
- Parameters:
decoder_config (noether.modeling.modules.layers.linear_projection.LinearProjectionConfig)
hidden_dim (int)
condition_dim (int | None)
- norm_final¶
- linear¶
- modulation = None¶
- forward(x, condition)¶
- Parameters:
x (torch.Tensor)
condition (torch.Tensor | None)
- Return type:
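A pure-Python sketch of what "AdaLN-style modulation followed by a linear projection" can look like for a single token vector. The modulation form `norm(x) * (1 + scale) + shift` is an assumption borrowed from common DiT-style models; the actual ReadoutLayer may compute shift and scale differently, and in practice they would be derived from `condition` rather than passed in directly.

```python
import math


def layer_norm(x, eps=1e-6):
    """Normalize one token vector to zero mean and unit variance (no affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def readout(x, weight, bias, shift=None, scale=None):
    """Normalize, optionally modulate, then project one token vector.

    ASSUMPTION: modulation is norm(x) * (1 + scale) + shift, the form common
    in DiT-style models; the real ReadoutLayer may differ.
    """
    h = layer_norm(x)
    if shift is not None and scale is not None:
        h = [v * (1.0 + s) + b for v, s, b in zip(h, scale, shift)]
    # Linear projection: one output value per weight row.
    return [sum(w * v for w, v in zip(row, h)) + b0 for row, b0 in zip(weight, bias)]


token = [1.0, 2.0, 3.0, 4.0]
weight = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]  # hidden_dim=4 -> output_dim=2
bias = [0.0, 0.0]

unconditioned = readout(token, weight, bias)
identity_mod = readout(token, weight, bias, shift=[0.0] * 4, scale=[0.0] * 4)
shifted = readout(token, weight, bias, shift=[1.0] * 4, scale=[0.0] * 4)
```

With zero shift and scale the modulation is the identity, so the conditioned and unconditioned paths agree.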
- class noether.modeling.models.ab_upt.AnchoredBranchedUPT(config)¶
Bases:
torch.nn.Module
Implementation of the Anchored Branched UPT (AB-UPT) model.
This is an off-the-shelf model: it includes input embedding and output projection, so it can be used directly by providing the appropriate input tensors. See forward() for the expected inputs.
The architecture is fully driven by AnchorBranchedUPTConfig: the geometry encoder depth, the ordering and type of physics blocks, and the per-domain decoder depths are all configured there. For a walkthrough of how to assemble a config (and concrete YAML examples from the aero_cfd and heat_transfer recipes), see Configuring AB-UPT.
- Parameters:
config (AnchorBranchedUPTConfig) – Configuration for the AB-UPT model. See AnchorBranchedUPTConfig for details.
- data_specs¶
- rope¶
- pos_embed¶
- domain_biases¶
- physics_blocks¶
- use_geometry_branch = False¶
- domain_feature_projs: torch.nn.ModuleDict | None = None¶
- domain_decoder_blocks¶
- domain_decoder_projections¶
- geometry_branch_forward(geometry_position, geometry_supernode_idx, geometry_batch_idx, condition, geometry_attn_kwargs)¶
Forward pass through the geometry branch of the model.
- Parameters:
geometry_position (torch.Tensor)
geometry_supernode_idx (torch.Tensor)
geometry_batch_idx (torch.Tensor)
condition (torch.Tensor | None)
geometry_attn_kwargs (dict[str, torch.Tensor])
- Return type:
- build_physics_input(domain_anchor_positions=None, domain_query_positions=None, domain_anchor_features=None, domain_query_features=None)¶
Build the physics-block input tensor and combined per-domain positions.
Each per-domain segment is [anchors | queries] with positional biases plus projected features (when data_specs.domains[name].feature_dim was set on the config). Domains are concatenated in self.domain_names order.
- Returns:
Tuple of (x_physics, physics_positions). x_physics has shape (B, total_tokens, hidden_dim). physics_positions maps each domain name to its concatenated [anchors | queries] positions and can be passed directly to create_rope_frequencies().
- Parameters:
domain_anchor_positions (dict[str, torch.Tensor] | None)
domain_query_positions (dict[str, torch.Tensor] | None)
domain_anchor_features (dict[str, torch.Tensor] | None)
domain_query_features (dict[str, torch.Tensor] | None)
- Return type:
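The token layout produced by build_physics_input() can be pictured with a small index calculation. `physics_layout` is a hypothetical helper, not part of noether; it only computes where each domain's [anchors | queries] segment would land along the token axis, given per-domain token counts.

```python
# Hypothetical helper, not part of noether: computes where each domain's
# [anchors | queries] segment lands along the token axis.
def physics_layout(domain_names, num_anchors, num_queries=None):
    num_queries = num_queries or {}
    layout, offset = {}, 0
    for name in domain_names:  # domains are concatenated in this order
        n_anchor = num_anchors[name]
        n_query = num_queries.get(name, 0)
        layout[name] = {
            "anchors": (offset, offset + n_anchor),
            "queries": (offset + n_anchor, offset + n_anchor + n_query),
        }
        offset += n_anchor + n_query
    return layout, offset


layout, total_tokens = physics_layout(
    ["surface", "volume"],
    num_anchors={"surface": 4, "volume": 6},
    num_queries={"surface": 2},
)
```

Here `total_tokens` corresponds to the middle dimension of x_physics, and a domain without queries gets an empty query span.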
- physics_blocks_forward(x_physics, geometry_encoding, physics_token_specs, physics_attn_kwargs, physics_perceiver_attn_kwargs, condition, physics_blocks_cache=None)¶
Run the physics-block stack on a pre-built input tensor.
Perceiver blocks always re-project K/V from geometry_encoding and contribute None to the returned cache; only transformer blocks cache their anchor self-attention K/V.
- Parameters:
x_physics (torch.Tensor)
geometry_encoding (torch.Tensor | None)
physics_token_specs (list[noether.core.schemas.modules.attention.TokenSpec])
condition (torch.Tensor | None)
physics_blocks_cache (list[LayerCache | None] | None)
- Return type:
tuple[torch.Tensor, list[LayerCache | None]]
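The shape of the returned cache list can be sketched as below: one slot per physics block, with None in the perceiver positions. `cache_slots_for` is a hypothetical illustration, and string placeholders stand in for the cached K/V tensors.

```python
# Hypothetical illustration: string placeholders stand in for cached K/V.
def cache_slots_for(physics_blocks):
    slots = []
    for index, block in enumerate(physics_blocks):
        if block.startswith("perceiver"):
            slots.append(None)  # perceiver blocks re-project K/V on every call
        else:
            slots.append(f"anchor-kv[{index}]")
    return slots


cache_slots = cache_slots_for(
    ["perceiver", "self", "cross", "perceiver_untied", "joint"]
)
```

Keeping a None placeholder preserves the one-to-one alignment between cache entries and blocks.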
- decoder_blocks_forward(x_physics, physics_token_specs, per_domain_token_specs, decoder_attn_kwargs, condition, decoders_cache=None)¶
Forward pass through the per-domain decoder blocks.
- Returns:
Tuple of (domain_predictions, new_domain_caches).
- Parameters:
x_physics (torch.Tensor)
physics_token_specs (list[noether.core.schemas.modules.attention.TokenSpec])
per_domain_token_specs (dict[str, list[noether.core.schemas.modules.attention.TokenSpec]])
condition (torch.Tensor | None)
- Return type:
- create_rope_frequencies(physics_positions, geometry_position=None, geometry_supernode_idx=None, geometry_rope=None)¶
Create RoPE frequencies for all relevant positions.
- Parameters:
physics_positions (dict[str, torch.Tensor]) – Per-domain combined [anchors | queries] positions, as returned by build_physics_input().
geometry_position (torch.Tensor | None) – Geometry mesh coordinates (optional).
geometry_supernode_idx (torch.Tensor | None) – Geometry supernode indices (optional).
geometry_rope (torch.Tensor | None) – Precomputed geometry-supernode RoPE. When provided, bypasses geometry_position / geometry_supernode_idx for the perceiver k_freqs (needed in queries-only mode where geometry inputs aren’t available).
- Returns:
Tuple of (geometry_attn_kwargs, decoder_attn_kwargs, physics_perceiver_attn_kwargs, physics_attn_kwargs, geometry_rope). geometry_rope is the RoPE tensor used or computed (or None when there is no geometry branch).
- Return type:
tuple[dict[str, Any], dict[str, dict[str, Any]], dict[str, Any], dict[str, Any], torch.Tensor | None]
- forward(geometry_position=None, geometry_supernode_idx=None, geometry_batch_idx=None, domain_anchor_positions=None, domain_query_positions=None, domain_anchor_features=None, domain_query_features=None, conditioning_inputs=None, geometry_conditioning_inputs=None, kv_cache=None)¶
Forward pass of the AB-UPT model.
Example:
    model(
        geometry_position=...,
        geometry_supernode_idx=...,
        geometry_batch_idx=...,
        domain_anchor_positions={"surface": surface_pos, "volume": volume_pos},
        domain_query_positions={"surface": query_pos},
        domain_anchor_features={"surface": surface_features, "volume": volume_features},
        domain_query_features={"surface": query_features},
        conditioning_inputs={"geometry_design_parameters": design_params},
    )
- Parameters:
geometry_position (torch.Tensor | None) – Coordinates of the geometry mesh. Tensor of shape (B * N_geometry, D_pos).
geometry_supernode_idx (torch.Tensor | None) – Supernode indices for the geometry points.
geometry_batch_idx (torch.Tensor | None) – Batch indices for the geometry points.
domain_anchor_positions (dict[str, torch.Tensor] | None) – Per-domain anchor positions, e.g. {"surface": (B, N, D), "volume": (B, M, D)}.
domain_query_positions (dict[str, torch.Tensor] | None) – Per-domain query positions (optional).
domain_anchor_features (dict[str, torch.Tensor] | None) – Per-domain anchor input features (optional), matching the shape of domain_anchor_positions.
domain_query_features (dict[str, torch.Tensor] | None) – Per-domain query input features (optional), matching the shape of domain_query_positions.
conditioning_inputs (dict[str, torch.Tensor] | None) – Conditioning tensors for physics + decoder blocks, e.g. {"geometry_design_parameters": (B, D)}.
geometry_conditioning_inputs (dict[str, torch.Tensor] | None) – Conditioning tensors for the geometry branch. When None and conditioning_inputs is set, the geometry branch automatically reuses conditioning_inputs if the configured geometry_conditioning_dims matches data_specs.conditioning_dims (the common case). Pass an explicit dict to feed a different conditioning to geometry, or leave it None when the geometry branch is unconditioned.
kv_cache (ModelKVCache | None) – KV cache from a previous forward call.
- Returns:
Tuple of (predictions, kv_cache).
- Return type:
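The fallback rule for geometry_conditioning_inputs can be sketched as a small resolver. `resolve_geometry_inputs` is hypothetical, and plain dicts stand in for tensors and dimension specs. Returning None when the dims differ and no explicit dict is given is an assumption of this sketch; the documentation above does not say how the real model handles that case.

```python
# Hypothetical resolver, not part of noether; dicts stand in for tensors
# and for the conditioning dimension specs.
def resolve_geometry_inputs(
    conditioning_inputs,
    geometry_conditioning_inputs,
    conditioning_dims,
    geometry_conditioning_dims,
):
    if geometry_conditioning_inputs is not None:
        return geometry_conditioning_inputs  # an explicit dict always wins
    if conditioning_inputs is not None and geometry_conditioning_dims == conditioning_dims:
        return conditioning_inputs  # common case: reuse the model conditioning
    # ASSUMPTION: otherwise the geometry branch runs unconditioned.
    return None


dims = {"geometry_design_parameters": 4}
inputs = {"geometry_design_parameters": "design-params-tensor"}

reused = resolve_geometry_inputs(inputs, None, dims, dims)
unconditioned = resolve_geometry_inputs(inputs, None, dims, {})
explicit = resolve_geometry_inputs(inputs, {"other": "tensor"}, dims, {"other": 2})
```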