noether.modeling.models¶
Submodules¶
Classes¶
AnchorBranchedUPTConfig | Configuration for the Anchored Branched UPT (AB-UPT) model.
AnchoredBranchedUPT | Implementation of the Anchored Branched UPT (AB-UPT) model.
AeroABUPT | Aerodynamic Anchored-Branched UPT wrapper.
AeroTransformer | Aerodynamic Transformer wrapper.
AeroTransformerConfig | Transformer config extended with aerodynamic data specifications.
AeroTransolver | Aerodynamic Transolver wrapper.
AeroTransolverConfig | Transolver config extended with aerodynamic data specifications.
AeroUPT | Aerodynamic UPT wrapper.
Transformer | Implementation of a Transformer model.
TransformerConfig | Configuration for a Transformer model.
Transolver | Implementation of the Transolver model.
TransolverConfig | Configuration for a Transolver model.
TransolverPlusPlusConfig | Configuration for a Transolver++ model.
UPT | Implementation of the UPT (Universal Physics Transformer) model.
UPTConfig | Configuration for a UPT model.
ViT | Vision Transformer for spatial regression on continuous-coordinate grids.
ViTConfig | Configuration for ViT model.
Package Contents¶
- class noether.modeling.models.AnchorBranchedUPTConfig(/, **data)¶
Bases:
noether.core.models.base.ModelBaseConfig, noether.core.schemas.mixins.InjectSharedFieldFromParentMixin
Configuration for the Anchored Branched UPT (AB-UPT) model.
AB-UPT is built from three configurable stages:
Geometry encoder (optional): a SupernodePooling encoder followed by geometry_depth standard transformer blocks. Only instantiated when at least one perceiver / perceiver_untied block is present in physics_blocks and supernode_pooling_config is provided.
Physics trunk: a stack of blocks listed in physics_blocks operating on per-domain anchor (and optionally query) tokens. The block string controls the attention pattern and weight sharing; see physics_blocks below.
Per-domain decoder (optional): num_domain_decoder_blocks[name] self-attention blocks with untied weights per domain, followed by a linear projection to that domain’s output fields.
hidden_dim is a shared field: it is auto-injected into transformer_block_config and supernode_pooling_config via InjectSharedFieldFromParentMixin, so it only needs to be set once at the top level. See Configuration Inheritance.
Configuration guide¶
See Configuring AB-UPT for a step-by-step walkthrough of how to compose physics blocks, choose between tied and _untied variants, and wire up the per-domain decoder.
Concrete examples (YAML):
Aerodynamics (multi-domain, surface + volume): recipes/aero_cfd/configs/model/ab_upt.yaml
Heat transfer (single-domain, volume only with parameter conditioning): recipes/heat_transfer/configs/model/ab_upt.yaml
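The shared-field behavior of hidden_dim can be pictured with a small stand-alone sketch. It does not use InjectSharedFieldFromParentMixin or any noether code: inject_shared and check_hidden_dim are hypothetical helpers, configs are plain dicts, and the num_heads key is an assumption taken from the hidden_dim % num_heads check mentioned under validate_parameters().

```python
from copy import deepcopy

# Hypothetical stand-in for shared-field injection; not the library's mixin.
def inject_shared(parent: dict, shared_key: str, child_keys: list) -> dict:
    """Copy parent[shared_key] into every child config that does not set it."""
    resolved = deepcopy(parent)
    for key in child_keys:
        child = resolved.get(key)
        if child is None:  # optional sub-config left unset
            continue
        child.setdefault(shared_key, resolved[shared_key])
    return resolved


def check_hidden_dim(config: dict, child_keys: list) -> None:
    """Raise ValueError when a child disagrees with the parent hidden_dim."""
    hidden_dim = config["hidden_dim"]
    for key in child_keys:
        child = config.get(key)
        if child is None:
            continue
        if child["hidden_dim"] != hidden_dim:
            raise ValueError(f"{key}.hidden_dim differs from the top-level hidden_dim")
        num_heads = child.get("num_heads")  # assumed key name
        if num_heads is not None and hidden_dim % num_heads != 0:
            raise ValueError(f"hidden_dim must be divisible by {key}.num_heads")


CHILDREN = ["transformer_block_config", "supernode_pooling_config"]
raw = {
    "hidden_dim": 192,  # set once at the top level
    "transformer_block_config": {"num_heads": 3},
    "supernode_pooling_config": None,
}
resolved = inject_shared(raw, "hidden_dim", CHILDREN)
check_hidden_dim(resolved, CHILDREN)
print(resolved["transformer_block_config"])
```

The sketch leaves a child's own hidden_dim untouched when one is already set, which is why the consistency check is a separate step.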
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- kind: str | None = 'noether.core.schemas.models.AnchorBranchedUPTConfig'¶
Kind of model to use, i.e. class path
- model_config¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- supernode_pooling_config: Annotated[noether.modeling.modules.encoders.supernode_pooling.SupernodePoolingConfig, noether.core.schemas.mixins.Shared] | None = None¶
- transformer_block_config: Annotated[noether.modeling.modules.blocks.transformer.TransformerBlockConfig, noether.core.schemas.mixins.Shared]¶
- hidden_dim¶
Hidden dimension of the model.
- physics_blocks: list[Literal['self', 'shared', 'cross', 'joint', 'perceiver', 'self_untied', 'cross_untied', 'joint_untied', 'perceiver_untied']]¶
Types of physics blocks to use in the model.
self / shared: Self-attention within a branch/domain. Weights are shared between all domains.
cross: Cross-attention between domains. Each domain attends to all other domains’ anchors; weights are shared.
joint: Joint attention over all domain points. Full self-attention over all points; weights are shared.
perceiver: Perceiver-style cross-attention to the geometry encoding.
self_untied: Self-attention within a branch with untied weights for each domain.
cross_untied: Cross-attention between domains with untied weights for each domain.
joint_untied: Joint attention over all domain points with untied weights for each domain.
perceiver_untied: Perceiver cross-attention to the geometry encoding with untied weights per domain.
Note: “shared” is a deprecated alias for “self” and will be removed in a future release.
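As a rough illustration of how a physics_blocks list can be read, the sketch below migrates the deprecated "shared" alias, rejects unknown names, and derives whether a geometry encoder would be needed. The helper names are hypothetical and this is not the library's validator; only the block names and the perceiver rule come from the text above.

```python
# Block names as listed in the physics_blocks field (without the deprecated alias).
VALID_BLOCKS = {
    "self", "cross", "joint", "perceiver",
    "self_untied", "cross_untied", "joint_untied", "perceiver_untied",
}


def normalize_physics_blocks(blocks):
    """Replace the deprecated 'shared' alias with 'self' and reject unknown names."""
    normalized = []
    for block in blocks:
        if block == "shared":
            block = "self"
        if block not in VALID_BLOCKS:
            raise ValueError(f"unknown physics block type: {block!r}")
        normalized.append(block)
    return normalized


def needs_geometry_encoder(blocks, has_supernode_pooling_config):
    """Geometry encoder is only built for perceiver blocks with a pooling config."""
    uses_perceiver = any(b in ("perceiver", "perceiver_untied") for b in blocks)
    return uses_perceiver and has_supernode_pooling_config


def uses_untied_weights(block):
    """Untied variants carry separate weights per domain."""
    return block.endswith("_untied")


blocks = normalize_physics_blocks(["perceiver", "shared", "cross", "self_untied"])
print(blocks)
print(needs_geometry_encoder(blocks, has_supernode_pooling_config=True))
```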
- num_domain_decoder_blocks: dict[str, int] = None¶
Number of final domain-specific decoder blocks with self-attention and no weight sharing, e.g. {"surface": 2, "volume": 2}.
- init_weights: noether.core.types.InitWeightsMode = None¶
Weight initialization of linear layers. Defaults to “truncnormal002”.
- geometry_conditioning_dims: noether.data.schemas.FieldDimSpec | None = None¶
Per-named-field conditioning spec for geometry transformer blocks. When left unset, defaults to data_specs.conditioning_dims so the geometry branch sees the same conditioning as the rest of the model. An explicit empty FieldDimSpec (total_dim == 0) opts out. This is useful for diffusion, where timestep modulation should touch physics + per-domain decoders but not the geometry branch (geometry is invariant to noise level).
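The three cases for geometry_conditioning_dims (unset, explicitly empty, explicit) can be sketched with plain dicts standing in for FieldDimSpec. This is an assumption-laden illustration: resolve_geometry_conditioning and total_dim are hypothetical helpers, and total_dim is assumed to be the sum of the per-field dimensions.

```python
def resolve_geometry_conditioning(geometry_dims, conditioning_dims):
    """Return the per-field spec the geometry branch would see.

    None means 'unset' and inherits the model-wide spec; an empty dict
    is an explicit opt-out and is returned unchanged.
    """
    if geometry_dims is None:
        return dict(conditioning_dims)
    return dict(geometry_dims)


def total_dim(spec):
    """Assumed definition: sum of all per-field dimensions."""
    return sum(spec.values())


model_wide = {"geometry_design_parameters": 4}

inherited = resolve_geometry_conditioning(None, model_wide)
opted_out = resolve_geometry_conditioning({}, model_wide)
explicit = resolve_geometry_conditioning({"inflow_velocity": 1}, model_wide)

print(total_dim(inherited), total_dim(opted_out), total_dim(explicit))
```

In the diffusion case described above, the opt-out row is the relevant one: the model-wide spec would carry the timestep, while the geometry branch resolves to a spec with total_dim equal to zero.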
- data_specs: noether.data.schemas.ModelDataSpecs¶
Data specifications for the model.
Migrate deprecated ‘shared’ block type to ‘self’.
- Return type:
- rope_frequency_config()¶
- pos_embed_config()¶
- bias_mlp_config()¶
- Return type:
- perceiver_block_config()¶
- domain_decoder_configs()¶
Per-domain decoder projection configs, keyed by domain name.
- conditioner_config()¶
Configuration for the scalar conditioner module.
- set_condition_dim()¶
Set condition_dim in transformer_block_config based on data_specs.
- Return type:
- geometry_transformer_block_config()¶
Transformer block config for geometry encoder, with condition_dim set to geometry_conditioning_dims.
- geometry_conditioner_config()¶
Configuration for the scalar conditioner module.
- validate_parameters()¶
Validate parameters across the model and its submodules.
Ensures that hidden_dim is consistent across parent and all submodules. Note: transformer_block_config validates hidden_dim % num_heads == 0 in its own validator.
- Return type:
- Parameters:
data (Any)
- class noether.modeling.models.AnchoredBranchedUPT(config)¶
Bases:
torch.nn.Module
Implementation of the Anchored Branched UPT (AB-UPT) model.
This is an off-the-shelf model: it includes input embedding and output projection, so it can be used directly by providing the appropriate input tensors. See forward() for the expected inputs.
The architecture is fully driven by AnchorBranchedUPTConfig: the geometry encoder depth, the ordering and type of physics blocks, and the per-domain decoder depths are all configured there. For a walkthrough of how to assemble a config (and concrete YAML examples from the aero_cfd and heat_transfer recipes), see Configuring AB-UPT.
- Parameters:
config (AnchorBranchedUPTConfig) – Configuration for the AB-UPT model. See AnchorBranchedUPTConfig for details.
- data_specs¶
- rope¶
- pos_embed¶
- domain_biases¶
- physics_blocks¶
- use_geometry_branch = False¶
- domain_feature_projs: torch.nn.ModuleDict | None = None¶
- domain_decoder_blocks¶
- domain_decoder_projections¶
- geometry_branch_forward(geometry_position, geometry_supernode_idx, geometry_batch_idx, condition, geometry_attn_kwargs)¶
Forward pass through the geometry branch of the model.
- Parameters:
geometry_position (torch.Tensor)
geometry_supernode_idx (torch.Tensor)
geometry_batch_idx (torch.Tensor)
condition (torch.Tensor | None)
geometry_attn_kwargs (dict[str, torch.Tensor])
- Return type:
- build_physics_input(domain_anchor_positions=None, domain_query_positions=None, domain_anchor_features=None, domain_query_features=None)¶
Build the physics-block input tensor and combined per-domain positions.
Each per-domain segment is [anchors | queries] with positional biases plus projected features (when data_specs.domains[name].feature_dim was set on the config). Domains are concatenated in self.domain_names order.
- Returns:
Tuple of (x_physics, physics_positions). x_physics has shape (B, total_tokens, hidden_dim). physics_positions maps each domain name to its concatenated [anchors | queries] positions and can be passed directly to create_rope_frequencies().
- Parameters:
domain_anchor_positions (dict[str, torch.Tensor] | None)
domain_query_positions (dict[str, torch.Tensor] | None)
domain_anchor_features (dict[str, torch.Tensor] | None)
domain_query_features (dict[str, torch.Tensor] | None)
- Return type:
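The token ordering described for build_physics_input() can be made concrete with a small index calculation. The sketch below only computes where each domain's anchors and queries would sit along the token axis; physics_token_layout is a hypothetical helper, not part of the model, and it works on token counts rather than tensors.

```python
def physics_token_layout(domain_names, num_anchors, num_queries=None):
    """Return per-domain (start, stop) token ranges and the total token count.

    Each domain contributes one [anchors | queries] segment, and segments
    are laid out in domain_names order.
    """
    num_queries = num_queries or {}
    layout = {}
    offset = 0
    for name in domain_names:
        n_anchor = num_anchors[name]
        n_query = num_queries.get(name, 0)  # queries are optional per domain
        layout[name] = {
            "anchors": (offset, offset + n_anchor),
            "queries": (offset + n_anchor, offset + n_anchor + n_query),
        }
        offset += n_anchor + n_query
    return layout, offset


layout, total_tokens = physics_token_layout(
    ["surface", "volume"],
    num_anchors={"surface": 4, "volume": 6},
    num_queries={"surface": 2},
)
print(layout)
print(total_tokens)
```

With these counts, total_tokens is the size of the second axis of an x_physics tensor of shape (B, total_tokens, hidden_dim).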
- physics_blocks_forward(x_physics, geometry_encoding, physics_token_specs, physics_attn_kwargs, physics_perceiver_attn_kwargs, condition, physics_blocks_cache=None)¶
Run the physics-block stack on a pre-built input tensor.
Perceiver blocks always re-project K/V from geometry_encoding and contribute None to the returned cache; only transformer blocks cache their anchor self-attention K/V.
- Parameters:
x_physics (torch.Tensor)
geometry_encoding (torch.Tensor | None)
physics_token_specs (list[noether.core.schemas.modules.attention.TokenSpec])
condition (torch.Tensor | None)
physics_blocks_cache (list[LayerCache | None] | None)
- Return type:
tuple[torch.Tensor, list[LayerCache | None]]
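The caching rule for the physics stack can be sketched without any tensors. In this toy version every non-perceiver block is treated as a transformer block that stores its anchor K/V once and reuses it on later calls, which is an assumption about the details; run_physics_stack and fake_anchor_kv are hypothetical names.

```python
def run_physics_stack(block_types, compute_anchor_kv, cache=None):
    """Return one cache entry per block, with None for perceiver blocks."""
    new_cache = []
    for index, block_type in enumerate(block_types):
        if block_type.startswith("perceiver"):
            # K/V come from the geometry encoding on every call, nothing is cached.
            new_cache.append(None)
            continue
        cached = cache[index] if cache is not None else None
        new_cache.append(cached if cached is not None else compute_anchor_kv(index))
    return new_cache


calls = []


def fake_anchor_kv(index):
    """Stand-in for the K/V projection; records which blocks were computed."""
    calls.append(index)
    return f"kv{index}"


blocks = ["perceiver", "self", "cross"]
first = run_physics_stack(blocks, fake_anchor_kv)
second = run_physics_stack(blocks, fake_anchor_kv, cache=first)
print(first)
print(calls)
```

The second call reuses the entries from the first, so the recorded calls do not grow.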
- decoder_blocks_forward(x_physics, physics_token_specs, per_domain_token_specs, decoder_attn_kwargs, condition, decoders_cache=None)¶
Forward pass through the per-domain decoder blocks.
- Returns:
Tuple of (domain_predictions, new_domain_caches).
- Parameters:
x_physics (torch.Tensor)
physics_token_specs (list[noether.core.schemas.modules.attention.TokenSpec])
per_domain_token_specs (dict[str, list[noether.core.schemas.modules.attention.TokenSpec]])
condition (torch.Tensor | None)
- Return type:
- create_rope_frequencies(physics_positions, geometry_position=None, geometry_supernode_idx=None, geometry_rope=None)¶
Create RoPE frequencies for all relevant positions.
- Parameters:
physics_positions (dict[str, torch.Tensor]) – Per-domain combined [anchors | queries] positions, as returned by build_physics_input().
geometry_position (torch.Tensor | None) – Geometry mesh coordinates (optional).
geometry_supernode_idx (torch.Tensor | None) – Geometry supernode indices (optional).
geometry_rope (torch.Tensor | None) – Precomputed geometry-supernode RoPE. When provided, bypasses geometry_position / geometry_supernode_idx for the perceiver k_freqs (needed in queries-only mode where geometry inputs aren’t available).
- Returns:
Tuple of (geometry_attn_kwargs, decoder_attn_kwargs, physics_perceiver_attn_kwargs, physics_attn_kwargs, geometry_rope). geometry_rope is the RoPE tensor used or computed (or None when there’s no geometry branch).
- Return type:
tuple[dict[str, Any], dict[str, dict[str, Any]], dict[str, Any], dict[str, Any], torch.Tensor | None]
- forward(geometry_position=None, geometry_supernode_idx=None, geometry_batch_idx=None, domain_anchor_positions=None, domain_query_positions=None, domain_anchor_features=None, domain_query_features=None, conditioning_inputs=None, geometry_conditioning_inputs=None, kv_cache=None)¶
Forward pass of the AB-UPT model.
Example:
    model(
        geometry_position=...,
        geometry_supernode_idx=...,
        geometry_batch_idx=...,
        domain_anchor_positions={"surface": surface_pos, "volume": volume_pos},
        domain_query_positions={"surface": query_pos},
        domain_anchor_features={"surface": surface_features, "volume": volume_features},
        domain_query_features={"surface": query_features},
        conditioning_inputs={"geometry_design_parameters": design_params},
    )
- Parameters:
geometry_position (torch.Tensor | None) – Coordinates of the geometry mesh. Tensor of shape (B * N_geometry, D_pos).
geometry_supernode_idx (torch.Tensor | None) – Supernode indices for the geometry points.
geometry_batch_idx (torch.Tensor | None) – Batch indices for the geometry points.
domain_anchor_positions (dict[str, torch.Tensor] | None) – Per-domain anchor positions, e.g. {"surface": (B, N, D), "volume": (B, M, D)}.
domain_query_positions (dict[str, torch.Tensor] | None) – Per-domain query positions (optional).
domain_anchor_features (dict[str, torch.Tensor] | None) – Per-domain anchor input features (optional), matching the shape of domain_anchor_positions.
domain_query_features (dict[str, torch.Tensor] | None) – Per-domain query input features (optional), matching the shape of domain_query_positions.
conditioning_inputs (dict[str, torch.Tensor] | None) – Conditioning tensors for physics + decoder blocks, e.g. {"geometry_design_parameters": (B, D)}.
geometry_conditioning_inputs (dict[str, torch.Tensor] | None) – Conditioning tensors for the geometry branch. When None and conditioning_inputs is set, the geometry branch automatically reuses conditioning_inputs if the configured geometry_conditioning_dims matches data_specs.conditioning_dims (the common case). Pass an explicit dict to feed a different conditioning to geometry, or leave it None when the geometry branch is unconditioned.
kv_cache (ModelKVCache | None) – KV cache from a previous forward call.
- Returns:
Tuple of (predictions, kv_cache).
- Return type:
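The fallback described for geometry_conditioning_inputs can be written as a short decision function. This is a sketch of the documented rule only, using plain dicts in place of tensors and dimension specs; select_geometry_conditioning is a hypothetical name and the real forward() may handle further cases.

```python
def select_geometry_conditioning(
    conditioning_inputs,
    geometry_conditioning_inputs,
    conditioning_dims,
    geometry_conditioning_dims,
):
    """Pick the conditioning dict that would be fed to the geometry branch."""
    if geometry_conditioning_inputs is not None:
        return geometry_conditioning_inputs  # explicit dict always wins
    if conditioning_inputs is not None and geometry_conditioning_dims == conditioning_dims:
        return conditioning_inputs  # common case: reuse the model-wide conditioning
    return None  # geometry branch stays unconditioned


dims = {"geometry_design_parameters": 4}
cond = {"geometry_design_parameters": "tensor placeholder"}

reused = select_geometry_conditioning(cond, None, dims, dims)
unconditioned = select_geometry_conditioning(cond, None, dims, {})
explicit = select_geometry_conditioning(cond, {"other": "tensor placeholder"}, dims, {"other": 2})
print(reused is cond, unconditioned, explicit)
```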
- class noether.modeling.models.AeroABUPT(model_config, **kwargs)¶
Bases:
noether.core.models.model.Model
Aerodynamic Anchored-Branched UPT wrapper.
Bridges the factory’s (config, **kwargs) instantiation pattern to the core model. Converts flat kwargs (surface_anchor_position, volume_anchor_position, …) into the domain-dict format expected by AnchoredBranchedUPT.
Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.
- Parameters:
model_config (noether.modeling.models.ab_upt.AnchorBranchedUPTConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter – The UpdateCounter provided to the optimizer.
is_frozen – If true, will set requires_grad of all parameters to false. Will also put the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
path_provider – PathProvider used by the initializer to store or retrieve checkpoints.
data_container – DataContainer which includes the data and dataloader. This is currently unused but helpful for quick prototyping only, evaluating forward in debug mode, etc.
- backbone¶
- forward(**kwargs)¶
- Return type:
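The flat-kwargs to domain-dict conversion that AeroABUPT performs might look roughly like the sketch below. Only surface_anchor_position and volume_anchor_position are named in the description above; the other suffixes in SUFFIX_TO_ARGUMENT and the to_domain_dicts helper are assumptions made for illustration, with strings standing in for tensors.

```python
# Assumed mapping from flat-kwarg suffix to the dict argument of the core model.
SUFFIX_TO_ARGUMENT = {
    "anchor_position": "domain_anchor_positions",
    "query_position": "domain_query_positions",
    "anchor_features": "domain_anchor_features",
    "query_features": "domain_query_features",
}


def to_domain_dicts(domain_names, **flat):
    """Group '<domain>_<suffix>' kwargs into per-domain dicts; pass the rest through."""
    grouped = {}
    passthrough = {}
    for key, value in flat.items():
        for name in domain_names:
            suffix = key[len(name) + 1:]
            if key.startswith(name + "_") and suffix in SUFFIX_TO_ARGUMENT:
                grouped.setdefault(SUFFIX_TO_ARGUMENT[suffix], {})[name] = value
                break
        else:
            passthrough[key] = value  # e.g. geometry inputs, left untouched
    return {**passthrough, **grouped}


converted = to_domain_dicts(
    ["surface", "volume"],
    geometry_position="G",
    surface_anchor_position="S",
    volume_anchor_position="V",
    surface_query_position="Q",
)
print(converted)
```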
- class noether.modeling.models.AeroTransformer(model_config, **kwargs)¶
Bases:
noether.core.models.model.Model
Aerodynamic Transformer wrapper.
End-to-end forward for aero CFD: positional encoding, optional RoPE, optional physics features, surface/volume bias, Transformer backbone, output projection, and output gathering.
Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.
- Parameters:
model_config (AeroTransformerConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter – The UpdateCounter provided to the optimizer.
is_frozen – If true, will set requires_grad of all parameters to false. Will also put the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
path_provider – PathProvider used by the initializer to store or retrieve checkpoints.
data_container – DataContainer which includes the data and dataloader. This is currently unused but helpful for quick prototyping only, evaluating forward in debug mode, etc.
- data_specs¶
- use_rope¶
- pos_embed¶
- surface_bias¶
- volume_bias¶
- use_physics_features¶
- backbone¶
- norm¶
- out¶
- forward(surface_position, volume_position, surface_features=None, volume_features=None)¶
- Parameters:
surface_position (torch.Tensor)
volume_position (torch.Tensor)
surface_features (torch.Tensor | None)
volume_features (torch.Tensor | None)
- Return type:
- class noether.modeling.models.AeroTransformerConfig(/, **data)¶
Bases:
noether.modeling.models.transformer.TransformerConfig
Transformer config extended with aerodynamic data specifications.
- Parameters:
data (Any)
- data_specs: noether.data.schemas.ModelDataSpecs¶
- class noether.modeling.models.AeroTransolver(model_config, **kwargs)¶
Bases:
noether.core.models.model.Model
Aerodynamic Transolver wrapper.
Like AeroTransformer but adds the Transolver-specific learnable placeholder parameter.
Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.
- Parameters:
model_config (AeroTransolverConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter – The UpdateCounter provided to the optimizer.
is_frozen – If true, will set requires_grad of all parameters to false. Will also put the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
path_provider – PathProvider used by the initializer to store or retrieve checkpoints.
data_container – DataContainer which includes the data and dataloader. This is currently unused but helpful for quick prototyping only, evaluating forward in debug mode, etc.
- data_specs¶
- pos_embed¶
- surface_bias¶
- volume_bias¶
- use_physics_features¶
- placeholder¶
- backbone¶
- norm¶
- out¶
- forward(surface_position, volume_position, surface_features=None, volume_features=None)¶
- Parameters:
surface_position (torch.Tensor)
volume_position (torch.Tensor)
surface_features (torch.Tensor | None)
volume_features (torch.Tensor | None)
- Return type:
- class noether.modeling.models.AeroTransolverConfig(/, **data)¶
Bases:
noether.modeling.models.transolver.TransolverConfig
Transolver config extended with aerodynamic data specifications.
- Parameters:
data (Any)
- data_specs: noether.data.schemas.ModelDataSpecs¶
- class noether.modeling.models.AeroUPT(model_config, **kwargs)¶
Bases:
noether.core.models.model.Model
Aerodynamic UPT wrapper.
Combines separate surface/volume query positions into the single query_position that the core UPT expects, and splits outputs using ModelDataSpecs. Supports optional surface/volume bias layers and physics feature projection on queries.
Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.
- Parameters:
model_config (noether.modeling.models.upt.UPTConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter – The UpdateCounter provided to the optimizer.
is_frozen – If true, will set requires_grad of all parameters to false. Will also put the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
path_provider – PathProvider used by the initializer to store or retrieve checkpoints.
data_container – DataContainer which includes the data and dataloader. This is currently unused but helpful for quick prototyping only, evaluating forward in debug mode, etc.
- backbone¶
- data_specs¶
- use_bias_layers¶
- use_physics_features¶
- forward(surface_position_batch_idx, surface_position_supernode_idx, surface_position, surface_query_position, volume_query_position, surface_query_features=None, volume_query_features=None)¶
- Parameters:
surface_position_batch_idx (torch.Tensor)
surface_position_supernode_idx (torch.Tensor)
surface_position (torch.Tensor)
surface_query_position (torch.Tensor)
volume_query_position (torch.Tensor)
surface_query_features (torch.Tensor | None)
volume_query_features (torch.Tensor | None)
- Return type:
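The combine-then-split pattern described for AeroUPT can be sketched with plain lists standing in for tensors. The surface-first ordering and the helper names combine_queries and split_predictions are assumptions; the real wrapper splits outputs using ModelDataSpecs, which is not modeled here.

```python
def combine_queries(surface_query, volume_query):
    """Concatenate the two query sets and remember how many points each had."""
    combined = list(surface_query) + list(volume_query)
    return combined, (len(surface_query), len(volume_query))


def split_predictions(predictions, sizes):
    """Undo combine_queries on the per-point predictions."""
    n_surface, n_volume = sizes
    return {
        "surface": predictions[:n_surface],
        "volume": predictions[n_surface:n_surface + n_volume],
    }


surface_query = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
volume_query = [[0.0, 1.0, 2.0]]

query_position, sizes = combine_queries(surface_query, volume_query)
fake_predictions = ["s0", "s1", "v0"]  # one entry per query point
outputs = split_predictions(fake_predictions, sizes)
print(len(query_position), outputs)
```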
- class noether.modeling.models.Transformer(config)¶
Bases:
torch.nn.Module
Implementation of a Transformer model.
- Parameters:
config (TransformerConfig) – Configuration of the Transformer model.
- blocks¶
- forward(x, attn_kwargs, condition=None)¶
Forward pass of the Transformer model.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, seq_len, hidden_dim).
attn_kwargs (dict[str, torch.Tensor]) – Additional arguments for the attention mechanism.
condition (torch.Tensor | None) – Optional conditioning vector of shape (batch_size, condition_dim) consumed by each block’s AdaLN-Zero modulation. None (default) for unconditioned models.
- Returns:
Output tensor after processing through the Transformer model.
- Return type:
- class noether.modeling.models.TransformerConfig(/, **data)¶
Bases:
noether.core.models.base.ModelBaseConfig, noether.core.schemas.mixins.InjectSharedFieldFromParentMixin
Configuration for a Transformer model.
- Parameters:
data (Any)
- model_config¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- hidden_dim¶
Hidden dimension of the model. Used for all transformer blocks.
- transformer_block_config: Annotated[noether.modeling.modules.blocks.transformer.TransformerBlockConfig, noether.core.schemas.mixins.Shared]¶
- class noether.modeling.models.Transolver(config)¶
Bases:
noether.modeling.models.transformer.Transformer
Implementation of the Transolver model.
Reference code: https://github.com/thuml/Transolver/
Paper: https://arxiv.org/abs/2402.02366
Transolver is a Transformer with a special physics attention mechanism. Hence, we extend the Transformer class and configure it accordingly.
- Parameters:
config (TransolverConfig) – Configuration of the Transolver model.
- class noether.modeling.models.TransolverConfig(/, **data)¶
Bases:
noether.modeling.models.transformer.TransformerConfig, noether.core.models.base.ModelBaseConfig
Configuration for a Transolver model.
- Parameters:
data (Any)
- model_config¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- set_attention_constructor()¶
Set attention_constructor in transformer_block_config based on data_specs.
- Return type:
- class noether.modeling.models.TransolverPlusPlusConfig(/, **data)¶
Bases:
TransolverConfig
Configuration for a Transolver++ model.
- Parameters:
data (Any)
- set_attention_constructor()¶
Set attention_constructor in transformer_block_config based on data_specs.
- Return type:
- class noether.modeling.models.UPT(config)¶
Bases:
torch.nn.Module
Implementation of the UPT (Universal Physics Transformer) model.
- Parameters:
config (UPTConfig) – Configuration for the UPT model. See UPTConfig for details.
- use_rope¶
- encoder¶
- pos_embed¶
- approximator_blocks¶
- decoder¶
- norm¶
- prediction_layer¶
- compute_rope_args(geometry_batch_idx, geometry_position, geometry_supernode_idx, query_position)¶
Compute the RoPE frequency arguments for the geometry and query positions. If RoPE is not used, return empty dicts.
- Parameters:
geometry_batch_idx (torch.Tensor)
geometry_position (torch.Tensor)
geometry_supernode_idx (torch.Tensor)
query_position (torch.Tensor)
- Return type:
tuple[dict[str, torch.Tensor], dict[str, torch.Tensor]]
- forward(geometry_batch_idx, geometry_supernode_idx, geometry_position, query_position)¶
Forward pass of the UPT model.
- Parameters:
geometry_batch_idx (torch.Tensor) – Batch indices for the geometry positions.
geometry_supernode_idx (torch.Tensor) – Supernode indices for the geometry positions.
geometry_position (torch.Tensor) – Input coordinates of the geometry mesh points.
query_position (torch.Tensor) – Input coordinates of the query points.
- Returns:
Output tensor containing the predictions at query positions.
- Return type:
- class noether.modeling.models.UPTConfig(/, **data)¶
Bases:
noether.core.models.base.ModelBaseConfig, noether.core.schemas.mixins.InjectSharedFieldFromParentMixin
Configuration for a UPT model.
- Parameters:
data (Any)
- model_config¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- hidden_dim¶
Hidden dimension of the model.
- supernode_pooling_config: Annotated[noether.modeling.modules.encoders.supernode_pooling.SupernodePoolingConfig, noether.core.schemas.mixins.Shared]¶
- approximator_config: Annotated[noether.modeling.modules.blocks.transformer.TransformerBlockConfig, noether.core.schemas.mixins.Shared]¶
- decoder_config: Annotated[noether.modeling.modules.decoders.deep_perceiver.DeepPerceiverDecoderConfig, noether.core.schemas.mixins.Shared]¶
- data_specs: noether.data.schemas.ModelDataSpecs¶
- linear_output_projection_config()¶
- rope_frequency_config()¶
- validate_rope_usage()¶
Ensure that if use_rope is True in the main config, it is also True in the approximator_config.
- Return type:
- pos_embedding_config()¶
- class noether.modeling.models.ViT(config)¶
Bases:
torch.nn.Module
Vision Transformer for spatial regression on continuous-coordinate grids.
Based on the ViT paper (https://arxiv.org/pdf/2010.11929) with several modifications, such as:
Continuous coordinate inputs with sincos positional embedding and RoPE (vs. learned 1D position embeddings).
Optional AdaLN-Zero conditioning, à la DiT (https://arxiv.org/abs/2212.09748).
RMSNorm and QK-norm in attention (vs. LayerNorm only).
- Parameters:
config (ViTConfig) – Configuration for the ViT model. See ViTConfig for available options.
- coord_dim¶
- out_channels¶
- patch_size¶
- num_heads¶
- token_dropout¶
- use_conditioning¶
- pool_patch¶
- mask_patchify¶
- pos_embedding¶
- rope¶
- backbone¶
- use_conv_output_head¶
- initialize_weights()¶
Initialize backbone weights.
- Return type:
None
- unpatchify(x, grid_h, grid_w)¶
Linear unpatchify: (B, L, p²·C_out) → (B, H, W, C_out).
- Parameters:
x (torch.Tensor)
grid_h (int)
grid_w (int)
- Return type:
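The shape mapping (B, L, p²·C_out) → (B, H, W, C_out) can be traced on a single sample with nested lists. The sketch below is not the model's implementation: it assumes tokens are ordered row-major over the patch grid and that each token stores its p × p cells row-major with channels last, and unpatchify_single is a hypothetical helper that takes patch_size and out_channels explicitly.

```python
def unpatchify_single(tokens, grid_h, grid_w, patch_size, out_channels):
    """Rearrange L = grid_h * grid_w tokens into an H x W x C nested list."""
    p = patch_size
    if len(tokens) != grid_h * grid_w:
        raise ValueError("number of tokens must equal grid_h * grid_w")
    height, width = grid_h * p, grid_w * p
    image = [[None] * width for _ in range(height)]
    for token_idx, token in enumerate(tokens):
        if len(token) != p * p * out_channels:
            raise ValueError("each token must hold p * p * out_channels values")
        patch_row, patch_col = divmod(token_idx, grid_w)
        for i in range(p):
            for j in range(p):
                start = (i * p + j) * out_channels  # assumed in-patch ordering
                cell = token[start:start + out_channels]
                image[patch_row * p + i][patch_col * p + j] = cell
    return image


# One row of two 2x2 patches with a single output channel.
tokens = [[0, 1, 2, 3], [4, 5, 6, 7]]
image = unpatchify_single(tokens, grid_h=1, grid_w=2, patch_size=2, out_channels=1)
print(image)
```

Here L = 2 tokens of length 4 become a 2 × 4 grid with one channel per cell.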
- forward(x, coords, mask=None, cond=None, return_tokens=False)¶
Run the standard ViT.
- Parameters:
x (torch.Tensor | None) – Optional pre-computed patch embeddings of shape (B, L, hidden_dim). When None, tokens come purely from positional encoding.
coords (torch.Tensor) – Per-cell coordinates of shape (B, H, W, coord_dim).
mask (torch.Tensor | None) – Optional per-cell fluid mask of shape (B, H, W).
cond (torch.Tensor | None) – AdaLN conditioning vector of shape (B, hidden_dim). Required when the ViT was built with use_conditioning=True (the default); must be None otherwise.
return_tokens (bool) – If True, return raw post-FinalLayer tokens plus (grid_h, grid_w) instead of the decoded spatial output.
- Returns:
Either (B, H, W, out_channels) or (tokens, (grid_h, grid_w)) if return_tokens.
- Return type:
torch.Tensor | tuple[torch.Tensor, tuple[int, int]]
- class noether.modeling.models.ViTConfig(/, **data)¶
Bases:
noether.core.models.base.ModelBaseConfig
Configuration for ViT model.
- Parameters:
data (Any)
- model_config¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- patch_size: int = None¶
Patch side length in cells. The grid resolution must be divisible by this value.
- hidden_dim¶
Token hidden dimension throughout the transformer stack.
- use_conditioning: bool = True¶
If True, enable AdaLN-Zero conditioning (forward requires cond); if False, plain ViT (cond must be None).
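The two input constraints stated for ViTConfig (grid resolution divisible by patch_size, and cond present exactly when use_conditioning is True) can be checked up front. The function below is a hypothetical pre-flight helper, not part of the library; it also returns the patch grid size that such a resolution would produce.

```python
def check_vit_inputs(height, width, patch_size, use_conditioning, cond):
    """Validate the documented constraints and return (grid_h, grid_w)."""
    if height % patch_size != 0 or width % patch_size != 0:
        raise ValueError("grid resolution must be divisible by patch_size")
    if use_conditioning and cond is None:
        raise ValueError("cond is required when use_conditioning=True")
    if not use_conditioning and cond is not None:
        raise ValueError("cond must be None when use_conditioning=False")
    return height // patch_size, width // patch_size


placeholder_cond = object()  # stands in for a (B, hidden_dim) tensor
grid = check_vit_inputs(64, 128, patch_size=16, use_conditioning=True, cond=placeholder_cond)
print(grid)
```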
- use_conv_output_head: bool = True¶
If True, decode via a cascaded PixelShuffle conv head; if False, decode via a linear unpatchify.
- property transformer_block_config: noether.modeling.modules.blocks.transformer.TransformerBlockConfig¶