noether.modeling.models¶

Classes¶

`AnchorBranchedUPTConfig`	Configuration for the Anchored Branched UPT (AB-UPT) model.
`AnchoredBranchedUPT`	Implementation of the Anchored Branched UPT (AB-UPT) model.
`AeroABUPT`	Aerodynamic Anchored-Branched UPT wrapper.
`AeroTransformer`	Aerodynamic Transformer wrapper.
`AeroTransformerConfig`	Transformer config extended with aerodynamic data specifications.
`AeroTransolver`	Aerodynamic Transolver wrapper.
`AeroTransolverConfig`	Transolver config extended with aerodynamic data specifications.
`AeroUPT`	Aerodynamic UPT wrapper.
`Transformer`	Implementation of a Transformer model.
`TransformerConfig`	Configuration for a Transformer model.
`Transolver`	Implementation of the Transolver model.
`TransolverConfig`	Configuration for a Transolver model.
`TransolverPlusPlusConfig`	Configuration for a Transolver++ model.
`UPT`	Implementation of the UPT (Universal Physics Transformer) model.
`UPTConfig`	Configuration for a UPT model.
`ViT`	Vision Transformer for spatial regression on continuous-coordinate grids.
`ViTConfig`	Configuration for a ViT model.

Package Contents¶

class noether.modeling.models.AnchorBranchedUPTConfig(/, **data)¶

Bases: noether.core.models.base.ModelBaseConfig, noether.core.schemas.mixins.InjectSharedFieldFromParentMixin

Configuration for the Anchored Branched UPT (AB-UPT) model.

AB-UPT is built from three configurable stages:

Geometry encoder (optional): a SupernodePooling encoder followed by geometry_depth standard transformer blocks. Only instantiated when at least one perceiver / perceiver_untied block is present in physics_blocks and supernode_pooling_config is provided.
Physics trunk: a stack of blocks listed in physics_blocks operating on per-domain anchor (and optionally query) tokens. The block string controls the attention pattern and weight sharing — see physics_blocks below.
Per-domain decoder (optional): num_domain_decoder_blocks[name] self-attention blocks with untied weights per domain, followed by a linear projection to that domain’s output fields.

hidden_dim is a shared field — it is auto-injected into transformer_block_config and supernode_pooling_config via InjectSharedFieldFromParentMixin, so it only needs to be set once at the top level. See Configuration Inheritance.
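The effect of a shared field can be sketched with plain dictionaries. This is only an illustration of the idea, not the InjectSharedFieldFromParentMixin implementation: the helper inject_shared_field is hypothetical, and the sketch assumes the parent value is filled in only where a sub-config does not set it.

```python
from copy import deepcopy


def inject_shared_field(parent: dict, field: str, children: list) -> dict:
    """Copy parent[field] into each nested child config that does not set it.

    Hypothetical helper; the real mixin may resolve conflicts differently.
    """
    resolved = deepcopy(parent)
    for child in children:
        child_cfg = resolved.get(child)
        if child_cfg is None:  # optional sub-config that was not provided
            continue
        child_cfg.setdefault(field, resolved[field])
    return resolved


config = {
    "hidden_dim": 192,
    "transformer_block_config": {"num_heads": 3},
    "supernode_pooling_config": None,  # geometry encoder disabled
}
resolved = inject_shared_field(
    config, "hidden_dim", ["transformer_block_config", "supernode_pooling_config"]
)
print(resolved["transformer_block_config"])
```

The input dictionary is left untouched, so the same top-level config can be resolved more than once.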

Configuration guide¶

See Configuring AB-UPT for a step-by-step walkthrough of how to compose physics blocks, choose between tied and _untied variants, and wire up the per-domain decoder.

Concrete examples (YAML):

Aerodynamics (multi-domain, surface + volume): recipes/aero_cfd/configs/model/ab_upt.yaml
Heat transfer (single-domain, volume only with parameter conditioning): recipes/heat_transfer/configs/model/ab_upt.yaml

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

kind: str | None = 'noether.core.schemas.models.AnchorBranchedUPTConfig'¶: Kind of model to use, i.e. class path

model_config¶: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

supernode_pooling_config: Annotated[noether.modeling.modules.encoders.supernode_pooling.SupernodePoolingConfig, noether.core.schemas.mixins.Shared] | None = None¶

transformer_block_config: Annotated[noether.modeling.modules.blocks.transformer.TransformerBlockConfig, noether.core.schemas.mixins.Shared]¶

geometry_depth: int = None¶: Number of transformer blocks in the geometry encoder.

hidden_dim: int = None¶: Hidden dimension of the model.

condition_dim: int | None = None¶

physics_blocks: list[Literal['self', 'shared', 'cross', 'joint', 'perceiver', 'self_untied', 'cross_untied', 'joint_untied', 'perceiver_untied']]¶

Types of physics blocks to use in the model.

self / shared: Self-attention within a branch/domain. Weights are shared between all domains.
cross: Cross-attention between domains. Each domain attends to all other domains’ anchors; weights are shared.
joint: Joint attention over all domain points, i.e. full self-attention over all points; weights are shared.
perceiver: Perceiver-style cross-attention to the geometry encoding.
self_untied: Self-attention within a branch with untied weights for each domain.
cross_untied: Cross-attention between domains with untied weights for each domain.
joint_untied: Joint attention over all domain points with untied weights for each domain.
perceiver_untied: Perceiver cross-attention to the geometry encoding with untied weights per domain.

Note: “shared” is a deprecated alias for “self” and will be removed in a future release.
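As a rough sketch of how a block list can be interpreted, the snippet below normalizes the deprecated alias and checks the two conditions stated above for instantiating the geometry encoder. The function names are hypothetical and this is not the library's validator code.

```python
PERCEIVER_BLOCKS = {"perceiver", "perceiver_untied"}


def normalize_blocks(blocks):
    """Replace the deprecated 'shared' alias with 'self'."""
    return ["self" if block == "shared" else block for block in blocks]


def is_untied(block):
    """Untied variants keep separate weights per domain."""
    return block.endswith("_untied")


def needs_geometry_encoder(blocks, has_supernode_pooling_config):
    """The geometry encoder is only built when a perceiver block is present
    and a supernode pooling config is provided."""
    has_perceiver = any(block in PERCEIVER_BLOCKS for block in blocks)
    return has_perceiver and has_supernode_pooling_config


blocks = normalize_blocks(["perceiver", "shared", "cross", "self_untied"])
print(blocks)
print(needs_geometry_encoder(blocks, has_supernode_pooling_config=True))
```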

num_domain_decoder_blocks: dict[str, int] = None¶: Number of final domain-specific decoder blocks with self attention and no weight sharing, e.g. {"surface": 2, "volume": 2}.

init_weights: noether.core.types.InitWeightsMode = None¶: Weight initialization of linear layers. Defaults to “truncnormal002”.

drop_path_rate: float = None¶: Drop path rate for stochastic depth. Defaults to 0.0 (no drop path).

geometry_conditioning_dims: noether.data.schemas.FieldDimSpec | None = None¶: Per-named-field conditioning spec for geometry transformer blocks. When left unset, defaults to data_specs.conditioning_dims so the geometry branch sees the same conditioning as the rest of the model. An explicit empty FieldDimSpec (total_dim == 0) opts out — useful for diffusion, where timestep modulation should touch physics + per-domain decoders but not the geometry branch (geometry is invariant to noise level).

data_specs: noether.data.schemas.ModelDataSpecs¶: Data specifications for the model.

migrate_shared_to_self()¶

Migrate deprecated ‘shared’ block type to ‘self’.

Return type:: AnchorBranchedUPTConfig

rope_frequency_config()¶

Return type:: noether.modeling.modules.layers.rope_frequency.RopeFrequencyConfig

pos_embed_config()¶

Return type:: noether.modeling.modules.layers.continuous_sincos_embed.ContinuousSincosEmbeddingConfig

bias_mlp_config()¶

Return type:: noether.modeling.modules.mlp.MLPConfig

perceiver_block_config()¶

Return type:: noether.modeling.modules.blocks.perceiver.PerceiverBlockConfig

domain_decoder_configs()¶

Per-domain decoder projection configs, keyed by domain name.

Return type:: dict[str, noether.modeling.modules.layers.linear_projection.LinearProjectionConfig]

conditioner_config()¶

Configuration for the scalar conditioner module.

Return type:: noether.modeling.modules.layers.vectors_conditioner.VectorsConditionerConfig

set_condition_dim()¶

Set condition_dim in transformer_block_config based on data_specs.

Return type:: AnchorBranchedUPTConfig

geometry_transformer_block_config()¶

Transformer block config for geometry encoder, with condition_dim set to geometry_conditioning_dims.

Return type:: noether.modeling.modules.blocks.transformer.TransformerBlockConfig

geometry_conditioner_config()¶

Configuration for the scalar conditioner module used by the geometry branch.

Return type:: noether.modeling.modules.layers.vectors_conditioner.VectorsConditionerConfig

validate_parameters()¶

Validate parameters across the model and its submodules.

Ensures that hidden_dim is consistent across parent and all submodules. Note: transformer_block_config validates hidden_dim % num_heads == 0 in its own validator.

Return type:: AnchorBranchedUPTConfig
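The kind of cross-module check described above can be sketched as follows. This is an illustrative stand-in that uses plain dictionaries instead of the real config classes, and for brevity it folds the head-divisibility check (which the documentation says lives in transformer_block_config's own validator) into the same function.

```python
def validate_hidden_dim(parent_hidden_dim: int, submodules: dict) -> None:
    """Raise ValueError if any sub-config disagrees with the parent hidden_dim.

    `submodules` maps a sub-config name to a dict with 'hidden_dim' and,
    optionally, 'num_heads' (hypothetical representation).
    """
    for name, cfg in submodules.items():
        if cfg["hidden_dim"] != parent_hidden_dim:
            raise ValueError(
                f"{name}.hidden_dim={cfg['hidden_dim']} does not match "
                f"parent hidden_dim={parent_hidden_dim}"
            )
        num_heads = cfg.get("num_heads")
        if num_heads is not None and parent_hidden_dim % num_heads != 0:
            raise ValueError(
                f"{name}: hidden_dim must be divisible by num_heads={num_heads}"
            )


# A consistent configuration passes silently.
validate_hidden_dim(192, {"transformer_block_config": {"hidden_dim": 192, "num_heads": 3}})
```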

Parameters:: data (Any)

class noether.modeling.models.AnchoredBranchedUPT(config)¶

Bases: torch.nn.Module

Implementation of the Anchored Branched UPT (AB-UPT) model.

This is an off-the-shelf model — it includes input embedding and output projection, so it can be used directly by providing the appropriate input tensors. See forward() for the expected inputs.

The architecture is fully driven by AnchorBranchedUPTConfig: the geometry encoder depth, the ordering and type of physics blocks, and the per-domain decoder depths are all configured there. For a walkthrough of how to assemble a config (and concrete YAML examples from the aero_cfd and heat_transfer recipes), see Configuring AB-UPT.

Parameters:: config (AnchorBranchedUPTConfig) – Configuration for the AB-UPT model. See AnchorBranchedUPTConfig for details.

data_specs¶

rope¶

pos_embed¶

domain_names: list[str]¶

domain_biases¶

hidden_dim¶

physics_blocks¶

use_geometry_branch = False¶

domain_feature_projs: torch.nn.ModuleDict | None = None¶

domain_decoder_blocks¶

domain_decoder_projections¶

geometry_branch_forward(geometry_position, geometry_supernode_idx, geometry_batch_idx, condition, geometry_attn_kwargs)¶

Forward pass through the geometry branch of the model.

Parameters:

geometry_position (torch.Tensor)
geometry_supernode_idx (torch.Tensor)
geometry_batch_idx (torch.Tensor)
condition (torch.Tensor | None)
geometry_attn_kwargs (dict[str, torch.Tensor])

Return type:

torch.Tensor

build_physics_input(domain_anchor_positions=None, domain_query_positions=None, domain_anchor_features=None, domain_query_features=None)¶

Build the physics-block input tensor and combined per-domain positions.

Each per-domain segment is [anchors | queries] with positional biases plus projected features (when data_specs.domains[name].feature_dim was set on the config). Domains are concatenated in self.domain_names order.

Returns:

Tuple of (x_physics, physics_positions). x_physics has shape (B, total_tokens, hidden_dim). physics_positions maps each domain name to its concatenated [anchors | queries] positions and can be passed directly to create_rope_frequencies().

Parameters:

domain_anchor_positions (dict[str, torch.Tensor] | None)
domain_query_positions (dict[str, torch.Tensor] | None)
domain_anchor_features (dict[str, torch.Tensor] | None)
domain_query_features (dict[str, torch.Tensor] | None)

Return type:

tuple[torch.Tensor, dict[str, torch.Tensor]]
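The [anchors | queries] layout can be made concrete by computing where each domain's tokens land along the token axis. The helper below is a hypothetical bookkeeping sketch that works on token counts only; it does not touch tensors or the model itself.

```python
def physics_token_layout(domain_names, num_anchors, num_queries):
    """Return per-domain (start, end) token ranges and the total token count.

    Each domain contributes its anchors followed by its queries, and domains
    are laid out in the order given by `domain_names`.
    """
    layout = {}
    offset = 0
    for name in domain_names:
        anchors = num_anchors[name]
        queries = num_queries.get(name, 0)  # queries are optional per domain
        layout[name] = {
            "anchors": (offset, offset + anchors),
            "queries": (offset + anchors, offset + anchors + queries),
        }
        offset += anchors + queries
    return layout, offset


layout, total_tokens = physics_token_layout(
    ["surface", "volume"],
    num_anchors={"surface": 4, "volume": 6},
    num_queries={"surface": 2},
)
print(layout, total_tokens)
```

With these counts, x_physics would have total_tokens entries along its second axis, and a domain without queries gets an empty query range.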

physics_blocks_forward(x_physics, geometry_encoding, physics_token_specs, physics_attn_kwargs, physics_perceiver_attn_kwargs, condition, physics_blocks_cache=None)¶

Run the physics-block stack on a pre-built input tensor.

Perceiver blocks always re-project K/V from geometry_encoding and contribute None to the returned cache; only transformer blocks cache their anchor self-attention K/V.

Parameters:

x_physics (torch.Tensor)
geometry_encoding (torch.Tensor | None)
physics_token_specs (list[noether.core.schemas.modules.attention.TokenSpec])
physics_attn_kwargs (dict[str, Any])
physics_perceiver_attn_kwargs (dict[str, Any])
condition (torch.Tensor | None)
physics_blocks_cache (list[LayerCache | None] | None)

Return type:

tuple[torch.Tensor, list[LayerCache | None]]

decoder_blocks_forward(x_physics, physics_token_specs, per_domain_token_specs, decoder_attn_kwargs, condition, decoders_cache=None)¶

Forward pass through the per-domain decoder blocks.

Returns:

Tuple of (domain_predictions, new_domain_caches).

Parameters:

x_physics (torch.Tensor)
physics_token_specs (list[noether.core.schemas.modules.attention.TokenSpec])
per_domain_token_specs (dict[str, list[noether.core.schemas.modules.attention.TokenSpec]])
decoder_attn_kwargs (dict[str, dict[str, Any]])
condition (torch.Tensor | None)
decoders_cache (dict[str, list[LayerCache]] | None)

Return type:

tuple[dict[str, torch.Tensor], dict[str, list[LayerCache]]]

create_rope_frequencies(physics_positions, geometry_position=None, geometry_supernode_idx=None, geometry_rope=None)¶

Create RoPE frequencies for all relevant positions.

Parameters:

physics_positions (dict[str, torch.Tensor]) – Per-domain combined [anchors | queries] positions, as returned by build_physics_input().
geometry_position (torch.Tensor | None) – Geometry mesh coordinates (optional).
geometry_supernode_idx (torch.Tensor | None) – Geometry supernode indices (optional).
geometry_rope (torch.Tensor | None) – Precomputed geometry-supernode RoPE. When provided, bypasses geometry_position / geometry_supernode_idx for the perceiver k_freqs (needed in queries-only mode where geometry inputs aren’t available).

Returns:

Tuple of (geometry_attn_kwargs, decoder_attn_kwargs, physics_perceiver_attn_kwargs, physics_attn_kwargs, geometry_rope). geometry_rope is the rope tensor used / computed (or None when there’s no geometry branch).

Return type:

tuple[dict[str, Any], dict[str, dict[str, Any]], dict[str, Any], dict[str, Any], torch.Tensor | None]

forward(geometry_position=None, geometry_supernode_idx=None, geometry_batch_idx=None, domain_anchor_positions=None, domain_query_positions=None, domain_anchor_features=None, domain_query_features=None, conditioning_inputs=None, geometry_conditioning_inputs=None, kv_cache=None)¶

Forward pass of the AB-UPT model.

Example:

model(
    geometry_position=...,
    geometry_supernode_idx=...,
    geometry_batch_idx=...,
    domain_anchor_positions={"surface": surface_pos, "volume": volume_pos},
    domain_query_positions={"surface": query_pos},
    domain_anchor_features={"surface": surface_features, "volume": volume_features},
    domain_query_features={"surface": query_features},
    conditioning_inputs={"geometry_design_parameters": design_params},
)

Parameters:

geometry_position (torch.Tensor | None) – Coordinates of the geometry mesh. Tensor of shape (B * N_geometry, D_pos).
geometry_supernode_idx (torch.Tensor | None) – Supernode indices for the geometry points.
geometry_batch_idx (torch.Tensor | None) – Batch indices for the geometry points.
domain_anchor_positions (dict[str, torch.Tensor] | None) – Per-domain anchor positions, e.g. {"surface": (B, N, D), "volume": (B, M, D)}.
domain_query_positions (dict[str, torch.Tensor] | None) – Per-domain query positions (optional).
domain_anchor_features (dict[str, torch.Tensor] | None) – Per-domain anchor input features (optional), matching the shape of domain_anchor_positions.
domain_query_features (dict[str, torch.Tensor] | None) – Per-domain query input features (optional), matching the shape of domain_query_positions.
conditioning_inputs (dict[str, torch.Tensor] | None) – Conditioning tensors for physics + decoder blocks, e.g. {"geometry_design_parameters": (B, D)}.
geometry_conditioning_inputs (dict[str, torch.Tensor] | None) – Conditioning tensors for the geometry branch. When None and conditioning_inputs is set, the geometry branch automatically reuses conditioning_inputs if the configured geometry_conditioning_dims matches data_specs.conditioning_dims (the common case). Pass an explicit dict to feed a different conditioning to geometry, or leave it None when the geometry branch is unconditioned.
kv_cache (ModelKVCache | None) – KV cache from a previous forward call.

Returns:

Tuple of (predictions, kv_cache).

Return type:

tuple[dict[str, torch.Tensor], ModelKVCache]
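The fallback rule for geometry_conditioning_inputs can be summarized in a few lines. This sketch is an interpretation of the parameter description, not the model's code: the conditioning specs are represented as plain dictionaries, and returning None when the specs differ is an assumption.

```python
def resolve_geometry_conditioning(
    conditioning_inputs,
    geometry_conditioning_inputs,
    geometry_conditioning_dims,
    model_conditioning_dims,
):
    """Pick the conditioning that the geometry branch would receive."""
    if geometry_conditioning_inputs is not None:
        return geometry_conditioning_inputs  # an explicit dict always wins
    if conditioning_inputs is not None and geometry_conditioning_dims == model_conditioning_dims:
        return conditioning_inputs  # common case: reuse the model conditioning
    return None  # geometry branch stays unconditioned (assumed)


dims = {"geometry_design_parameters": 8}
shared = {"geometry_design_parameters": "design_params"}
print(resolve_geometry_conditioning(shared, None, dims, dims))
print(resolve_geometry_conditioning(shared, None, {}, dims))
```

The second call mimics the opt-out case, where an empty geometry conditioning spec keeps timestep-style conditioning away from the geometry branch.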

class noether.modeling.models.AeroABUPT(model_config, **kwargs)¶

Bases: noether.core.models.model.Model

Aerodynamic Anchored-Branched UPT wrapper.

Bridges the factory’s (config, **kwargs) instantiation pattern to the core model. Converts flat kwargs (surface_anchor_position, volume_anchor_position, …) into the domain-dict format expected by AnchoredBranchedUPT.
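A minimal sketch of the flat-to-dict conversion is shown below. Only surface_anchor_position and volume_anchor_position are named in the description above; the other suffixes, the mapping table, and the helper name are assumptions made for illustration.

```python
# Assumed mapping from a flat keyword suffix to the backbone argument name.
SUFFIX_TO_ARGUMENT = {
    "_anchor_position": "domain_anchor_positions",
    "_query_position": "domain_query_positions",
    "_anchor_features": "domain_anchor_features",
    "_query_features": "domain_query_features",
}


def group_by_domain(flat_kwargs, domain_names=("surface", "volume")):
    """Split flat kwargs into per-domain dicts plus untouched passthrough kwargs."""
    grouped, passthrough = {}, {}
    for key, value in flat_kwargs.items():
        for domain in domain_names:
            suffix = key[len(domain):]
            if key.startswith(domain) and suffix in SUFFIX_TO_ARGUMENT:
                grouped.setdefault(SUFFIX_TO_ARGUMENT[suffix], {})[domain] = value
                break
        else:  # no domain prefix matched, so forward the argument unchanged
            passthrough[key] = value
    return grouped, passthrough


grouped, passthrough = group_by_domain(
    {
        "surface_anchor_position": "S",
        "volume_anchor_position": "V",
        "surface_query_position": "Q",
        "geometry_position": "G",
    }
)
print(grouped, passthrough)
```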

Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.

Parameters:

model_config (noether.modeling.models.ab_upt.AnchorBranchedUPTConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter – The UpdateCounter provided to the optimizer.
is_frozen – If true, will set requires_grad of all parameters to false. Will also put the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
path_provider – PathProvider used by the initializer to store or retrieve checkpoints.
data_container – DataContainer which includes the data and dataloader. Currently unused, but helpful for quick prototyping, e.g. evaluating forward in debug mode.

backbone¶

forward(**kwargs)¶

Return type:: dict[str, torch.Tensor]

class noether.modeling.models.AeroTransformer(model_config, **kwargs)¶

Bases: noether.core.models.model.Model

Aerodynamic Transformer wrapper.

End-to-end forward for aero CFD: positional encoding, optional RoPE, optional physics features, surface/volume bias, Transformer backbone, output projection, and output gathering.

Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.

Parameters:

model_config (AeroTransformerConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter – The UpdateCounter provided to the optimizer.
is_frozen – If true, will set requires_grad of all parameters to false. Will also put the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
path_provider – PathProvider used by the initializer to store or retrieve checkpoints.
data_container – DataContainer which includes the data and dataloader. Currently unused, but helpful for quick prototyping, e.g. evaluating forward in debug mode.

data_specs¶

use_rope¶

pos_embed¶

surface_bias¶

volume_bias¶

use_physics_features¶

backbone¶

norm¶

out¶

forward(surface_position, volume_position, surface_features=None, volume_features=None)¶

Parameters:

surface_position (torch.Tensor)
volume_position (torch.Tensor)
surface_features (torch.Tensor | None)
volume_features (torch.Tensor | None)

Return type:

dict[str, torch.Tensor]

class noether.modeling.models.AeroTransformerConfig(/, **data)¶

Bases: noether.modeling.models.transformer.TransformerConfig

Transformer config extended with aerodynamic data specifications.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

data_specs: noether.data.schemas.ModelDataSpecs¶

class noether.modeling.models.AeroTransolver(model_config, **kwargs)¶

Bases: noether.core.models.model.Model

Aerodynamic Transolver wrapper.

Like AeroTransformer but adds the Transolver-specific learnable placeholder parameter.

Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.

Parameters:

model_config (AeroTransolverConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter – The UpdateCounter provided to the optimizer.
is_frozen – If true, will set requires_grad of all parameters to false. Will also put the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
path_provider – PathProvider used by the initializer to store or retrieve checkpoints.
data_container – DataContainer which includes the data and dataloader. Currently unused, but helpful for quick prototyping, e.g. evaluating forward in debug mode.

data_specs¶

pos_embed¶

surface_bias¶

volume_bias¶

use_physics_features¶

placeholder¶

backbone¶

norm¶

out¶

forward(surface_position, volume_position, surface_features=None, volume_features=None)¶

Parameters:

surface_position (torch.Tensor)
volume_position (torch.Tensor)
surface_features (torch.Tensor | None)
volume_features (torch.Tensor | None)

Return type:

dict[str, torch.Tensor]

class noether.modeling.models.AeroTransolverConfig(/, **data)¶

Bases: noether.modeling.models.transolver.TransolverConfig

Transolver config extended with aerodynamic data specifications.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

data_specs: noether.data.schemas.ModelDataSpecs¶

class noether.modeling.models.AeroUPT(model_config, **kwargs)¶

Bases: noether.core.models.model.Model

Aerodynamic UPT wrapper.

Combines separate surface/volume query positions into the single query_position that the core UPT expects, and splits outputs using ModelDataSpecs. Supports optional surface/volume bias layers and physics feature projection on queries.
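The combine-then-split pattern can be illustrated with plain Python lists standing in for tensors. The surface-first ordering and the helper names are assumptions; the actual wrapper splits outputs according to ModelDataSpecs.

```python
def combine_queries(surface_query_position, volume_query_position):
    """Concatenate surface and volume queries and remember the split point."""
    query_position = list(surface_query_position) + list(volume_query_position)
    return query_position, len(surface_query_position)


def split_predictions(predictions, num_surface):
    """Undo the concatenation on the per-query predictions."""
    return {
        "surface": predictions[:num_surface],
        "volume": predictions[num_surface:],
    }


surface = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
volume = [(0.5, 0.5, 0.5)]
query_position, num_surface = combine_queries(surface, volume)

# Stand-in for the backbone: one prediction per query point.
predictions = [f"pred_{i}" for i in range(len(query_position))]
print(split_predictions(predictions, num_surface))
```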

Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.

Parameters:

model_config (noether.modeling.models.upt.UPTConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter – The UpdateCounter provided to the optimizer.
is_frozen – If true, will set requires_grad of all parameters to false. Will also put the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
path_provider – PathProvider used by the initializer to store or retrieve checkpoints.
data_container – DataContainer which includes the data and dataloader. Currently unused, but helpful for quick prototyping, e.g. evaluating forward in debug mode.

backbone¶

data_specs¶

use_bias_layers¶

use_physics_features¶

forward(surface_position_batch_idx, surface_position_supernode_idx, surface_position, surface_query_position, volume_query_position, surface_query_features=None, volume_query_features=None)¶

Parameters:

surface_position_batch_idx (torch.Tensor)
surface_position_supernode_idx (torch.Tensor)
surface_position (torch.Tensor)
surface_query_position (torch.Tensor)
volume_query_position (torch.Tensor)
surface_query_features (torch.Tensor | None)
volume_query_features (torch.Tensor | None)

Return type:

dict[str, torch.Tensor]

class noether.modeling.models.Transformer(config)¶

Bases: torch.nn.Module

Implementation of a Transformer model.

Parameters:: config (TransformerConfig) – Configuration of the Transformer model.

blocks¶

forward(x, attn_kwargs, condition=None)¶

Forward pass of the Transformer model.

Parameters:

x (torch.Tensor) – Input tensor of shape (batch_size, seq_len, hidden_dim).
attn_kwargs (dict[str, torch.Tensor]) – Additional arguments for the attention mechanism.
condition (torch.Tensor | None) – Optional conditioning vector of shape (batch_size, condition_dim) consumed by each block’s AdaLN-Zero modulation. None (default) for unconditioned models.

Returns:

Output tensor after processing through the Transformer model.

Return type:

torch.Tensor

class noether.modeling.models.TransformerConfig(/, **data)¶

Bases: noether.core.models.base.ModelBaseConfig, noether.core.schemas.mixins.InjectSharedFieldFromParentMixin

Configuration for a Transformer model.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

model_config¶: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

hidden_dim: int = None¶: Hidden dimension of the model. Used for all transformer blocks.

depth: int = None¶: Number of transformer blocks in the model.

transformer_block_config: Annotated[noether.modeling.modules.blocks.transformer.TransformerBlockConfig, noether.core.schemas.mixins.Shared]¶

class noether.modeling.models.Transolver(config)¶

Bases: noether.modeling.models.transformer.Transformer

Implementation of the Transolver model.

Reference code: https://github.com/thuml/Transolver/
Paper: https://arxiv.org/abs/2402.02366

Transolver is a Transformer with a special physics attention mechanism. Hence, we extend the Transformer class and configure it accordingly.

Parameters:: config (TransolverConfig) – Configuration of the Transolver model.

class noether.modeling.models.TransolverConfig(/, **data)¶

Bases: noether.modeling.models.transformer.TransformerConfig, noether.core.models.base.ModelBaseConfig

Configuration for a Transolver model.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

model_config¶: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

attention_arguments: dict¶

set_attention_constructor()¶

Set attention_constructor in transformer_block_config based on data_specs.

Return type:: TransolverConfig

class noether.modeling.models.TransolverPlusPlusConfig(/, **data)¶

Bases: TransolverConfig

Configuration for a Transolver++ model.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

set_attention_constructor()¶

Set attention_constructor in transformer_block_config based on data_specs.

Return type:: TransolverPlusPlusConfig

class noether.modeling.models.UPT(config)¶

Bases: torch.nn.Module

Implementation of the UPT (Universal Physics Transformer) model.

Parameters:: config (UPTConfig) – Configuration for the UPT model. See UPTConfig for details.

use_rope¶

encoder¶

pos_embed¶

approximator_blocks¶

decoder¶

norm¶

prediction_layer¶

compute_rope_args(geometry_batch_idx, geometry_position, geometry_supernode_idx, query_position)¶

Compute the RoPE frequency arguments for the geometry and query positions. If RoPE is not used, return empty dicts.

Parameters:

geometry_batch_idx (torch.Tensor)
geometry_position (torch.Tensor)
geometry_supernode_idx (torch.Tensor)
query_position (torch.Tensor)

Return type:

tuple[dict[str, torch.Tensor], dict[str, torch.Tensor]]

forward(geometry_batch_idx, geometry_supernode_idx, geometry_position, query_position)¶

Forward pass of the UPT model.

Parameters:

geometry_batch_idx (torch.Tensor) – Batch indices for the geometry positions.
geometry_supernode_idx (torch.Tensor) – Supernode indices for the geometry positions.
geometry_position (torch.Tensor) – Input coordinates of the geometry mesh points.
query_position (torch.Tensor) – Input coordinates of the query points.

Returns:

Output tensor containing the predictions at query positions.

Return type:

torch.Tensor

class noether.modeling.models.UPTConfig(/, **data)¶

Bases: noether.core.models.base.ModelBaseConfig, noether.core.schemas.mixins.InjectSharedFieldFromParentMixin

Configuration for a UPT model.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

model_config¶: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

num_heads: int = None¶: Number of attention heads in the model.

hidden_dim: int = None¶: Hidden dimension of the model.

mlp_expansion_factor: int = None¶: Expansion factor for the MLP of the FF layers.

approximator_depth: int = None¶: Number of approximator layers.

use_rope: bool = None¶

bias: bool = None¶: Whether to use bias terms in the model’s linear layers.

supernode_pooling_config: Annotated[noether.modeling.modules.encoders.supernode_pooling.SupernodePoolingConfig, noether.core.schemas.mixins.Shared]¶

approximator_config: Annotated[noether.modeling.modules.blocks.transformer.TransformerBlockConfig, noether.core.schemas.mixins.Shared]¶

decoder_config: Annotated[noether.modeling.modules.decoders.deep_perceiver.DeepPerceiverDecoderConfig, noether.core.schemas.mixins.Shared]¶

bias_layers: bool = None¶

data_specs: noether.data.schemas.ModelDataSpecs¶

linear_output_projection_config()¶

Return type:: noether.modeling.modules.layers.linear_projection.LinearProjectionConfig

rope_frequency_config()¶

Return type:: noether.modeling.modules.layers.rope_frequency.RopeFrequencyConfig

validate_rope_usage()¶

Ensure that if use_rope is True in the main config, it is also True in the approximator_config.

Return type:: UPTConfig

pos_embedding_config()¶

Return type:: noether.modeling.modules.layers.continuous_sincos_embed.ContinuousSincosEmbeddingConfig

validate_parameters()¶

Validate parameters across the model and its submodules.

Ensures that:

1. hidden_dim is divisible by num_heads in the parent and in all submodules that define num_heads.
2. hidden_dim is consistent across the parent and all submodules.

Return type:: UPTConfig

class noether.modeling.models.ViT(config)¶

Bases: torch.nn.Module

Vision Transformer for spatial regression on continuous-coordinate grids.

Based on the ViT paper (https://arxiv.org/pdf/2010.11929) with several modifications, such as:

Continuous coordinate inputs with sincos positional embedding and RoPE (vs. learned 1D position embeddings).
Optional AdaLN-Zero conditioning, à la DiT (https://arxiv.org/abs/2212.09748).
RMSNorm and QK-norm in attention (vs. LayerNorm only).

Parameters:: config (ViTConfig) – Configuration for the ViT model. See ViTConfig for available options.

coord_dim¶

out_channels¶

patch_size¶

hidden_dim¶

num_heads¶

token_dropout¶

use_conditioning¶

pool_patch¶

mask_patchify¶

pos_embedding¶

rope¶

backbone¶

use_conv_output_head¶

initialize_weights()¶

Initialize backbone weights.

Return type:: None

unpatchify(x, grid_h, grid_w)¶

Linear unpatchify: (B, L, p²·C_out) → (B, H, W, C_out).

Parameters:

x (torch.Tensor)
grid_h (int)
grid_w (int)

Return type:

torch.Tensor
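To make the shape change (L, p²·C_out) → (H, W, C_out) tangible, here is a pure-Python sketch for a single sample using nested lists. It assumes tokens are ordered row-major over the patch grid and that each token stores its patch as (row, column, channel); the real layout may differ, and the extra patch_size and out_channels arguments are not part of the documented signature.

```python
def unpatchify(tokens, grid_h, grid_w, patch_size, out_channels):
    """Rearrange L = grid_h * grid_w tokens into an (H, W, C_out) nested list."""
    p, c = patch_size, out_channels
    assert len(tokens) == grid_h * grid_w
    height, width = grid_h * p, grid_w * p
    image = [[None] * width for _ in range(height)]
    for index, token in enumerate(tokens):
        assert len(token) == p * p * c
        patch_row, patch_col = divmod(index, grid_w)
        for i in range(p):
            for j in range(p):
                start = (i * p + j) * c  # offset of cell (i, j) inside the token
                image[patch_row * p + i][patch_col * p + j] = token[start:start + c]
    return image


# Two 2x2 patches side by side, one output channel each.
tokens = [[0, 1, 2, 3], [4, 5, 6, 7]]
image = unpatchify(tokens, grid_h=1, grid_w=2, patch_size=2, out_channels=1)
print(image)
```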

forward(x, coords, mask=None, cond=None, return_tokens=False)¶

Run the standard ViT.

Parameters:

x (torch.Tensor | None) – Optional pre-computed patch embeddings of shape (B, L, hidden_dim). When None, tokens come purely from positional encoding.
coords (torch.Tensor) – Per-cell coordinates of shape (B, H, W, coord_dim).
mask (torch.Tensor | None) – Optional per-cell fluid mask of shape (B, H, W).
cond (torch.Tensor | None) – AdaLN conditioning vector of shape (B, hidden_dim). Required when the ViT was built with use_conditioning=True (the default); must be None otherwise.
return_tokens (bool) – If True, return raw post-FinalLayer tokens plus (grid_h, grid_w) instead of the decoded spatial output.

Returns:

Either (B, H, W, out_channels) or (tokens, (grid_h, grid_w)) if return_tokens.

Return type:

torch.Tensor | tuple[torch.Tensor, tuple[int, int]]

class noether.modeling.models.ViTConfig(/, **data)¶

Bases: noether.core.models.base.ModelBaseConfig

Configuration for a ViT model.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

model_config¶: Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

coord_dim: int = None¶: Coordinate dimensionality of the input grid (2 for 2D, 3 for 3D).

out_channels: int = None¶: Number of output channels emitted per spatial cell.

patch_size: int = None¶: Patch side length in cells. The grid resolution must be divisible by this value.

hidden_dim: int = None¶: Token hidden dimension throughout the transformer stack.

num_heads: int = None¶: Number of attention heads in each transformer block.

depth: int = None¶: Number of stacked transformer blocks.

mlp_ratio: int = None¶: FFN expansion factor inside each transformer block.

use_conditioning: bool = True¶: If True, enable AdaLN-Zero conditioning (forward requires cond); if False, plain ViT (cond must be None).

token_dropout: float = None¶: Per-patch token dropout probability used during training.

attn_drop: float = None¶: Dropout probability inside attention.

use_conv_output_head: bool = True¶: If True, decode via a cascaded PixelShuffle conv head; if False, decode via a linear unpatchify.

property transformer_block_config: noether.modeling.modules.blocks.transformer.TransformerBlockConfig¶

Return type:: noether.modeling.modules.blocks.transformer.TransformerBlockConfig