Configuring AB-UPT¶
This guide walks through how to configure
AnchoredBranchedUPT via
AnchorBranchedUPTConfig. It is structured around
the three stages of the model — the geometry encoder, the physics trunk, and the
per-domain decoder — and references the YAML configs shipped with the aero_cfd and
heat_transfer recipes for concrete examples.
For background on the model itself, see the AB-UPT paper and Noether Model Zoo.
At a glance¶
A minimal AB-UPT YAML config looks like this (from recipes/heat_transfer/configs/model/ab_upt.yaml):
kind: noether.modeling.models.AeroABUPT
name: ab_upt
geometry_depth: 1
hidden_dim: 192
supernode_pooling_config:
  input_dim: ${data_specs.position_dim}
  k: 32
transformer_block_config:
  num_heads: 3
  mlp_expansion_factor: 4
  use_rope: true
physics_blocks:
- perceiver
- self
- self
- self
- self
- self
num_domain_decoder_blocks:
  volume: 2
optimizer_config: ${optimizer}
forward_properties:
- volume_anchor_position
- simulation_parameters
- geometry_position
- geometry_batch_idx
- geometry_supernode_idx
data_specs: ${data_specs}
The key parameters are:
- hidden_dim — shared across the geometry encoder, physics trunk, and decoder. Auto-injected into transformer_block_config and supernode_pooling_config via Configuration Inheritance, so you only set it once.
- transformer_block_config — attention head count, MLP expansion, RoPE settings. AB-UPT requires use_rope: true.
- supernode_pooling_config — geometry encoder front-end. Only required if physics_blocks contains a perceiver block.
- physics_blocks — ordered list of block types (see below).
- num_domain_decoder_blocks — per-domain self-attention block depth for the decoder.
- data_specs — defines the domains and their input/output fields. Drives the per-domain decoder projections automatically via the domain_decoder_configs computed field.
Stage 1: the geometry encoder¶
The geometry encoder is only instantiated when the physics trunk contains a
perceiver block and supernode_pooling_config is set. It runs a
SupernodePooling encoder over the
geometry mesh, followed by geometry_depth standard transformer blocks.
In the heat-transfer config above, geometry_depth: 1 and a single perceiver
block in physics_blocks is enough to attend to the geometry encoding once at the
start of the trunk.
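If you want a heavier geometry branch, you can stack more transformer blocks after the pooling step by raising geometry_depth. The value below is illustrative rather than taken from a shipped recipe:
geometry_depth: 2   # two geometry transformer blocks instead of one
supernode_pooling_config:
  input_dim: ${data_specs.position_dim}
  k: 32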
Skip the geometry branch entirely by:
- Removing all perceiver / perceiver_untied entries from physics_blocks, or
- Leaving supernode_pooling_config unset.
In that case, the model operates purely on per-domain anchor positions.
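As an illustrative sketch (not a shipped recipe), an anchor-only variant of the heat-transfer config above drops the geometry-specific keys. The geometry_* entries in forward_properties are removed here on the assumption that only the geometry encoder consumes them; check your data pipeline before relying on that:
kind: noether.modeling.models.AeroABUPT
name: ab_upt
hidden_dim: 192
transformer_block_config:
  num_heads: 3
  mlp_expansion_factor: 4
  use_rope: true
# no supernode_pooling_config and no perceiver entries, so the geometry branch is skipped
physics_blocks:
- self
- self
- self
- self
- self
- self
num_domain_decoder_blocks:
  volume: 2
optimizer_config: ${optimizer}
forward_properties:
- volume_anchor_position
- simulation_parameters
data_specs: ${data_specs}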
Stage 2: the physics trunk (physics_blocks)¶
physics_blocks is the ordered list of attention blocks applied to the concatenated
per-domain anchor tokens. Each entry is one of:
| Block | Weights | Description |
|---|---|---|
| self | shared | Self-attention within each domain branch independently. Token mixing does not cross domain boundaries. |
| cross | shared | Cross-attention from each domain to all other domains’ anchors. Lets domains exchange information. |
| joint | shared | Full self-attention over all anchors from all domains jointly. |
| perceiver | shared | Cross-attention from anchors to the geometry encoding. Requires the geometry branch. |
| self_untied, cross_untied, joint_untied, perceiver_untied | per-domain | Same attention pattern as the un-suffixed variant, but with separate weights for each domain. |
Note
shared is a deprecated alias for self; it still works but emits a
DeprecationWarning and will be removed in a future release.
Choosing tied vs. untied¶
Use the _untied variants when domains have substantially different statistics
(e.g. surface vs. volume in CFD) and you have enough data to fit per-domain weights.
Tied (shared) blocks are smaller and regularize across domains; untied blocks have
more capacity per domain at the cost of more parameters.
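For instance, an untied variant of a short two-domain trunk (illustrative only, not a shipped recipe) swaps the in-domain blocks for their per-domain counterparts while keeping the cross-domain exchange shared:
physics_blocks:
- perceiver
- self_untied
- cross
- self_untied
- cross
- self_untied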
Concrete patterns¶
Aerodynamics (two domains, surface + volume) — from recipes/aero_cfd/configs/model/ab_upt.yaml:
physics_blocks:
- perceiver
- self
- cross
- self
- cross
- self
- cross
- self
- cross
- self
A single perceiver block reads geometry once, then the trunk alternates
self (in-domain mixing) with cross (cross-domain exchange) for four cycles,
ending in a self block.
Heat transfer (single volume domain) — from recipes/heat_transfer/configs/model/ab_upt.yaml:
physics_blocks:
- perceiver
- self
- self
- self
- self
- self
With only one domain, cross and joint are equivalent to self, so the
trunk is a stack of perceiver + self blocks.
Stage 3: optional per-domain decoder¶
After the physics trunk, num_domain_decoder_blocks optionally adds untied self-attention
blocks per domain (no weight sharing across domains), then a linear projection to
that domain’s output fields. This is semantically equivalent to adding the same number of
self_untied blocks to the end of the trunk for each domain. The only difference is
that the decoder block depth can be set per-domain.
The output fields and their slicing are derived from
data_specs.domains[name].output_dims.
num_domain_decoder_blocks:
  surface: 2
  volume: 2
Set per-domain depths to 0 (or omit the entry) to skip the decoder block stack
and project directly from the trunk output. The output projection itself is always
present.
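For example, to keep extra decoder blocks only for the volume domain (illustrative depths):
num_domain_decoder_blocks:
  surface: 0   # project surface outputs directly from the trunk
  volume: 2    # two extra untied blocks before the volume projection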
Other knobs¶
- init_weights — weight initialization mode for linear layers; defaults to "truncnormal002".
- drop_path_rate — stochastic depth rate; defaults to 0.0.
- data_specs.conditioning_dims — when set, the total conditioning dimension is pushed into transformer_block_config.condition_dim and used by the conditioning path of every transformer/perceiver block. The heat-transfer recipe uses this to condition on simulation parameters.
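If you override the first two, they presumably go alongside hidden_dim at the top level of the model config (an assumption; check AnchorBranchedUPTConfig for the exact placement). The drop_path_rate value below is illustrative:
init_weights: truncnormal002   # default
drop_path_rate: 0.1            # default is 0.0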
Putting it all together¶
End-to-end training configs (model + dataset + pipeline + trainer + callbacks) for both recipes:
- Aerodynamics: recipes/aero_cfd/configs/ — see train_*.yaml for per-dataset entry points and experiment/<dataset>/ab_upt.yaml for AB-UPT-specific overrides.
- Heat transfer: recipes/heat_transfer/configs/ — start from train_simshift_heatsink.yaml and the +experiment/simshift_heatsink=ab_upt override.
Both recipes are wired up to the noether-train CLI; see
Working with the CLI and How to launch a SLURM job from the command line for how to run
them locally or on SLURM.
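Assuming noether-train forwards standard Hydra-style arguments (an assumption; the CLI guide linked above is authoritative), a local heat-transfer run might look like:
noether-train --config-name train_simshift_heatsink +experiment/simshift_heatsink=ab_upt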