noether.modeling.modules

Submodules

Classes

Activation

Create a collection of name/value pairs.

DotProductAttention

Scaled dot-product attention module.

PerceiverAttention

Perceiver-style attention module. This module is similar to a cross-attention module.

TransolverAttention

Adapted from https://github.com/thuml/Transolver/blob/main/Car-Design-ShapeNetCar/models/Transolver.py

PerceiverBlock

For a self-attention module, the input tensor for the query, key, and value are the same. The PerceiverBlock takes different input tensors for the query and the key/value.

TransformerBlock

A transformer block with a single attention layer and a feedforward layer.

DeepPerceiverDecoder

A deep Perceiver decoder module. It can be configured with different numbers of layers and hidden dimensions.

SupernodePooling

Supernode pooling layer.

ContinuousSincosEmbed

Embedding layer for continuous coordinates using sine and cosine functions.

LayerScale

LayerScale module scales the input tensor by a learnable parameter gamma.

LinearProjection

LinearProjection is a linear projection layer that can be used for 1D, 2D, and 3D data.

UnquantizedDropPath

Unquantized drop path (Stochastic Depth, https://arxiv.org/abs/1603.09382) per sample. Unquantized means that dropped paths are still calculated.

MLP

Implements a Multi-Layer Perceptron (MLP) with a configurable number of layers, hidden dimension, activation function, and weight initialization method.

UpActDownMlp

UpActDownMlp is a vanilla MLP with an up-projection followed by a GELU activation and a down-projection to the original input dimension.

Package Contents

class noether.modeling.modules.Activation(*args, **kwds)

Bases: enum.Enum

Create a collection of name/value pairs.

Example enumeration:

>>> class Color(Enum):
...     RED = 1
...     BLUE = 2
...     GREEN = 3

Access them by:

  • attribute access:

    >>> Color.RED
    <Color.RED: 1>
    
  • value lookup:

    >>> Color(1)
    <Color.RED: 1>
    
  • name lookup:

    >>> Color['RED']
    <Color.RED: 1>
    

Enumerations can be iterated over, and know how many members they have:

>>> len(Color)
3
>>> list(Color)
[<Color.RED: 1>, <Color.BLUE: 2>, <Color.GREEN: 3>]

Methods can be added to enumerations, and members can have their own attributes – see the documentation for details.

GELU
TANH
SIGMOID
RELU
LEAKY_RELU
SOFTPLUS
ELU
SILU
class noether.modeling.modules.DotProductAttention(config)

Bases: torch.nn.Module

Scaled dot-product attention module.

Parameters:

config (noether.core.schemas.modules.AttentionConfig) – Configuration for the DotProductAttention module. See AttentionConfig for available options.

num_heads = None
head_dim
init_weights = None
use_rope = None
dropout = None
proj_dropout
qkv
proj
forward(x, attn_mask=None, freqs=None)

Forward function of the DotProductAttention module.

Parameters:
  • x (torch.Tensor) – Tensor to apply self-attention over, shape (batch size, sequence length, hidden_dim).

  • attn_mask (torch.Tensor | None) – For causal attention (i.e., no attention over future tokens) an attention mask should be provided. Defaults to None.

  • freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of queries/keys. None if use_rope=False.

Returns:

Returns the output of the attention module.

Return type:

torch.Tensor
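The core computation can be sketched in a few lines of plain PyTorch. This is an illustrative sketch only, not the noether API: the function name, the merged qkv weight, and the shapes are assumptions.

```python
import torch
import torch.nn.functional as F

def dot_product_attention(x, w_qkv, num_heads):
    """Minimal multi-head scaled dot-product self-attention (sketch)."""
    batch_size, seqlen, hidden_dim = x.shape
    head_dim = hidden_dim // num_heads
    # merged qkv projection, then split into query/key/value
    q, k, v = (x @ w_qkv).chunk(3, dim=-1)
    # reshape to (batch_size, num_heads, seqlen, head_dim)
    q = q.view(batch_size, seqlen, num_heads, head_dim).transpose(1, 2)
    k = k.view(batch_size, seqlen, num_heads, head_dim).transpose(1, 2)
    v = v.view(batch_size, seqlen, num_heads, head_dim).transpose(1, 2)
    out = F.scaled_dot_product_attention(q, k, v)  # softmax(qk^T / sqrt(d)) v
    return out.transpose(1, 2).reshape(batch_size, seqlen, hidden_dim)

x = torch.randn(2, 16, 64)
w_qkv = torch.randn(64, 3 * 64) * 0.02
y = dot_product_attention(x, w_qkv, num_heads=4)  # shape (2, 16, 64)
```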

class noether.modeling.modules.PerceiverAttention(config)

Bases: torch.nn.Module

Perceiver-style attention module. This module is similar to a cross-attention module.

Parameters:

config (noether.core.schemas.modules.AttentionConfig) – Configuration for the PerceiverAttention module. See AttentionConfig for available options.

num_heads = None
head_dim
init_weights = None
use_rope = None
kv
q
proj
dropout = None
proj_dropout
forward(q, kv, attn_mask=None, q_freqs=None, k_freqs=None)

Forward function of the PerceiverAttention module.

Parameters:
  • q (torch.Tensor) – Query tensor, shape (batch size, number of points/tokens, hidden_dim).

  • kv (torch.Tensor) – Key/value tensor, shape (batch size, number of latent tokens, hidden_dim).

  • attn_mask (torch.Tensor | None) – When applying causal attention, an attention mask is required. Defaults to None.

  • q_freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of queries. None if use_rope=False.

  • k_freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of keys. None if use_rope=False.

Returns:

Returns the output of the perceiver attention module.

Return type:

torch.Tensor
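The difference to self-attention is that the queries come from one tensor while keys and values come from another. A minimal cross-attention sketch (the function name and weight shapes are assumptions for illustration, not the noether API):

```python
import torch
import torch.nn.functional as F

def perceiver_attention(q_in, kv_in, w_q, w_kv, num_heads):
    """Minimal cross-attention: queries attend to a separate kv sequence."""
    batch_size, q_len, hidden_dim = q_in.shape
    kv_len = kv_in.shape[1]
    head_dim = hidden_dim // num_heads
    # separate projections: q from q_in, merged k/v from kv_in
    q = (q_in @ w_q).view(batch_size, q_len, num_heads, head_dim).transpose(1, 2)
    k, v = (kv_in @ w_kv).chunk(2, dim=-1)
    k = k.view(batch_size, kv_len, num_heads, head_dim).transpose(1, 2)
    v = v.view(batch_size, kv_len, num_heads, head_dim).transpose(1, 2)
    out = F.scaled_dot_product_attention(q, k, v)
    return out.transpose(1, 2).reshape(batch_size, q_len, hidden_dim)

q_in = torch.randn(2, 8, 64)    # e.g. output query positions
kv_in = torch.randn(2, 32, 64)  # e.g. latent tokens
w_q = torch.randn(64, 64) * 0.02
w_kv = torch.randn(64, 2 * 64) * 0.02
out = perceiver_attention(q_in, kv_in, w_q, w_kv, num_heads=4)
```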

class noether.modeling.modules.TransolverAttention(config)

Bases: torch.nn.Module

Adapted from https://github.com/thuml/Transolver/blob/main/Car-Design-ShapeNetCar/models/Transolver.py with the following changes:

  • Readable reshaping operations via einops

  • Merged qkv linear layer for higher GPU utilization

  • F.scaled_dot_product_attention instead of the slower manual PyTorch attention

  • Possibility to mask tokens (required to process variable-sized inputs)

Parameters:

config (noether.core.schemas.modules.AttentionConfig) – Configuration for the Transolver attention module. See AttentionConfig for available options.

num_heads = None
dropout = None
temperature
in_project_x
in_project_fx
in_project_slice
qkv
proj
proj_dropout
create_slices(x, num_input_points, attn_mask=None)

Given a set of points, project them to a fixed number of slices using the computed slice weights per token.

Parameters:
  • x (torch.Tensor) – Input tensor with shape (batch_size, num_input_points, hidden_dim).

  • num_input_points (int) – Number of input points.

  • attn_mask (torch.Tensor | None) – Mask to exclude certain tokens from the attention. Defaults to None.

Returns:

Tensor with the projected slice tokens and the slice weights.

forward(x, attn_mask=None)

Forward pass of the Transolver attention module.

Parameters:
  • x (torch.Tensor) – Input tensor with shape (batch_size, seqlen, hidden_dim).

  • attn_mask (torch.Tensor | None) – Attention mask tensor with shape (batch_size). Defaults to None.

Returns:

Tensor after applying the Transolver attention mechanism.

class noether.modeling.modules.PerceiverBlock(config)

Bases: torch.nn.Module

For a self-attention module, the input tensor for the query, key, and value are the same. The PerceiverBlock takes different input tensors for the query and the key/value.

Parameters:
norm1q
norm1kv
attn
ls1
drop_path1
norm2
mlp
ls2
drop_path2
forward(q, kv, condition=None, attn_kwargs=None)

Forward pass of the PerceiverBlock.

Parameters:
  • q (torch.Tensor) – Input tensor with shape (batch_size, seqlen/num_tokens, hidden_dim) for the query representations.

  • kv (torch.Tensor) – Input tensor with shape (batch_size, seqlen/num_tokens, hidden_dim) for the key and value representations.

  • condition (torch.Tensor | None) – Conditioning vector. If provided, the attention and MLP will be scaled, shifted and gated feature-wise with predicted values from this vector.

  • attn_kwargs (dict[str, Any] | None) – Dict with arguments for the attention (such as the attention mask or rope frequencies). Defaults to None.

Returns:

Tensor after the forward pass of the PerceiverBlock.

Return type:

torch.Tensor

class noether.modeling.modules.TransformerBlock(config)

Bases: torch.nn.Module

A transformer block with a single attention layer and a feedforward layer.

Parameters:

config (noether.core.schemas.modules.blocks.TransformerBlockConfig) – Configuration for the transformer block. See TransformerBlockConfig for available options.

norm1
attention_block
ls1
drop_path1
norm2
mlp
ls2
drop_path2
forward(x, condition=None, attn_kwargs=None)

Forward pass of the transformer block.

Parameters:
  • x (torch.Tensor) – Input tensor with shape (batch_size, seqlen/num_tokens, hidden_dim).

  • condition (torch.Tensor | None) – Conditioning vector. If provided, the attention and MLP will be scaled, shifted and gated feature-wise with predicted values from this vector.

  • attn_kwargs (dict[str, Any] | None) – Dict with arguments for the attention (such as the attention mask or rope frequencies). Defaults to None.

Returns:

Tensor after the forward pass of the transformer block.

Return type:

torch.Tensor
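The residual structure of such a block can be sketched with standard PyTorch layers. This is a generic pre-norm block for illustration only; it omits LayerScale, drop path, and conditioning, and the exact layer ordering and attention implementation in noether may differ:

```python
import torch
from torch import nn

class TinyTransformerBlock(nn.Module):
    """Pre-norm transformer block: x + attn(norm(x)), then x + mlp(norm(x))."""

    def __init__(self, dim, num_heads):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim),  # up-projection
            nn.GELU(),
            nn.Linear(4 * dim, dim),  # down-projection
        )

    def forward(self, x):
        h = self.norm1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]  # attention residual
        x = x + self.mlp(self.norm2(x))                    # feedforward residual
        return x

block = TinyTransformerBlock(dim=32, num_heads=4)
y = block(torch.randn(2, 10, 32))  # shape is preserved: (2, 10, 32)
```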

class noether.modeling.modules.DeepPerceiverDecoder(config)

Bases: torch.nn.Module

A deep Perceiver decoder module. Can be configured with different number of layers and hidden dimensions. However, it should be noted that this layer is not a full-fledged Perceiver, since it only has a cross-attention mechanism.

Parameters:

config (noether.core.schemas.modules.decoders.DeepPerceiverDecoderConfig) – Configuration for the DeepPerceiverDecoder module. See DeepPerceiverDecoderConfig for available options.

blocks
forward(kv, queries, attn_kwargs=None, condition=None)

Forward pass of the model.

Parameters:
  • kv (torch.Tensor) – The key-value tensor (batch_size, num_latent_tokens, dim).

  • queries (torch.Tensor) – The query tensor (batch_size, num_output_queries, dim).

  • attn_kwargs (dict[str, Any] | None) – Dict with arguments for the attention (such as the attention mask or rope frequencies). Defaults to None.

  • condition (torch.Tensor | None) – Optional conditioning tensor that can be used in the attention mechanism. This can be used to pass additional conditioning information, etc.

Returns:

The predictions as a sparse tensor of shape (batch_size * num_output_pos, num_out_values).

Return type:

torch.Tensor

class noether.modeling.modules.SupernodePooling(config)

Bases: torch.nn.Module

Supernode pooling layer.

The permutation of the supernodes is preserved through the message passing (contrary to the (GP-)UPT code). Additionally, radius is used instead of radius_graph, which is more efficient.

Initialize the SupernodePooling.

Parameters:

config (noether.core.schemas.modules.encoders.SupernodePoolingConfig) – Configuration for the SupernodePooling module. See SupernodePoolingConfig for available options.

radius
k
max_degree
spool_pos_mode
readd_supernode_pos
aggregation
input_features_dim
pos_embed
output_dim
compute_src_and_dst_indices(input_pos, supernode_idx, batch_idx=None)

Compute the source and destination indices for the message passing to the supernodes.

Parameters:
  • input_pos (torch.Tensor) – Sparse tensor with shape (batch_size * number of points, 3), representing the input geometries.

  • supernode_idx (torch.Tensor) – Indexes of the supernodes in the sparse tensor input_pos.

  • batch_idx (torch.Tensor | None) – 1D tensor, containing the batch index of each entry in input_pos. Default None.

Returns:

Tuple of (src_idx, dst_idx, local_dst_idx) where src_idx and dst_idx are absolute indices into input_pos and local_dst_idx is a 0-indexed position into supernode_idx (used for scatter_reduce_).

Return type:

tuple[torch.Tensor, torch.Tensor, torch.Tensor]

create_messages(input_pos, src_idx, dst_idx, supernode_idx, input_features=None)

Create messages for the message passing to the supernodes, based on different positional encoding representations.

Parameters:
  • input_pos (torch.Tensor) – Tensor of shape (batch_size * number_of_points_per_sample, {2,3}), representing the point cloud representation of the input geometry.

  • src_idx (torch.Tensor) – Index of the source nodes from input_pos.

  • dst_idx (torch.Tensor) – Indexes of the destination nodes in the input_pos tensor. These should be the matching supernode indexes.

  • supernode_idx (torch.Tensor) – Indexes of the node in input_pos that are considered supernodes.

  • input_features (torch.Tensor | None)

Raises:

NotImplementedError – Raised if the mode is not implemented. Either “abspos”, “relpos” or “absrelpos” are allowed.

Returns:

Tensor with the messages for the message passing into the supernodes, and the embedded coordinates of the supernodes.

Return type:

tuple[torch.Tensor, torch.Tensor]

accumulate_messages(x, local_dst_idx, supernode_idx)

Method to accumulate the messages of neighbouring points into the supernodes.

Parameters:
  • x (torch.Tensor) – Tensor containing the message representation of each neighbouring point.

  • local_dst_idx (torch.Tensor) – 0-indexed position into supernode_idx for each message (no CUDA sync).

  • supernode_idx (torch.Tensor) – Indexes of the supernode in the input point cloud.

Returns:

Tensor with the aggregated messages for each supernode.

Return type:

torch.Tensor
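The aggregation step boils down to a scatter-reduce of the messages into the supernode slots. A toy example (the values and the mean reduction are chosen for illustration):

```python
import torch

# Four messages routed to three supernodes: messages 0 and 1 go to
# supernode slot 0, message 2 to slot 1, message 3 to slot 2.
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])
local_dst_idx = torch.tensor([0, 0, 1, 2])

out = torch.zeros(3, 1)
out.scatter_reduce_(
    0,
    local_dst_idx.unsqueeze(-1).expand_as(x),  # index must match src shape
    x,
    reduce="mean",
    include_self=False,  # ignore the zero-initialised entries
)
# out is now [[1.5], [3.0], [4.0]]
```

Because local_dst_idx is 0-indexed into the supernode slots, no lookup into the global point indices (and hence no CUDA sync) is needed at this step.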

forward(input_pos, supernode_idx, batch_idx=None, input_features=None)

Forward pass of the supernode pooling layer.

Parameters:
  • input_pos (torch.Tensor) – Sparse tensor with shape (batch_size * number_of_points_per_sample, 3), representing the point cloud representation of the input geometry.

  • supernode_idx (torch.Tensor) – Indexes of the supernodes in the sparse tensor input_pos.

  • batch_idx (torch.Tensor | None) – 1D tensor, containing the batch index of each entry in input_pos. Default None.

  • input_features (torch.Tensor | None) – Sparse tensor with shape (batch_size * number_of_points_per_sample, number_of_features).

Returns:

Tensor with the aggregated messages for each supernode.

Return type:

torch.Tensor | dict[str, torch.Tensor]

class noether.modeling.modules.ContinuousSincosEmbed(config)

Bases: torch.nn.Module

Embedding layer for continuous coordinates using sine and cosine functions. The original implementation from the Attention Is All You Need paper deals with discrete 1D coordinates (i.e., a sequence). However, this implementation is able to deal with 2D and 3D coordinate systems as well.

Parameters:

config (noether.core.schemas.modules.layers.ContinuousSincosEmbeddingConfig) – Configuration for the ContinuousSincosEmbed module. See ContinuousSincosEmbeddingConfig for the available options.

omega: torch.Tensor
padding_tensor: torch.Tensor
hidden_dim
input_dim
ndim_padding
sincos_padding
max_wavelength
padding
forward(coords)

Forward method of the ContinuousSincosEmbed layer.

Parameters:

coords (torch.Tensor) – Tensor of coordinates. The shape of the tensor should be [batch size, number of points, coordinate dimension] or [number of points, coordinate dimension].

Raises:

NotImplementedError – Only supports sparse (i.e. [number of points, coordinate dimension]) or dense (i.e. [batch size, number of points, coordinate dimension]) coordinates systems.

Returns:

Tensor with embedded coordinates.

Return type:

torch.Tensor
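The idea generalises the fixed sinusoidal embedding to real-valued coordinates: each coordinate dimension gets its own bank of sin/cos frequencies. A simplified sketch (the even split of hidden_dim across coordinate dimensions is an assumption; padding handling is omitted here):

```python
import torch

def sincos_embed(coords, hidden_dim, max_wavelength=10000.0):
    """Sine/cosine embedding for continuous coordinates of shape (..., ndim)."""
    ndim = coords.shape[-1]
    dim_per_coord = hidden_dim // ndim  # assumes hidden_dim divides evenly
    half = dim_per_coord // 2
    # geometric frequency schedule, as in the original sinusoidal embedding
    omega = 1.0 / max_wavelength ** (torch.arange(half) / half)
    args = coords.unsqueeze(-1) * omega            # (..., ndim, half)
    emb = torch.cat([args.sin(), args.cos()], -1)  # (..., ndim, 2 * half)
    return emb.flatten(-2)                         # (..., ndim * 2 * half)

coords = torch.rand(5, 3)                # sparse format: (num points, 3)
emb = sincos_embed(coords, hidden_dim=96)  # shape (5, 96)
```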

class noether.modeling.modules.LayerScale(config)

Bases: torch.nn.Module

LayerScale module scales the input tensor by a learnable parameter gamma.

Initialize the LayerScale module.

Parameters:

config (noether.core.schemas.modules.layers.LayerScaleConfig) – Configuration for the LayerScale module. See LayerScaleConfig for details.

forward(x)

Forward function of the LayerScale module.

Parameters:

x (torch.Tensor) – Input tensor to be scaled.

Returns:

Tensor scaled by the gamma parameter.

Return type:

torch.Tensor
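A minimal equivalent can be written in a few lines (the init value below is a typical choice for LayerScale, not necessarily the noether default):

```python
import torch
from torch import nn

class TinyLayerScale(nn.Module):
    """Scale features by a learnable per-channel gamma, initialised small."""

    def __init__(self, dim, init_value=1e-5):
        super().__init__()
        self.gamma = nn.Parameter(init_value * torch.ones(dim))

    def forward(self, x):
        return x * self.gamma  # broadcast over leading dimensions

ls = TinyLayerScale(dim=8)
y = ls(torch.ones(2, 8))  # initially, every output equals init_value
```

Starting gamma near zero lets residual branches begin close to identity, which stabilises training of deep transformer stacks.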

class noether.modeling.modules.LinearProjection(config)

Bases: torch.nn.Module

LinearProjection is a linear projection layer that can be used for 1D, 2D, and 3D data.

Parameters:

config (noether.core.schemas.modules.layers.LinearProjectionConfig) – The configuration of the LinearProjection. See LinearProjectionConfig for available options.

Raises:

NotImplementedError – Raised if the number of dimensions of the input domain is greater than 4.

project: torch.nn.Linear | torch.nn.Conv1d | torch.nn.Conv2d | torch.nn.Conv3d | torch.nn.Identity
init_weights
reset_parameters()

Reset the parameters of the MLP with a specific initialization. Options are “torch” (i.e., the default) or “truncnormal002”.

Raises:

NotImplementedError – raised if the specified initialization is not implemented.

Return type:

None

forward(x)

Forward function of the LinearProjection.

Parameters:

x (torch.Tensor) – Input tensor to the LinearProjection.

Returns:

Output tensor from the LinearProjection.

Return type:

torch.Tensor

class noether.modeling.modules.UnquantizedDropPath(config)

Bases: torch.nn.Module

Unquantized drop path (Stochastic Depth, https://arxiv.org/abs/1603.09382) per sample. Unquantized means that dropped paths are still calculated. Number of dropped paths is fully stochastic, i.e., it can happen that not a single path is dropped or that all paths are dropped. In a quantized drop path, the same amount of paths are dropped in each forward pass, resulting in large speedups with high drop_prob values. See https://arxiv.org/abs/2212.04884 for more discussion. UnquantizedDropPath does not provide any speedup, consider using a quantized version if large drop_prob values are used.

Adapted from https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/drop.py#L150

Initialize the UnquantizedDropPath module.

Parameters:

config (noether.core.schemas.modules.layers.UnquantizedDropPathConfig) – Configuration for the UnquantizedDropPath module. See UnquantizedDropPathConfig for the available options.

drop_prob
scale_by_keep
property keep_prob

Return the keep probability, i.e., the probability to keep a path, which is 1 - drop_prob.

Returns:

Float value of the keep probability.

forward(x)

Forward function of the UnquantizedDropPath module.

Parameters:

x (torch.Tensor) – Tensor to apply the drop path. Shape: (batch_size, …).

Returns:

Tensor with drop path applied, shape (batch_size, …). If drop_prob is 0, the input tensor is returned. If drop_prob is 1, a tensor of zeros is returned.

Return type:

torch.Tensor

extra_repr()

Extra representation of the UnquantizedDropPath module.

Returns:

A string representation of the module.
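The per-sample mechanics can be sketched as a standalone function (a sketch of the standard timm-style formulation; the function name and defaults are illustrative):

```python
import torch

def drop_path(x, drop_prob, scale_by_keep=True, training=True):
    """Per-sample stochastic depth: zero out whole samples with prob drop_prob."""
    if drop_prob == 0.0 or not training:
        return x
    keep_prob = 1.0 - drop_prob
    # one Bernoulli draw per sample, broadcast over all remaining dims
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = x.new_empty(shape).bernoulli_(keep_prob)
    if scale_by_keep and keep_prob > 0.0:
        mask = mask / keep_prob  # keep the expected value equal to the input
    return x * mask
```

Note that the number of dropped samples varies from call to call, which is exactly what “unquantized” refers to above.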

class noether.modeling.modules.MLP(config)

Bases: torch.nn.Module

Implements a Multi-Layer Perceptron (MLP) with a configurable number of layers, hidden dimension, activation function, and weight initialization method. Only one hidden dimension is supported for simplicity, i.e., all hidden layers have the same dimension. The MLP always has one input layer and one output layer. When num_layers=0, the MLP is a two-layer network with one non-linearity in between. When num_layers>=1, the MLP has additional hidden layers.

Initialize the MLP.

Parameters:

config (noether.core.schemas.modules.mlp.MLPConfig) – Configuration object for the MLP. See MLPConfig for available options.

init_weights
mlp
reset_parameters()

Reset the parameters of the MLP with a specific initialization. Options are “torch” (i.e., the default) or “truncnormal002”.

Raises:

NotImplementedError – raised if the specified initialization is not implemented.

Return type:

None

forward(x)

Forward function of the MLP.

Parameters:

x (torch.Tensor) – Input tensor to the MLP.

Returns:

Output tensor from the MLP.

Return type:

torch.Tensor

class noether.modeling.modules.UpActDownMlp(config)

Bases: torch.nn.Module

UpActDownMlp is a vanilla MLP with an up-projection followed by a GELU activation and a down-projection to the original input dimension.

Initialize the UpActDownMlp.

Parameters:

config (noether.core.schemas.modules.mlp.UpActDownMLPConfig) – The configuration of the UpActDownMlp.

init_weights
fc1
act
fc2
reset_parameters()

Reset the parameters of the MLP with a specific initialization. Options are “torch” (i.e., the default) or “truncnormal002”.

Raises:

NotImplementedError – raised if the specified initialization is not implemented.

Return type:

None

forward(x)

Forward function of the UpActDownMlp.

Parameters:

x (torch.Tensor) – Input tensor to the MLP.

Returns:

Output tensor from the MLP.

Return type:

torch.Tensor