noether.modeling.modules¶
Submodules¶
Classes¶
- Activation: Create a collection of name/value pairs.
- DotProductAttention: Scaled dot-product attention module.
- PerceiverAttention: Perceiver-style attention module, similar to a cross-attention module.
- TransolverAttention: Transolver attention, adapted from https://github.com/thuml/Transolver/blob/main/Car-Design-ShapeNetCar/models/Transolver.py
- PerceiverBlock: Attention block that takes different input tensors for the query and the key/value.
- TransformerBlock: A transformer block with a single attention layer and a feedforward layer.
- DeepPerceiverDecoder: A deep Perceiver decoder module, configurable in number of layers and hidden dimensions.
- SupernodePooling: Supernode pooling layer.
- ContinuousSincosEmbed: Embedding layer for continuous coordinates using sine and cosine functions.
- LayerScale: Scales the input tensor by a learnable parameter gamma.
- LinearProjection: A linear projection layer for 1D, 2D, and 3D data.
- UnquantizedDropPath: Unquantized drop path (Stochastic Depth, https://arxiv.org/abs/1603.09382) per sample; dropped paths are still computed.
- MLP: A Multi-Layer Perceptron with configurable number of layers, hidden dimension, activation function, and weight initialization.
- UpActDownMlp: A vanilla MLP with an up-projection, a GELU activation, and a down-projection back to the input dimension.
Package Contents¶
- class noether.modeling.modules.Activation(*args, **kwds)¶
Bases: enum.Enum
Create a collection of name/value pairs.
Example enumeration:
>>> class Color(Enum):
...     RED = 1
...     BLUE = 2
...     GREEN = 3
Access them by:
attribute access:
>>> Color.RED
<Color.RED: 1>
value lookup:
>>> Color(1)
<Color.RED: 1>
name lookup:
>>> Color['RED']
<Color.RED: 1>
Enumerations can be iterated over, and know how many members they have:
>>> len(Color)
3
>>> list(Color)
[<Color.RED: 1>, <Color.BLUE: 2>, <Color.GREEN: 3>]
Methods can be added to enumerations, and members can have their own attributes – see the documentation for details.
- GELU¶
- TANH¶
- SIGMOID¶
- RELU¶
- LEAKY_RELU¶
- SOFTPLUS¶
- ELU¶
- SILU¶
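The enum access patterns shown above apply directly to Activation. A minimal stand-alone sketch, mirroring the member names documented above rather than importing noether (member values here are illustrative, not the library's):

```python
from enum import Enum, auto

# Stand-in for noether.modeling.modules.Activation, using the member
# names documented above; the auto() values are illustrative only.
class Activation(Enum):
    GELU = auto()
    TANH = auto()
    SIGMOID = auto()
    RELU = auto()
    LEAKY_RELU = auto()
    SOFTPLUS = auto()
    ELU = auto()
    SILU = auto()

# Attribute access, name lookup, and iteration work as for any Enum.
act = Activation.GELU
same = Activation["GELU"]          # name lookup
assert act is same
names = [m.name for m in Activation]
```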
- class noether.modeling.modules.DotProductAttention(config)¶
Bases: torch.nn.Module
Scaled dot-product attention module.
- Parameters:
config (noether.core.schemas.modules.AttentionConfig) – Configuration for the DotProductAttention module. See
AttentionConfig for available options.
- num_heads = None¶
- head_dim¶
- init_weights = None¶
- use_rope = None¶
- dropout = None¶
- proj_dropout¶
- qkv¶
- proj¶
- forward(x, attn_mask=None, freqs=None)¶
Forward function of the DotProductAttention module.
- Parameters:
x (torch.Tensor) – Tensor to apply self-attention over, shape (batch size, sequence length, hidden_dim).
attn_mask (torch.Tensor | None) – For causal attention (i.e., no attention over future tokens), an attention mask should be provided. Defaults to None.
freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of queries/keys. None if use_rope=False.
- Returns:
Returns the output of the attention module.
- Return type:
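The computation performed by forward can be sketched numerically. A minimal NumPy version of scaled dot-product attention (single head, no RoPE, no dropout; the module itself operates on torch tensors with the shapes documented above):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v, attn_mask=None):
    """q, k, v: (batch, seq, head_dim); attn_mask: boolean mask (True = attend) or None."""
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)   # (batch, seq, seq)
    if attn_mask is not None:
        scores = np.where(attn_mask, scores, -1e9)    # mask out disallowed keys
    weights = softmax(scores, axis=-1)
    return weights @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 8))
out = scaled_dot_product_attention(x, x, x)  # self-attention: q = k = v
assert out.shape == (2, 4, 8)
```

With a lower-triangular (causal) mask, the first token can only attend to itself, so its output equals its value vector.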
- class noether.modeling.modules.PerceiverAttention(config)¶
Bases: torch.nn.Module
Perceiver-style attention module. This module is similar to a cross-attention module.
- Parameters:
config (noether.core.schemas.modules.AttentionConfig) – Configuration for the PerceiverAttention module. See
AttentionConfig for available options.
- num_heads = None¶
- head_dim¶
- init_weights = None¶
- use_rope = None¶
- kv¶
- q¶
- proj¶
- dropout = None¶
- proj_dropout¶
- forward(q, kv, attn_mask=None, q_freqs=None, k_freqs=None)¶
Forward function of the PerceiverAttention module.
- Parameters:
q (torch.Tensor) – Query tensor, shape (batch size, number of points/tokens, hidden_dim).
kv (torch.Tensor) – Key/value tensor, shape (batch size, number of latent tokens, hidden_dim).
attn_mask (torch.Tensor | None) – When applying causal attention, an attention mask is required. Defaults to None.
q_freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of queries. None if use_rope=False.
k_freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of keys. None if use_rope=False.
- Returns:
Returns the output of the perceiver attention module.
- Return type:
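The key difference from self-attention is that the queries and the keys/values may have different sequence lengths; the output keeps the query length. A NumPy shape sketch (single head, no RoPE; an illustration, not the library's implementation):

```python
import numpy as np

def cross_attention(q, k, v):
    """q: (batch, num_queries, d); k, v: (batch, num_latents, d)."""
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)  # (batch, num_queries, num_latents)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)              # softmax over the latent tokens
    return w @ v                                    # (batch, num_queries, d)

rng = np.random.default_rng(1)
q = rng.standard_normal((2, 5, 8))    # 5 query tokens
kv = rng.standard_normal((2, 3, 8))   # 3 latent tokens
out = cross_attention(q, kv, kv)
assert out.shape == (2, 5, 8)         # output follows the query length
```

Each output token is a convex combination of the value tokens, which is why its features stay within the per-feature range of kv.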
- class noether.modeling.modules.TransolverAttention(config)¶
Bases: torch.nn.Module
Adapted from https://github.com/thuml/Transolver/blob/main/Car-Design-ShapeNetCar/models/Transolver.py with the following changes:
- Readable reshaping operations via einops
- Merged qkv linear layer for higher GPU utilization
- F.scaled_dot_product_attention instead of slow pytorch attention
- Possibility to mask tokens (required to process variable sized inputs)
- Parameters:
config (noether.core.schemas.modules.AttentionConfig) – Configuration for the Transolver attention module. See
AttentionConfig for available options.
- num_heads = None¶
- dropout = None¶
- temperature¶
- in_project_x¶
- in_project_fx¶
- in_project_slice¶
- qkv¶
- proj¶
- proj_dropout¶
- create_slices(x, num_input_points, attn_mask=None)¶
Given a set of points, project them to a fixed number of slices using the computed slice weights per token.
- Parameters:
x (torch.Tensor) – Input tensor with shape (batch_size, num_input_points, hidden_dim).
num_input_points (int) – Number of input points.
attn_mask (torch.Tensor | None) – Mask to exclude certain tokens from the attention. Defaults to None.
- Returns:
Tensor with the projected slice tokens and the slice weights.
- forward(x, attn_mask=None)¶
Forward pass of the Transolver attention module.
- Parameters:
x (torch.Tensor) – Input tensor with shape (batch_size, seqlen, hidden_dim).
attn_mask (torch.Tensor | None) – Attention mask tensor with shape (batch_size). Defaults to None.
- Returns:
Tensor after applying the transolver attention mechanism.
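The slicing idea behind create_slices can be sketched as follows: each point gets a softmax distribution over a fixed number of slices, and each slice token is the weight-normalized average of the point features. A NumPy sketch under that assumption (names and normalization details are illustrative, not the library's exact code):

```python
import numpy as np

def project_to_slices(x, slice_logits):
    """x: (batch, n_points, d); slice_logits: (batch, n_points, n_slices).
    Each point distributes its features over a fixed number of slices."""
    w = np.exp(slice_logits - slice_logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                # softmax over slices, per point
    # Weighted sum of point features per slice, normalized by the total weight.
    num = np.einsum("bns,bnd->bsd", w, x)             # (batch, n_slices, d)
    den = w.sum(axis=1)[..., None] + 1e-9             # (batch, n_slices, 1)
    return num / den, w

rng = np.random.default_rng(2)
x = rng.standard_normal((2, 100, 16))                 # 100 input points
logits = rng.standard_normal((2, 100, 8))             # 8 slices
slices, w = project_to_slices(x, logits)
assert slices.shape == (2, 8, 16)  # fixed-size slice tokens, regardless of n_points
```

Attention is then computed over the fixed number of slice tokens instead of the variable number of input points, which is what makes the mechanism efficient for large point clouds.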
- class noether.modeling.modules.PerceiverBlock(config)¶
Bases: torch.nn.Module
For a self-attention module, the input tensors for the query, key, and value are the same. The PerceiverBlock, in contrast, takes different input tensors for the query and the key/value.
- Parameters:
config (noether.core.schemas.modules.blocks.PerceiverBlockConfig) – Configuration of the PerceiverBlock. See
PerceiverBlockConfig for available options.
- norm1q¶
- norm1kv¶
- attn¶
- ls1¶
- drop_path1¶
- norm2¶
- mlp¶
- ls2¶
- drop_path2¶
- forward(q, kv, condition=None, attn_kwargs=None)¶
Forward pass of the PerceiverBlock.
- Parameters:
q (torch.Tensor) – Input tensor with shape (batch_size, seqlen/num_tokens, hidden_dim) for the query representations.
kv (torch.Tensor) – Input tensor with shape (batch_size, seqlen/num_tokens, hidden_dim) for the key and value representations.
condition (torch.Tensor | None) – Conditioning vector. If provided, the attention and MLP will be scaled, shifted and gated feature-wise with predicted values from this vector.
attn_kwargs (dict[str, Any] | None) – Dict with arguments for the attention (such as the attention mask or rope frequencies). Defaults to None.
- Returns:
Tensor after the forward pass of the PerceiverBlock.
- Return type:
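The attribute list above (norm1q/norm1kv, attn, ls1, drop_path1, norm2, mlp, ls2, drop_path2) suggests a standard pre-norm residual dataflow. A NumPy sketch of that assumed dataflow, with toy stand-ins for the attention and MLP and drop path omitted for clarity:

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    mu = x.mean(-1, keepdims=True)
    return (x - mu) / np.sqrt(x.var(-1, keepdims=True) + eps)

def perceiver_block(q, kv, attn, mlp, gamma1=1.0, gamma2=1.0):
    """Pre-norm residual dataflow implied by the attributes above;
    gamma1/gamma2 play the role of the LayerScale parameters."""
    q = q + gamma1 * attn(layer_norm(q), layer_norm(kv))  # cross-attention branch
    q = q + gamma2 * mlp(layer_norm(q))                   # feedforward branch
    return q

# Toy stand-ins: a mean-pool "attention" and a scaled-identity MLP.
attn = lambda a, b: np.broadcast_to(b.mean(axis=1, keepdims=True), a.shape)
mlp = lambda a: 0.1 * a
rng = np.random.default_rng(3)
q = rng.standard_normal((2, 5, 8))
kv = rng.standard_normal((2, 3, 8))
out = perceiver_block(q, kv, attn, mlp)
assert out.shape == q.shape  # the block preserves the query shape
```

Setting the LayerScale gammas to zero reduces the block to the identity on q, which is why small gamma initializations make deep stacks trainable.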
- class noether.modeling.modules.TransformerBlock(config)¶
Bases: torch.nn.Module
A transformer block with a single attention layer and a feedforward layer.
- Parameters:
config (noether.core.schemas.modules.blocks.TransformerBlockConfig) – Configuration for the transformer block. See
TransformerBlockConfig for available options.
- norm1¶
- attention_block¶
- ls1¶
- drop_path1¶
- norm2¶
- mlp¶
- ls2¶
- drop_path2¶
- forward(x, condition=None, attn_kwargs=None)¶
Forward pass of the transformer block.
- Parameters:
x (torch.Tensor) – Input tensor with shape (batch_size, seqlen/num_tokens, hidden_dim).
condition (torch.Tensor | None) – Conditioning vector. If provided, the attention and MLP will be scaled, shifted and gated feature-wise with predicted values from this vector.
attn_kwargs (dict[str, Any] | None) – Dict with arguments for the attention (such as the attention mask or rope frequencies). Defaults to None.
- Returns:
Tensor after the forward pass of the transformer block.
- Return type:
- class noether.modeling.modules.DeepPerceiverDecoder(config)¶
Bases: torch.nn.Module
A deep Perceiver decoder module. It can be configured with different numbers of layers and hidden dimensions. Note, however, that this module is not a full-fledged Perceiver, since it only has a cross-attention mechanism.
- Parameters:
config (noether.core.schemas.modules.decoders.DeepPerceiverDecoderConfig) – Configuration for the DeepPerceiverDecoder module. See
DeepPerceiverDecoderConfig for available options.
- blocks¶
- forward(kv, queries, attn_kwargs=None, condition=None)¶
Forward pass of the model.
- Parameters:
kv (torch.Tensor) – The key-value tensor (batch_size, num_latent_tokens, dim).
queries (torch.Tensor) – The query tensor (batch_size, num_output_queries, dim).
attn_kwargs (dict[str, Any] | None) – Dict with arguments for the attention (such as the attention mask or rope frequencies). Defaults to None.
condition (torch.Tensor | None) – Optional conditioning tensor that can be used in the attention mechanism to pass additional conditioning information.
- Returns:
The predictions as sparse tensor (batch_size * num_output_pos, num_out_values).
- Return type:
- class noether.modeling.modules.SupernodePooling(config)¶
Bases: torch.nn.Module
Supernode pooling layer.
The permutation of the supernodes is preserved through the message passing (contrary to the (GP-)UPT code). Additionally, radius is used instead of radius_graph, which is more efficient.
Initialize the SupernodePooling.
- Parameters:
config (noether.core.schemas.modules.encoders.SupernodePoolingConfig) – Configuration for the SupernodePooling module. See
SupernodePoolingConfig for available options.
- radius¶
- k¶
- max_degree¶
- spool_pos_mode¶
- readd_supernode_pos¶
- aggregation¶
- input_features_dim¶
- pos_embed¶
- output_dim¶
- compute_src_and_dst_indices(input_pos, supernode_idx, batch_idx=None)¶
Compute the source and destination indices for the message passing to the supernodes.
- Parameters:
input_pos (torch.Tensor) – Sparse tensor with shape (batch_size * number of points, 3), representing the input geometries.
supernode_idx (torch.Tensor) – Indexes of the supernodes in the sparse tensor input_pos.
batch_idx (torch.Tensor | None) – 1D tensor, containing the batch index of each entry in input_pos. Default None.
- Returns:
Tuple of (src_idx, dst_idx, local_dst_idx) where src_idx and dst_idx are absolute indices into input_pos and local_dst_idx is a 0-indexed position into supernode_idx (used for scatter_reduce_).
- Return type:
- create_messages(input_pos, src_idx, dst_idx, supernode_idx, input_features=None)¶
Create messages for the message passing to the supernodes, based on different positional encoding representations.
- Parameters:
input_pos (torch.Tensor) – Tensor of shape (batch_size * number_of_points_per_sample, {2,3}), representing the point cloud representation of the input geometry.
src_idx (torch.Tensor) – Index of the source nodes from input_pos.
dst_idx (torch.Tensor) – Indices of the destination nodes in the input_pos tensor. These should be the matching supernode indices.
supernode_idx (torch.Tensor) – Indexes of the node in input_pos that are considered supernodes.
input_features (torch.Tensor | None)
- Raises:
NotImplementedError – Raised if the mode is not implemented. Either “abspos”, “relpos” or “absrelpos” are allowed.
- Returns:
Tensor with the messages for the message passing into the supernodes, and the embedded coordinates of the supernodes.
- Return type:
- accumulate_messages(x, local_dst_idx, supernode_idx)¶
Method to accumulate the messages of neighbouring points into the supernodes.
- Parameters:
x (torch.Tensor) – Tensor containing the message representation of each neighbour.
local_dst_idx (torch.Tensor) – 0-indexed position into supernode_idx for each message (no CUDA sync).
supernode_idx (torch.Tensor) – Indexes of the supernode in the input point cloud.
- Returns:
Tensor with the aggregated messages for each supernode.
- Return type:
- forward(input_pos, supernode_idx, batch_idx=None, input_features=None)¶
Forward pass of the supernode pooling layer.
- Parameters:
input_pos (torch.Tensor) – Sparse tensor with shape (batch_size * number_of_points_per_sample, 3), representing the point cloud representation of the input geometry.
supernode_idx (torch.Tensor) – Indexes of the supernodes in the sparse tensor input_pos.
batch_idx (torch.Tensor | None) – 1D tensor, containing the batch index of each entry in input_pos. Default None.
input_features (torch.Tensor | None) – Sparse tensor with shape (batch_size * number_of_points_per_sample, number_of_features).
- Returns:
Tensor with the aggregated messages for each supernode.
- Return type:
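The core idea, gather the points within radius of each supernode and aggregate their features, can be sketched in NumPy. This is a simplified single-sample illustration; the real layer uses a batched radius search, relative-position embeddings, message MLPs, and scatter_reduce_ instead of a Python loop:

```python
import numpy as np

def supernode_pool(input_pos, supernode_idx, radius, feats):
    """Mean-aggregate the features of all points within `radius` of each
    supernode (positional encoding and message MLPs omitted)."""
    sup_pos = input_pos[supernode_idx]                               # (S, 3)
    d = np.linalg.norm(input_pos[None] - sup_pos[:, None], axis=-1)  # (S, N) distances
    pooled = np.zeros((len(supernode_idx), feats.shape[1]))
    for s in range(len(supernode_idx)):
        nbr = np.nonzero(d[s] <= radius)[0]                          # neighbours of supernode s
        pooled[s] = feats[nbr].mean(axis=0)
    return pooled

rng = np.random.default_rng(4)
pos = rng.uniform(size=(50, 3))            # point cloud in the unit cube
feats = rng.standard_normal((50, 8))
sup_idx = np.array([0, 10, 20])            # three supernodes
out = supernode_pool(pos, sup_idx, radius=2.0, feats=feats)
assert out.shape == (3, 8)                 # one pooled feature vector per supernode
```

With a radius larger than the cloud's diameter, every supernode sees every point, so each pooled row equals the global feature mean; in practice the radius is chosen much smaller, so each supernode summarizes only its local neighbourhood.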
- class noether.modeling.modules.ContinuousSincosEmbed(config)¶
Bases: torch.nn.Module
Embedding layer for continuous coordinates using sine and cosine functions. The original implementation from the Attention Is All You Need paper deals with discrete 1D coordinates (i.e., a sequence). However, this implementation can also handle 2D and 3D coordinate systems.
- Parameters:
config (noether.core.schemas.modules.layers.ContinuousSincosEmbeddingConfig) – Configuration for the ContinuousSincosEmbed module. See
ContinuousSincosEmbeddingConfig for the available options.
- omega: torch.Tensor¶
- padding_tensor: torch.Tensor¶
- input_dim¶
- ndim_padding¶
- sincos_padding¶
- max_wavelength¶
- padding¶
- forward(coords)¶
Forward method of the ContinuousSincosEmbed layer.
- Parameters:
coords (torch.Tensor) – Tensor of coordinates. The shape of the tensor should be [batch size, number of points, coordinate dimension] or [number of points, coordinate dimension].
- Raises:
NotImplementedError – Only sparse (i.e., [number of points, coordinate dimension]) or dense (i.e., [batch size, number of points, coordinate dimension]) coordinate systems are supported.
- Returns:
Tensor with embedded coordinates.
- Return type:
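The extension from discrete 1D positions to continuous multi-dimensional coordinates can be sketched as follows: each coordinate axis gets its own bank of frequencies, and each (coordinate, frequency) pair contributes a sine and a cosine. A NumPy sketch for sparse coordinates; the exact frequency layout and padding handling in the library may differ:

```python
import numpy as np

def continuous_sincos_embed(coords, dim, max_wavelength=10000.0):
    """coords: (num_points, ndim) continuous coordinates; returns (num_points, dim).
    Each axis gets dim // (2 * ndim) frequencies, embedded with sin and cos
    (padding for non-divisible dims omitted for simplicity)."""
    n, ndim = coords.shape
    n_freq = dim // (2 * ndim)
    omega = 1.0 / max_wavelength ** (np.arange(n_freq) / n_freq)   # (n_freq,)
    angles = coords[..., None] * omega                              # (n, ndim, n_freq)
    emb = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1) # (n, ndim, 2 * n_freq)
    return emb.reshape(n, -1)

coords = np.array([[0.0, 0.0], [1.5, -2.0]])   # 2D continuous coordinates
emb = continuous_sincos_embed(coords, dim=16)
assert emb.shape == (2, 16)
```

At the origin every sine is 0 and every cosine is 1, which gives a quick sanity check on the layout.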
- class noether.modeling.modules.LayerScale(config)¶
Bases: torch.nn.Module
LayerScale module scales the input tensor by a learnable parameter gamma.
Initialize the LayerScale module.
- Parameters:
config (noether.core.schemas.modules.layers.LayerScaleConfig) – Configuration for the LayerScale module. See LayerScaleConfig for details.
- forward(x)¶
Forward function of the LayerScale module.
- Parameters:
x (torch.Tensor) – Input tensor to be scaled.
- Returns:
Tensor scaled by the gamma parameter.
- Return type:
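The operation itself is a feature-wise multiplication that broadcasts over the batch and sequence dimensions. A minimal NumPy sketch (the small gamma initialization value shown is a common convention, assumed here rather than taken from the config):

```python
import numpy as np

# LayerScale: elementwise scale by a learnable per-feature gamma, typically
# initialized to a small value so residual branches start near zero.
dim = 8
gamma = np.full(dim, 1e-5)    # learnable parameter in the real module; init value assumed
x = np.ones((2, 4, dim))      # (batch, seq, dim)
y = x * gamma                 # broadcasts over batch and sequence
assert y.shape == x.shape
```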
- class noether.modeling.modules.LinearProjection(config)¶
Bases: torch.nn.Module
LinearProjection is a linear projection layer that can be used for 1D, 2D, and 3D data.
- Parameters:
config (noether.core.schemas.modules.layers.LinearProjectionConfig) – The configuration of the LinearProjection. See LinearProjectionConfig for available options.
- Raises:
NotImplementedError – Raised if the input domain has more than four dimensions.
- project: torch.nn.Linear | torch.nn.Conv1d | torch.nn.Conv2d | torch.nn.Conv3d | torch.nn.Identity¶
- init_weights¶
- reset_parameters()¶
Reset the parameters of the layer with a specific initialization. Options are “torch” (i.e., the default) or “truncnormal002”.
- Raises:
NotImplementedError – raised if the specified initialization is not implemented.
- Return type:
None
- forward(x)¶
Forward function of the LinearProjection.
- Parameters:
x (torch.Tensor) – Input tensor to the LinearProjection.
- Returns:
Output tensor from the LinearProjection.
- Return type:
- class noether.modeling.modules.UnquantizedDropPath(config)¶
Bases: torch.nn.Module
Unquantized drop path (Stochastic Depth, https://arxiv.org/abs/1603.09382) per sample. Unquantized means that dropped paths are still calculated. The number of dropped paths is fully stochastic, i.e., it can happen that not a single path is dropped or that all paths are dropped. In a quantized drop path, the same number of paths is dropped in each forward pass, resulting in large speedups with high drop_prob values. See https://arxiv.org/abs/2212.04884 for more discussion. UnquantizedDropPath does not provide any speedup; consider using a quantized version if large drop_prob values are used.
Adapted from https://github.com/huggingface/pytorch-image-models/blob/main/timm/layers/drop.py#L150
Initialize the UnquantizedDropPath module.
- Parameters:
config (noether.core.schemas.modules.layers.UnquantizedDropPathConfig) – Configuration for the UnquantizedDropPath module. See
UnquantizedDropPathConfig for the available options.
- drop_prob¶
- scale_by_keep¶
- property keep_prob¶
Return the keep probability, i.e., the probability to keep a path, which is 1 - drop_prob.
- Returns:
Float value of the keep probability.
- forward(x)¶
Forward function of the UnquantizedDropPath module.
- Parameters:
x (torch.Tensor) – Tensor to apply the drop path. Shape: (batch_size, …).
- Returns:
Tensor with drop path applied, shape (batch_size, …). If drop_prob is 0, the input tensor is returned. If drop_prob is 1, a tensor of zeros is returned.
- Return type:
- extra_repr()¶
Extra representation of the UnquantizedDropPath module.
- Returns:
Return a string representation of the module.
- class noether.modeling.modules.MLP(config)¶
Bases: torch.nn.Module
Implements a Multi-Layer Perceptron (MLP) with a configurable number of layers, hidden dimension, activation function, and weight initialization method. Only one hidden dimension is supported for simplicity, i.e., all hidden layers have the same dimension. The MLP always has one input layer and one output layer. When num_layers=0, the MLP is a two-layer network with one non-linearity in between. When num_layers>=1, the MLP has additional hidden layers.
Initialize the MLP.
- Parameters:
config (noether.core.schemas.modules.mlp.MLPConfig) – Configuration object for the MLP. See
MLPConfig for available options.
- init_weights¶
- mlp¶
- reset_parameters()¶
Reset the parameters of the MLP with a specific initialization. Options are “torch” (i.e., the default) or “truncnormal002”.
- Raises:
NotImplementedError – raised if the specified initialization is not implemented.
- Return type:
None
- forward(x)¶
Forward function of the MLP.
- Parameters:
x (torch.Tensor) – Input tensor to the MLP.
- Returns:
Output tensor from the MLP.
- Return type:
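The num_layers semantics described above (input and output layers are always present; num_layers counts the additional hidden layers) can be made concrete with a plain-Python sketch; the helper name and dimensions here are illustrative:

```python
# num_layers counts *additional* hidden layers beyond the mandatory
# input and output layers.
def mlp_layer_dims(input_dim, hidden_dim, output_dim, num_layers):
    dims = [input_dim, hidden_dim]          # input layer
    dims += [hidden_dim] * num_layers       # extra hidden layers
    dims += [output_dim]                    # output layer
    # Linear layers are the consecutive (in, out) pairs.
    return list(zip(dims[:-1], dims[1:]))

# num_layers=0 -> two linear layers with one non-linearity in between.
assert mlp_layer_dims(16, 32, 8, num_layers=0) == [(16, 32), (32, 8)]
# num_layers=1 -> one additional hidden layer.
assert mlp_layer_dims(16, 32, 8, num_layers=1) == [(16, 32), (32, 32), (32, 8)]
```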
- class noether.modeling.modules.UpActDownMlp(config)¶
Bases: torch.nn.Module
UpActDownMlp is a vanilla MLP with an up-projection, followed by a GELU activation function and a down-projection back to the original input dimension.
Initialize the UpActDownMlp.
- Parameters:
config (noether.core.schemas.modules.mlp.UpActDownMLPConfig) – The configuration of the UpActDownMlp.
- init_weights¶
- fc1¶
- act¶
- fc2¶
- reset_parameters()¶
Reset the parameters of the MLP with a specific initialization. Options are “torch” (i.e., the default) or “truncnormal002”.
- Raises:
NotImplementedError – raised if the specified initialization is not implemented.
- Return type:
None
- forward(x)¶
Forward function of the UpActDownMlp.
- Parameters:
x (torch.Tensor) – Input tensor to the MLP.
- Returns:
Output tensor from the MLP.
- Return type:
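The up-project / activate / down-project pattern is the standard transformer feedforward block. A NumPy sketch (the 4x expansion factor and the tanh GELU approximation are common conventions, assumed here rather than read from UpActDownMLPConfig):

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def up_act_down(x, w_up, w_down):
    """Up-project, apply GELU, down-project back to the input dimension."""
    return gelu(x @ w_up) @ w_down

rng = np.random.default_rng(6)
dim, hidden = 8, 32                        # 4x expansion, a common choice (assumed)
w_up = rng.standard_normal((dim, hidden)) * 0.02    # fc1 weights
w_down = rng.standard_normal((hidden, dim)) * 0.02  # fc2 weights
x = rng.standard_normal((2, 5, dim))
out = up_act_down(x, w_up, w_down)
assert out.shape == x.shape                # dimensionality is preserved
```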