noether.modeling.modules.attention¶
Submodules¶
Attributes¶
Classes¶
- DotProductAttention – Scaled dot-product attention module.
- PerceiverAttention – Perceiver-style attention module, similar to a cross-attention module.
- TransolverAttention – Adapted from https://github.com/thuml/Transolver/blob/main/Car-Design-ShapeNetCar/models/Transolver.py
- TransolverPlusPlusAttention – Transolver++ attention module as implemented in https://github.com/thuml/Transolver_plus/blob/main/models/Transolver_plus.py
Package Contents¶
- class noether.modeling.modules.attention.DotProductAttention(config)¶
Bases: torch.nn.Module

Scaled dot-product attention module.

- Parameters:
config (noether.core.schemas.modules.AttentionConfig) – Configuration for the DotProductAttention module. See AttentionConfig for available options.
- num_heads = None¶
- head_dim¶
- init_weights = None¶
- use_rope = None¶
- dropout = None¶
- proj_dropout¶
- qkv¶
- proj¶
- forward(x, attn_mask=None, freqs=None)¶
Forward function of the DotProductAttention module.
- Parameters:
x (torch.Tensor) – Tensor to apply self-attention over, shape (batch size, sequence length, hidden_dim).
attn_mask (torch.Tensor | None) – For causal attention (i.e., no attention over future tokens), an attention mask should be provided. Defaults to None.
freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of queries/keys. None if use_rope=False.
- Returns:
The output of the attention module.
- Return type:
torch.Tensor
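The AttentionConfig fields are not listed on this page, so the following is a minimal, self-contained sketch of the same mechanism (a merged qkv projection, the multi-head reshape, `torch.nn.functional.scaled_dot_product_attention`, and an output projection). The class and argument names are illustrative, not the actual noether API, and RoPE and dropout are omitted:

```python
from __future__ import annotations

import torch
import torch.nn.functional as F
from torch import nn


class SimpleDotProductAttention(nn.Module):
    """Sketch of scaled dot-product self-attention with a merged qkv layer."""

    def __init__(self, hidden_dim: int, num_heads: int):
        super().__init__()
        assert hidden_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_dim // num_heads
        self.qkv = nn.Linear(hidden_dim, 3 * hidden_dim)  # merged q/k/v projection
        self.proj = nn.Linear(hidden_dim, hidden_dim)     # output projection

    def forward(self, x: torch.Tensor, attn_mask: torch.Tensor | None = None) -> torch.Tensor:
        b, s, d = x.shape
        # (b, s, 3d) -> three tensors of shape (b, num_heads, s, head_dim)
        q, k, v = self.qkv(x).reshape(b, s, 3, self.num_heads, self.head_dim).permute(2, 0, 3, 1, 4)
        out = F.scaled_dot_product_attention(q, k, v, attn_mask=attn_mask)
        out = out.transpose(1, 2).reshape(b, s, d)        # merge heads back
        return self.proj(out)


x = torch.randn(2, 16, 64)  # (batch size, sequence length, hidden_dim)
attn = SimpleDotProductAttention(hidden_dim=64, num_heads=8)
print(attn(x).shape)  # torch.Size([2, 16, 64])
```

The input and output shapes match the `forward` contract above: attention is applied over the sequence dimension and the hidden dimension is preserved.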
- class noether.modeling.modules.attention.PerceiverAttention(config)¶
Bases: torch.nn.Module

Perceiver-style attention module, similar to a cross-attention module.

- Parameters:
config (noether.core.schemas.modules.AttentionConfig) – Configuration for the PerceiverAttention module. See AttentionConfig for available options.
- num_heads = None¶
- head_dim¶
- init_weights = None¶
- use_rope = None¶
- kv¶
- q¶
- proj¶
- dropout = None¶
- proj_dropout¶
- forward(q, kv, attn_mask=None, q_freqs=None, k_freqs=None)¶
Forward function of the PerceiverAttention module.
- Parameters:
q (torch.Tensor) – Query tensor, shape (batch size, number of points/tokens, hidden_dim).
kv (torch.Tensor) – Key/value tensor, shape (batch size, number of latent tokens, hidden_dim).
attn_mask (torch.Tensor | None) – When applying causal attention, an attention mask is required. Defaults to None.
q_freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of queries. None if use_rope=False.
k_freqs (torch.Tensor | None) – Frequencies for Rotary Positional Embedding (RoPE) of keys. None if use_rope=False.
- Returns:
The output of the perceiver attention module.
- Return type:
torch.Tensor
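A sketch of the cross-attention shape flow described above: queries come from one tensor and keys/values from another, so the two sequence lengths may differ, and the output has the query's sequence length. Names are illustrative (not the noether API); RoPE, dropout, and masking are omitted:

```python
import torch
import torch.nn.functional as F
from torch import nn


class SimplePerceiverAttention(nn.Module):
    """Sketch of perceiver-style cross-attention with separate q and kv projections."""

    def __init__(self, hidden_dim: int, num_heads: int):
        super().__init__()
        assert hidden_dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = hidden_dim // num_heads
        self.q = nn.Linear(hidden_dim, hidden_dim)       # query projection
        self.kv = nn.Linear(hidden_dim, 2 * hidden_dim)  # merged key/value projection
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, q: torch.Tensor, kv: torch.Tensor) -> torch.Tensor:
        b, sq, d = q.shape
        sk = kv.shape[1]
        qh = self.q(q).reshape(b, sq, self.num_heads, self.head_dim).transpose(1, 2)
        # (b, sk, 2d) -> key and value tensors of shape (b, num_heads, sk, head_dim)
        k, v = self.kv(kv).reshape(b, sk, 2, self.num_heads, self.head_dim).permute(2, 0, 3, 1, 4)
        out = F.scaled_dot_product_attention(qh, k, v)   # (b, num_heads, sq, head_dim)
        return self.proj(out.transpose(1, 2).reshape(b, sq, d))


points = torch.randn(2, 100, 64)  # query tensor: 100 points/tokens
latents = torch.randn(2, 32, 64)  # key/value tensor: 32 latent tokens
attn = SimplePerceiverAttention(hidden_dim=64, num_heads=8)
print(attn(points, latents).shape)  # torch.Size([2, 100, 64])
```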
- class noether.modeling.modules.attention.TransolverAttention(config)¶
Bases: torch.nn.Module

Adapted from https://github.com/thuml/Transolver/blob/main/Car-Design-ShapeNetCar/models/Transolver.py with the following changes:
- Readable reshaping operations via einops
- Merged qkv linear layer for higher GPU utilization
- F.scaled_dot_product_attention instead of slow PyTorch attention
- Possibility to mask tokens (required to process variable-sized inputs)

- Parameters:
config (noether.core.schemas.modules.AttentionConfig) – Configuration for the Transolver attention module. See AttentionConfig for available options.
- num_heads = None¶
- dropout = None¶
- temperature¶
- in_project_x¶
- in_project_fx¶
- in_project_slice¶
- qkv¶
- proj¶
- proj_dropout¶
- create_slices(x, num_input_points, attn_mask=None)¶
Given a set of points, project them onto a fixed number of slices using slice weights computed per token.
- Parameters:
x (torch.Tensor) – Input tensor with shape (batch_size, num_input_points, hidden_dim).
num_input_points (int) – Number of input points.
attn_mask (torch.Tensor | None) – Mask to exclude certain tokens from the attention. Defaults to None.
- Returns:
Tensor with the projected slice tokens and the slice weights.
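The slicing step can be sketched as follows, assuming (as in the Transolver reference implementation linked above) a temperature-scaled softmax over per-token slice logits and a weight-normalized aggregation of token features into slice tokens. The function name, the slice count, and the mask convention (1 = keep token) are assumptions for illustration, not the actual noether implementation:

```python
from __future__ import annotations

import torch


def create_slices_sketch(
    x: torch.Tensor,
    slice_logits: torch.Tensor,
    temperature: float = 0.5,
    attn_mask: torch.Tensor | None = None,
) -> tuple[torch.Tensor, torch.Tensor]:
    """Each token distributes itself over a fixed number of slices via a
    temperature-scaled softmax; each slice token is the weight-normalized
    average of the features assigned to it."""
    # slice_logits: (batch, num_points, num_slices) -> weights over slices per token
    weights = torch.softmax(slice_logits / temperature, dim=-1)
    if attn_mask is not None:
        # zero out padded tokens before aggregation (mask: 1 = keep, 0 = drop)
        weights = weights * attn_mask.unsqueeze(-1)
    # aggregate token features into slice tokens: (batch, num_slices, hidden_dim)
    slice_tokens = torch.einsum("bns,bnd->bsd", weights, x)
    norm = weights.sum(dim=1).unsqueeze(-1).clamp_min(1e-6)  # per-slice weight mass
    return slice_tokens / norm, weights


x = torch.randn(2, 50, 32)      # 50 input points, hidden_dim 32
logits = torch.randn(2, 50, 8)  # 8 slices (assumed slice count)
tokens, w = create_slices_sketch(x, logits)
print(tokens.shape, w.shape)  # torch.Size([2, 8, 32]) torch.Size([2, 50, 8])
```

Because the weights are a softmax over the slice dimension, each unmasked token contributes a total weight of 1 spread across the slices.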
- forward(x, attn_mask=None)¶
Forward pass of the Transolver attention module.
- Parameters:
x (torch.Tensor) – Input tensor with shape (batch_size, seqlen, hidden_dim).
attn_mask (torch.Tensor | None) – Attention mask tensor with shape (batch_size). Defaults to None.
- Returns:
Tensor after applying the Transolver attention mechanism.
- class noether.modeling.modules.attention.TransolverPlusPlusAttention(config)¶
Bases: torch.nn.Module

Transolver++ attention module as implemented in https://github.com/thuml/Transolver_plus/blob/main/models/Transolver_plus.py

- Parameters:
config (noether.core.schemas.modules.AttentionConfig) – Configuration for the TransolverPlusPlusAttention module. See AttentionConfig for available options.
- dim_head¶
- num_heads = None¶
- scale¶
- softmax¶
- dropout = None¶
- bias¶
- proj_temperature¶
- in_project_x¶
- in_project_slice¶
- qkv¶
- to_out¶
- forward(x, attn_mask=None)¶
Forward pass of the Transolver++ attention module.
- Parameters:
x (torch.Tensor) – Input tensor with shape (batch_size, seqlen, hidden_dim).
attn_mask (torch.Tensor | None) – Attention mask tensor with shape (batch_size). Defaults to None.
- Returns:
Tensor after applying the Transolver++ attention mechanism.
- noether.modeling.modules.attention.ATTENTION_REGISTRY: dict[str, type[torch.nn.Module]]¶
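The registry's concrete keys are not listed on this page, so the following only sketches the usual name-to-class lookup pattern behind a `dict[str, type[torch.nn.Module]]` registry. The registry contents and the `build` helper are hypothetical stand-ins, not part of noether:

```python
from torch import nn

# Hypothetical registry mirroring ATTENTION_REGISTRY's dict[str, type[nn.Module]]
# shape; the real keys and attention classes live in
# noether.modeling.modules.attention.ATTENTION_REGISTRY.
REGISTRY: dict[str, type[nn.Module]] = {
    "identity": nn.Identity,
    "relu": nn.ReLU,
}


def build(name: str, *args, **kwargs) -> nn.Module:
    """Resolve a module class by its registry key and instantiate it."""
    try:
        cls = REGISTRY[name]
    except KeyError:
        raise ValueError(f"unknown module type {name!r}; options: {sorted(REGISTRY)}") from None
    return cls(*args, **kwargs)


layer = build("identity")
print(type(layer).__name__)  # Identity
```

Keeping the registry typed as `dict[str, type[nn.Module]]` lets configuration files select an attention variant by string while construction stays in one place.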