noether.modeling.modules.attention.transolver_plusplus

Classes

TransolverPlusPlusAttentionConfig

Configuration for the Transolver++ attention module.

TransolverPlusPlusAttention

Transolver++ Attention module as implemented in https://github.com/thuml/Transolver_plus/blob/main/models/Transolver_plus.py

Module Contents

class noether.modeling.modules.attention.transolver_plusplus.TransolverPlusPlusAttentionConfig(/, **data)

Bases: noether.modeling.modules.attention.TransolverAttentionConfig

Configuration for the Transolver++ attention module.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

use_overparameterization: bool = None

Whether to use overparameterization for the slice projection.

use_adaptive_temperature: bool = None

Whether to use an adaptive temperature for the slice selection.

temperature_activation: Literal['sigmoid', 'softplus', 'exp'] | None = None

Activation function for the adaptive temperature.

use_gumbel_softmax: bool = None

Whether to use Gumbel-Softmax for the slice selection.
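The options above can be illustrated with a minimal sketch of how a temperature and Gumbel noise typically shape a soft slice assignment. This is not the module's implementation: it uses plain Python rather than PyTorch to stay dependency-free, and the helper names (`positive_temperature`, `slice_weights`), the `floor` clamp, and the way the activation is applied are all assumptions made for illustration.

```python
import math
import random


def positive_temperature(raw, activation="softplus", floor=0.01):
    """Map an unconstrained scalar to a positive temperature (hypothetical helper)."""
    if activation == "sigmoid":
        value = 1.0 / (1.0 + math.exp(-raw))
    elif activation == "softplus":
        # Numerically stable form of log(1 + exp(raw)).
        value = max(raw, 0.0) + math.log1p(math.exp(-abs(raw)))
    elif activation == "exp":
        value = math.exp(raw)
    else:
        raise ValueError(f"unknown activation: {activation!r}")
    # Assumed lower clamp so that dividing by the temperature stays stable.
    return max(value, floor)


def softmax(logits):
    """Softmax over a list of floats, shifted by the maximum for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def slice_weights(logits, tau, use_gumbel=False, rng=None):
    """Soft assignment of one point to slices (hypothetical helper)."""
    if use_gumbel:
        rng = rng or random.Random()
        # Gumbel(0, 1) noise: -log(-log(u)) with u drawn from Uniform(0, 1).
        logits = [
            v - math.log(-math.log(rng.uniform(1e-9, 1.0 - 1e-9)))
            for v in logits
        ]
    # A smaller tau sharpens the distribution; a larger tau flattens it.
    return softmax([v / tau for v in logits])


logits = [2.0, 1.0, 0.0]
smooth = slice_weights(logits, tau=1.0)
sharp = slice_weights(logits, tau=positive_temperature(-3.0, "softplus"))
noisy = slice_weights(logits, tau=0.5, use_gumbel=True, rng=random.Random(0))
```

In this sketch a lower temperature concentrates the weight on the highest-scoring slice, while the Gumbel noise makes the assignment stochastic without changing the fact that the weights sum to one.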

class noether.modeling.modules.attention.transolver_plusplus.TransolverPlusPlusAttention(config)

Bases: torch.nn.Module

Transolver++ Attention module as implemented in https://github.com/thuml/Transolver_plus/blob/main/models/Transolver_plus.py

Parameters:

config (noether.core.schemas.modules.attention.AttentionConfig) – Configuration for the TransolverPlusPlusAttention module. See AttentionConfig for available options.

dim_head
num_heads = None
scale
softmax
dropout = None
bias
proj_temperature
in_project_x
in_project_slice
q
k
v
to_out

forward(x, attn_mask=None)

Forward pass of the Transolver++ attention module.

Parameters:
  • x (torch.Tensor) – Input tensor with shape (batch_size, seqlen, hidden_dim).

  • attn_mask (torch.Tensor | None) – Attention mask tensor with shape (batch_size). Defaults to None.

Returns:

Output tensor after applying the Transolver++ attention mechanism.
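For orientation, the slice and deslice steps that surround attention in Transolver-style modules can be sketched as follows. This is a simplified illustration based on the published Transolver design, not on this module's source: it handles a single batch element and head, omits the attention among slice tokens, and the function names and the `1e-5` normalization constant are assumptions.

```python
def slice_tokens(x, weights):
    """Aggregate N point features into G slice tokens.

    x: list of N points, each a list of C features.
    weights: list of N points, each a list of G slice weights.
    """
    num_slices, dim = len(weights[0]), len(x[0])
    tokens = []
    for g in range(num_slices):
        # Total weight assigned to slice g, used to normalize the token.
        norm = sum(w[g] for w in weights) + 1e-5
        tokens.append(
            [sum(w[g] * p[c] for w, p in zip(weights, x)) / norm for c in range(dim)]
        )
    return tokens


def deslice(tokens, weights):
    """Scatter G slice tokens back onto N points using the same weights."""
    dim = len(tokens[0])
    return [
        [sum(w[g] * tokens[g][c] for g in range(len(tokens))) for c in range(dim)]
        for w in weights
    ]


x = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0]]        # N=3 points, C=2 features
weights = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # hard assignment to G=2 slices
tokens = slice_tokens(x, weights)               # attention would act on these
out = deslice(tokens, weights)                  # back to N=3 points
```

With the hard assignment used here, the first two points share one slice token (roughly their mean), and the output has one row per input point, mirroring how the sequence length is preserved by the forward pass.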