noether.core.models¶
Submodules¶
Classes¶
ModelBase — Base class for models (Model and CompositeModel) that defines the interface for all models trainable by the BaseTrainer.

CompositeModel — A composite model that consists of multiple submodels of type Model. With multiple submodels, each submodel can have its own optimizer, learning rate scheduler, frozen weights, etc.

Model — Model class that should be extended by all custom models.
Package Contents¶
- class noether.core.models.ModelBase(model_config, update_counter=None, path_provider=None, data_container=None, initializer_config=None)¶
Bases:
torch.nn.Module

Base class for models (Model and CompositeModel) that is used to define the interface for all models trainable by the BaseTrainer. Provides methods to initialize the model weights and set up (model-specific) optimizers.
- Parameters:
model_config (noether.core.schemas.models.ModelBaseConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter (noether.core.utils.training.counter.UpdateCounter | None) – The UpdateCounter provided to the optimizer.
path_provider (noether.core.providers.PathProvider | None) – The PathProvider used by the initializer to store or retrieve checkpoints.
data_container (noether.data.container.DataContainer | None) – The DataContainer which includes the data and dataloader. Currently unused, but helpful for quick prototyping, evaluating forward in debug mode, etc.
initializer_config (list[noether.core.schemas.initializers.InitializerConfig] | None) – The initializer config used to initialize the model, e.g. from a checkpoint.
- logger¶
- name¶
- update_counter = None¶
- path_provider = None¶
- data_container = None¶
- initializers: list[noether.core.initializers.InitializerBase] = []¶
- model_config¶
- is_initialized = False¶
- property optimizer: noether.core.optimizer.OptimizerWrapper | None¶
- Return type:
noether.core.optimizer.OptimizerWrapper | None
- property device: torch.device¶
- Abstractmethod:
- Return type:
torch.device
- property trainable_param_count: int¶
Returns the number of parameters that require gradients (i.e., are trainable).
- Return type:
int
- property frozen_param_count: int¶
Returns the number of parameters that do not require gradients (i.e., are frozen).
- Return type:
int
- property nograd_paramnames: list[str]¶
Returns a list of parameter names that do not have gradients (i.e., grad is None) but require gradients.
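Taken together, these parameter-introspection properties can be sketched with plain PyTorch. This is a minimal sketch using free functions over any torch.nn.Module; the names mirror the properties above, but the actual implementation may differ:

```python
import torch

def trainable_param_count(module: torch.nn.Module) -> int:
    # Parameters with requires_grad=True will be updated by the optimizer.
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

def frozen_param_count(module: torch.nn.Module) -> int:
    # Frozen parameters are excluded from gradient computation.
    return sum(p.numel() for p in module.parameters() if not p.requires_grad)

def nograd_paramnames(module: torch.nn.Module) -> list[str]:
    # Names of parameters that require gradients but currently have
    # grad is None, e.g. because they never took part in a backward pass.
    return [name for name, p in module.named_parameters()
            if p.requires_grad and p.grad is None]

model = torch.nn.Linear(4, 2)        # 4*2 weights + 2 biases = 10 parameters
model.bias.requires_grad = False     # freeze the bias
print(trainable_param_count(model))  # 8
print(frozen_param_count(model))     # 2
print(nograd_paramnames(model))      # ['weight'] (no backward pass ran yet)
```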
- initialize()¶
Initializes weights and optimizer parameters of the model.
- abstractmethod get_named_models()¶
Returns a dict of {model_name: model}, e.g., to log all learning rates of all models/submodels.
- abstractmethod initialize_weights()¶
Initialize the weights of the model.
- Return type:
Self
- abstractmethod apply_initializers()¶
Apply the initializers to the model.
- Return type:
Self
- abstractmethod initialize_optimizer()¶
Initialize the optimizer of the model.
- Return type:
None
- abstractmethod optimizer_step(grad_scaler)¶
Perform an optimization step.
- Parameters:
grad_scaler (torch.amp.grad_scaler.GradScaler | None)
- Return type:
None
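The optimizer_step contract (a parameter update routed through an optional GradScaler) can be sketched with plain PyTorch. This is a hedged illustration, not the library's actual code; the clip_grad_norm_ call is an illustrative extra that matches the clip_grad_norm option shown in the Model example further below:

```python
import torch

def optimizer_step(model, optimizer, grad_scaler=None):
    # With a GradScaler (mixed precision), gradients must be unscaled
    # before clipping, and the parameter update goes through the scaler.
    if grad_scaler is not None:
        grad_scaler.unscale_(optimizer)
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        grad_scaler.step(optimizer)  # skips the step if grads contain inf/nan
        grad_scaler.update()
    else:
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
before = model.weight.detach().clone()
optimizer_step(model, optimizer, grad_scaler=None)
assert not torch.equal(model.weight, before)  # the parameters were updated
```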
- abstractmethod optimizer_schedule_step()¶
Perform the optimizer learning rate scheduler step.
- Return type:
None
- class noether.core.models.CompositeModel(model_config, update_counter=None, path_provider=None, data_container=None)¶
Bases:
noether.core.models.base.ModelBase

A composite model that consists of multiple submodels of type Model. With multiple submodels, each submodel can have its own optimizer, learning rate scheduler, frozen weights, etc. This is useful for multi-component models.
A composite model must implement the submodels property, which returns a dictionary of submodel names to submodel instances.
Example code (dummy code):
```python
from typing import Any

import torch

from noether.core.models.composite import CompositeModel
from somewhere import MyModel1, MyModel2


class MyCompositeModel(CompositeModel):
    def __init__(
        self,
        model_config: MyCompositeModelConfig,
        update_counter: UpdateCounter | None = None,
        path_provider: PathProvider | None = None,
        data_container: DataContainer | None = None,
        static_context: dict[str, Any] | None = None,
    ):
        super().__init__(model_config, ...)
        self.submodel1 = MyModel1(
            model_config=model_config.submodel1_config,
            is_frozen=model_config.is_frozen,
            update_counter=update_counter,
            path_provider=path_provider,
            data_container=data_container,
            static_context=static_context,
            optimizer_config=model_config.submodel1_config.optimizer_config,
        )
        self.submodel2 = MyModel2(model_config=model_config.submodel2_config, ...)

    @property
    def submodels(self) -> dict[str, Model]:
        return dict(
            submodel1=self.submodel1,
            submodel2=self.submodel2,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # define forward pass here using self.submodel1 and self.submodel2
        x = self.submodel1(x)
        x = self.submodel2(x)
        return x
```
Base class for composite models, i.e. models that consist of multiple submodels of type Model.
- Parameters:
model_config (noether.core.schemas.models.ModelBaseConfig)
update_counter (noether.core.utils.training.UpdateCounter | None)
path_provider (noether.core.providers.path.PathProvider | None)
data_container (noether.data.container.DataContainer | None)
- property submodels: dict[str, noether.core.models.base.ModelBase]¶
- Abstractmethod:
Returns the submodels of the composite model. This property must be implemented by the subclass; otherwise a NotImplementedError is raised.
- Return type:
dict[str, noether.core.models.base.ModelBase]
- get_named_models()¶
Returns a dict of {model_name: model}, e.g., to log all learning rates of all models/submodels.
- property device: torch.device¶
- Return type:
torch.device
- initialize_weights()¶
Initialize the weights of the model, calling the initializer of all submodules.
- Return type:
Self
- apply_initializers()¶
Apply the initializers to the model, calling the initializer of all submodules.
- Return type:
Self
- initialize_optimizer()¶
Initialize the optimizer of the model.
- Return type:
None
- optimizer_step(grad_scaler)¶
Perform an optimization step, calling all submodules’ optimization steps.
- Parameters:
grad_scaler (torch.amp.grad_scaler.GradScaler | None)
- Return type:
None
- optimizer_schedule_step()¶
Perform the optimizer learning rate scheduler step, calling all submodules’ scheduler steps.
- Return type:
None
- optimizer_zero_grad(set_to_none=True)¶
Zero the gradients of the optimizer, calling all submodules’ zero_grad methods.
- Parameters:
set_to_none (bool)
- Return type:
None
- train(mode=True)¶
Set the model to train or eval mode.
Overrides the nn.Module.train method to avoid setting the model to train mode if it is frozen, and calls all submodules’ train methods.
- Parameters:
mode – If True, set the model to train mode. If False, set the model to eval mode.
- Return type:
Self
- to(device, *args, **kwargs)¶
Performs Tensor dtype and/or device conversion, calling all submodules’ to methods.
- Parameters:
device – The desired device of the tensor. Can be a string (e.g. “cuda:0”) or “cpu”.
- Return type:
Self
- class noether.core.models.Model(model_config, is_frozen=False, update_counter=None, path_provider=None, data_container=None)¶
Bases:
noether.core.models.base.ModelBase

Model class that should be extended by all custom models. Each model has its own optimizer and learning rate scheduler, which are initialized in the initialize_optimizer method.
Example code (dummy code):
The model configuration as YAML:

```yaml
kind: path.to.MyModel
name: my_model
optimizer_config:
  kind: torch.optim.AdamW
  lr: 1.0e-3
  weight_decay: 0.05
  clip_grad_norm: 1.0
  schedule_config:
    kind: noether.core.schedules.LinearWarmupCosineDecaySchedule
    warmup_percent: 0.05
    end_value: 1.0e-6
    max_value: ${model.optimizer_config.lr}
```

The corresponding config schema and model:

```python
from noether.core.models.single import Model
from noether.core.schemas.models import ModelBaseConfig


class MyModelConfig(ModelBaseConfig):
    input_dim: int = 128
    hidden_dim: int = 256
    output_dim: int = 10


class MyModel(Model):
    def __init__(self, model_config: MyModelConfig, ...):
        super().__init__(model_config, ...)
        self.layer1 = torch.nn.Linear(self.model_config.input_dim, self.model_config.hidden_dim)
        self.layer2 = torch.nn.Linear(self.model_config.hidden_dim, self.model_config.output_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # define forward pass here
        x = self.layer1(x)
        x = torch.relu(x)
        x = self.layer2(x)
        return x
```
Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.
- Parameters:
model_config (noether.core.schemas.models.ModelBaseConfig) – Model configuration. See ModelBaseConfig for available options.
is_frozen (bool) – If True, sets requires_grad of all parameters to False and puts the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
update_counter (noether.core.utils.training.UpdateCounter | None) – The UpdateCounter provided to the optimizer.
path_provider (noether.core.providers.PathProvider | None) – The PathProvider used by the initializer to store or retrieve checkpoints.
data_container (noether.data.container.DataContainer | None) – The DataContainer which includes the data and dataloader. Currently unused, but helpful for quick prototyping, evaluating forward in debug mode, etc.
- property device: torch.device¶
- Return type:
torch.device
- get_named_models()¶
Returns a dict of {model_name: model}, e.g., to log all learning rates of all models/submodels.
- initialize_weights()¶
Freezes the weights of the model by setting requires_grad to False if self.is_frozen is True.
- Return type:
Self
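The freezing behaviour described above can be sketched in plain PyTorch (a minimal sketch; freeze is a hypothetical helper name, not part of the library):

```python
import torch

def freeze(module: torch.nn.Module) -> torch.nn.Module:
    # Disable gradient computation for every parameter and switch the
    # module to eval mode (affects Dropout, BatchNorm, etc.).
    for p in module.parameters():
        p.requires_grad = False
    return module.eval()

model = freeze(torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Dropout(0.5)))
assert all(not p.requires_grad for p in model.parameters())
assert not model.training  # the whole module tree is in eval mode
```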
- apply_initializers()¶
Apply the initializers to the model, calling initializer.init_weights and initializer.init_optim.
- Return type:
Self
- initialize_optimizer()¶
Initialize the optimizer.
- Return type:
None
- optimizer_step(grad_scaler)¶
Perform an optimization step.
- Parameters:
grad_scaler (torch.amp.grad_scaler.GradScaler | None)
- Return type:
None
- optimizer_schedule_step()¶
Perform the optimizer learning rate scheduler step.
- Return type:
None
- optimizer_zero_grad(set_to_none=True)¶
Zero the gradients of the optimizer.
- Parameters:
set_to_none (bool)
- Return type:
None
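The set_to_none flag follows the standard torch.optim.Optimizer.zero_grad semantics: with set_to_none=True (the default), the .grad attributes are released (set to None) rather than zero-filled, which saves memory and a memset per parameter. A minimal illustration:

```python
import torch

model = torch.nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model(torch.randn(3, 2)).sum().backward()
assert model.weight.grad is not None  # gradients populated by backward()

# set_to_none=True releases the gradient tensors instead of zeroing them.
optimizer.zero_grad(set_to_none=True)
assert model.weight.grad is None
```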
- train(mode=True)¶
Set the model to train or eval mode.
Overrides the nn.Module.train method to avoid setting the model to train mode if it is frozen.
- Parameters:
mode (bool) – If True, set the model to train mode. If False, set the model to eval mode.
- Return type:
Self
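The override described above can be sketched as follows (a minimal sketch in plain PyTorch, assuming an is_frozen attribute like the constructor argument; not the library's actual code):

```python
import torch

class FrozenAwareModule(torch.nn.Module):
    # Hypothetical sketch: a module that refuses to enter train mode
    # while frozen, mirroring the documented train() override.
    def __init__(self, is_frozen: bool = False):
        super().__init__()
        self.is_frozen = is_frozen
        self.layer = torch.nn.Linear(2, 2)

    def train(self, mode: bool = True):
        # A frozen model always stays in eval mode; returns self like
        # nn.Module.train does.
        return super().train(mode and not self.is_frozen)

m = FrozenAwareModule(is_frozen=True)
m.train()
assert not m.training  # still in eval mode despite the train() call
```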
- to(device, *args, **kwargs)¶
Performs Tensor dtype and/or device conversion, overwriting nn.Module.to method to set the _device attribute.
- Parameters:
device (str | torch.device | int | None) – The desired device of the tensor. Can be a string (e.g. “cuda:0”) or “cpu”.
- Return type:
Self
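A to() override that records the target device, as described above, can be sketched like this (DeviceTrackingModule is a hypothetical name; the library's actual override may differ):

```python
import torch

class DeviceTrackingModule(torch.nn.Module):
    # Hypothetical sketch of a to() override that remembers the target
    # device in a _device attribute before delegating to nn.Module.to.
    def __init__(self):
        super().__init__()
        self._device = torch.device("cpu")
        self.layer = torch.nn.Linear(2, 2)

    def to(self, device, *args, **kwargs):
        # Accepts a string (e.g. "cuda:0"), a torch.device, or an int,
        # matching the documented signature.
        self._device = torch.device(device)
        return super().to(device, *args, **kwargs)

m = DeviceTrackingModule().to("cpu")
assert m._device == torch.device("cpu")
```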