noether.core.models¶
Submodules¶
Classes¶
ModelBase — Base class for models (Model and CompositeModel) that defines the interface for all models trainable by the BaseTrainer.

CompositeModel — A composite model that consists of multiple submodels of type Model. With multiple submodels, each submodel can have its own optimizer, learning rate scheduler, frozen weights, etc.

Model — Model class that should be extended by all custom models.
Package Contents¶
- class noether.core.models.ModelBase(model_config, update_counter=None, path_provider=None, data_container=None, initializer_config=None)¶
Bases:
torch.nn.Module

Base class for models (Model and CompositeModel) that is used to define the interface for all models trainable by the BaseTrainer. Provides methods to initialize the model weights and set up (model-specific) optimizers.
- Parameters:
model_config (noether.core.schemas.models.ModelBaseConfig) – Model configuration. See ModelBaseConfig for available options.
update_counter (noether.core.utils.training.counter.UpdateCounter | None) – The UpdateCounter provided to the optimizer.
path_provider (noether.core.providers.PathProvider | None) – The PathProvider used by the initializer to store or retrieve checkpoints.
data_container (noether.data.container.DataContainer | None) – The DataContainer which includes the data and dataloader. Currently unused, but helpful for quick prototyping, evaluating forward in debug mode, etc.
initializer_config (list[noether.core.schemas.initializers.InitializerConfig] | None) – The initializer config used to initialize the model, e.g. from a checkpoint.
- logger¶
- name¶
- update_counter = None¶
- path_provider = None¶
- data_container = None¶
- initializers: list[noether.core.initializers.InitializerBase] = []¶
- model_config¶
- is_initialized = False¶
- property optimizer: noether.core.optimizer.OptimizerWrapper | None¶
- Return type:
noether.core.optimizer.OptimizerWrapper | None
- property device: torch.device¶
- Abstractmethod:
- Return type:
torch.device
- property trainable_param_count: int¶
Returns the number of parameters that require gradients (i.e., are trainable).
- Return type:
int
- property frozen_param_count: int¶
Returns the number of parameters that do not require gradients (i.e., are frozen).
- Return type:
int
- property nograd_paramnames: list[str]¶
Returns a list of parameter names that do not have gradients (i.e., grad is None) but require gradients.
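Taken together, these parameter-introspection properties can be sketched with plain PyTorch. This is a minimal sketch using free functions over any torch.nn.Module; the names mirror the properties above, but the actual implementation may differ:

```python
import torch

def trainable_param_count(module: torch.nn.Module) -> int:
    # Parameters with requires_grad=True will be updated by the optimizer.
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

def frozen_param_count(module: torch.nn.Module) -> int:
    # Frozen parameters are excluded from gradient computation.
    return sum(p.numel() for p in module.parameters() if not p.requires_grad)

def nograd_paramnames(module: torch.nn.Module) -> list[str]:
    # Names of parameters that require gradients but currently have
    # grad is None, e.g. because they never took part in a backward pass.
    return [name for name, p in module.named_parameters()
            if p.requires_grad and p.grad is None]

model = torch.nn.Linear(4, 2)        # 4*2 weights + 2 biases = 10 parameters
model.bias.requires_grad = False     # freeze the bias
print(trainable_param_count(model))  # 8
print(frozen_param_count(model))     # 2
print(nograd_paramnames(model))      # ['weight'] (no backward pass ran yet)
```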
- initialize()¶
Initializes weights and optimizer parameters of the model.
- abstractmethod get_named_models()¶
Returns a dict of {model_name: model}, e.g., to log all learning rates of all models/submodels.
- abstractmethod initialize_weights()¶
Initialize the weights of the model.
- Return type:
Self
- abstractmethod apply_initializers()¶
Apply the initializers to the model.
- Return type:
Self
- abstractmethod initialize_optimizer()¶
Initialize the optimizer of the model.
- Return type:
None
- abstractmethod optimizer_step(grad_scaler)¶
Perform an optimization step.
- Parameters:
grad_scaler (torch.amp.grad_scaler.GradScaler | None)
- Return type:
None
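The optimizer_step contract (a parameter update routed through an optional GradScaler) can be sketched with plain PyTorch. This is a hedged illustration, not the library's actual code; the clip_grad_norm_ call is an illustrative extra that matches the clip_grad_norm option shown in the Model example further below:

```python
import torch

def optimizer_step(model, optimizer, grad_scaler=None):
    # With a GradScaler (mixed precision), gradients must be unscaled
    # before clipping, and the parameter update goes through the scaler.
    if grad_scaler is not None:
        grad_scaler.unscale_(optimizer)
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        grad_scaler.step(optimizer)  # skips the step if grads contain inf/nan
        grad_scaler.update()
    else:
        torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
        optimizer.step()

model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
before = model.weight.detach().clone()
optimizer_step(model, optimizer, grad_scaler=None)
assert not torch.equal(model.weight, before)  # the parameters were updated
```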
- abstractmethod optimizer_schedule_step()¶
Perform the optimizer learning rate scheduler step.
- Return type:
None
- class noether.core.models.CompositeModel(model_config, update_counter=None, path_provider=None, data_container=None)¶
Bases:
noether.core.models.base.ModelBase

A composite model that consists of multiple submodels of type Model. With multiple submodels, each submodel can have its own optimizer, learning rate scheduler, frozen weights, etc. This is useful for multi-component models.
A composite model must implement the submodels property, which returns a dictionary of submodel names to submodel instances.
Example code (dummy code):
```python
from typing import Any

import torch

from noether.core.models.composite import CompositeModel
from somewhere import MyModel1, MyModel2


class MyCompositeModel(CompositeModel):
    def __init__(
        self,
        model_config: MyCompositeModelConfig,
        update_counter: UpdateCounter | None = None,
        path_provider: PathProvider | None = None,
        data_container: DataContainer | None = None,
        static_context: dict[str, Any] | None = None,
    ):
        super().__init__(model_config, ...)
        self.submodel1 = MyModel1(
            model_config=model_config.submodel1_config,
            is_frozen=model_config.is_frozen,
            update_counter=update_counter,
            path_provider=path_provider,
            data_container=data_container,
            static_context=static_context,
            optimizer_config=model_config.submodel1_config.optimizer_config,
        )
        self.submodel2 = MyModel2(model_config=model_config.submodel2_config, ...)

    @property
    def submodels(self) -> dict[str, Model]:
        return dict(
            submodel1=self.submodel1,
            submodel2=self.submodel2,
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # define forward pass here using self.submodel1 and self.submodel2
        x = self.submodel1(x)
        x = self.submodel2(x)
        return x
```
Base class for composite models, i.e. models that consist of multiple submodels of type Model.
- Parameters:
model_config (noether.core.schemas.models.ModelBaseConfig)
update_counter (noether.core.utils.training.UpdateCounter | None)
path_provider (noether.core.providers.path.PathProvider | None)
data_container (noether.data.container.DataContainer | None)
- property submodels: dict[str, noether.core.models.base.ModelBase]¶
- Abstractmethod:
Returns the submodels of the composite model. This property must be implemented by the subclass; otherwise a NotImplementedError is raised.
- Return type:
dict[str, noether.core.models.base.ModelBase]
- get_named_models()¶
Returns a dict of {model_name: model}, e.g., to log all learning rates of all models/submodels.
- property device: torch.device¶
- Return type:
torch.device
- initialize_weights()¶
Initialize the weights of the model, calling the initializer of all submodules.
- Return type:
Self
- apply_initializers()¶
Apply the initializers to the model, calling the initializer of all submodules.
- Return type:
Self
- initialize_optimizer()¶
Initialize the optimizer of the model.
- Return type:
None
- optimizer_step(grad_scaler)¶
Perform an optimization step, calling all submodules’ optimization steps.
- Parameters:
grad_scaler (torch.amp.grad_scaler.GradScaler | None)
- Return type:
None
- optimizer_schedule_step()¶
Perform the optimizer learning rate scheduler step, calling all submodules’ scheduler steps.
- Return type:
None
- optimizer_zero_grad(set_to_none=True)¶
Zero the gradients of the optimizer, calling all submodules’ zero_grad methods.
- Parameters:
set_to_none (bool)
- Return type:
None
- train(mode=True)¶
Set the model to train or eval mode.
Overrides the nn.Module.train method to avoid setting the model to train mode if it is frozen, and calls all submodules’ train methods.
- Parameters:
mode – If True, set the model to train mode. If False, set the model to eval mode.
- Return type:
Self
- to(device, *args, **kwargs)¶
Performs Tensor dtype and/or device conversion, calling all submodules’ to methods.
- Parameters:
device – The desired device of the tensor. Can be a string (e.g. “cuda:0”) or “cpu”.
- Return type:
Self
- class noether.core.models.Model(model_config, is_frozen=False, update_counter=None, path_provider=None, data_container=None)¶
Bases:
noether.core.models.base.ModelBase

Model class that should be extended by all custom models. Each model has its own optimizer and learning rate scheduler, which are initialized in the initialize_optimizer method.
Example code (dummy code):
The model configuration as YAML:

```yaml
kind: path.to.MyModel
name: my_model
optimizer_config:
  kind: torch.optim.AdamW
  lr: 1.0e-3
  weight_decay: 0.05
  clip_grad_norm: 1.0
  schedule_config:
    kind: noether.core.schedules.LinearWarmupCosineDecaySchedule
    warmup_percent: 0.05
    end_value: 1.0e-6
    max_value: ${model.optimizer_config.lr}
```

The corresponding config schema and model:

```python
from noether.core.models.single import Model
from noether.core.schemas.models import ModelBaseConfig


class MyModelConfig(ModelBaseConfig):
    input_dim: int = 128
    hidden_dim: int = 256
    output_dim: int = 10


class MyModel(Model):
    def __init__(self, model_config: MyModelConfig, ...):
        super().__init__(model_config, ...)
        self.layer1 = torch.nn.Linear(self.model_config.input_dim, self.model_config.hidden_dim)
        self.layer2 = torch.nn.Linear(self.model_config.hidden_dim, self.model_config.output_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # define forward pass here
        x = self.layer1(x)
        x = torch.relu(x)
        x = self.layer2(x)
        return x
```
Base class for single models, i.e. one model with one optimizer as opposed to CompositeModel.
- Parameters:
model_config (noether.core.schemas.models.ModelBaseConfig) – Model configuration. See ModelBaseConfig for available options.
is_frozen (bool) – If True, sets requires_grad of all parameters to False and puts the model into eval mode (e.g., to put Dropout or BatchNorm into eval mode).
update_counter (noether.core.utils.training.UpdateCounter | None) – The UpdateCounter provided to the optimizer.
path_provider (noether.core.providers.PathProvider | None) – The PathProvider used by the initializer to store or retrieve checkpoints.
data_container (noether.data.container.DataContainer | None) – The DataContainer which includes the data and dataloader. Currently unused, but helpful for quick prototyping, evaluating forward in debug mode, etc.
- property device: torch.device¶
- Return type:
torch.device
- get_named_models()¶
Returns a dict of {model_name: model}, e.g., to log all learning rates of all models/submodels.
- initialize_weights()¶
Freezes the weights of the model by setting requires_grad to False if self.is_frozen is True.
- Return type:
Self
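The freezing behaviour described above can be sketched in plain PyTorch (a minimal sketch; freeze is a hypothetical helper name, not part of the library):

```python
import torch

def freeze(module: torch.nn.Module) -> torch.nn.Module:
    # Disable gradient computation for every parameter and switch the
    # module to eval mode (affects Dropout, BatchNorm, etc.).
    for p in module.parameters():
        p.requires_grad = False
    return module.eval()

model = freeze(torch.nn.Sequential(torch.nn.Linear(4, 4), torch.nn.Dropout(0.5)))
assert all(not p.requires_grad for p in model.parameters())
assert not model.training  # the whole module tree is in eval mode
```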
- apply_initializers()¶
Apply the initializers to the model, calling initializer.init_weights and initializer.init_optim.
- Return type:
Self
- initialize_optimizer()¶
Initialize the optimizer.
- Return type:
None
- optimizer_step(grad_scaler)¶
Perform an optimization step.
- Parameters:
grad_scaler (torch.amp.grad_scaler.GradScaler | None)
- Return type:
None
- optimizer_schedule_step()¶
Perform the optimizer learning rate scheduler step.
- Return type:
None
- optimizer_zero_grad(set_to_none=True)¶
Zero the gradients of the optimizer.
- Parameters:
set_to_none (bool)
- Return type:
None
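The set_to_none flag follows the standard torch.optim.Optimizer.zero_grad semantics: with set_to_none=True (the default), the .grad attributes are released (set to None) rather than zero-filled, which saves memory and a memset per parameter. A minimal illustration:

```python
import torch

model = torch.nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
model(torch.randn(3, 2)).sum().backward()
assert model.weight.grad is not None  # gradients populated by backward()

# set_to_none=True releases the gradient tensors instead of zeroing them.
optimizer.zero_grad(set_to_none=True)
assert model.weight.grad is None
```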
- train(mode=True)¶
Set the model to train or eval mode.
Overrides the nn.Module.train method to avoid setting the model to train mode if it is frozen.
- Parameters:
mode (bool) – If True, set the model to train mode. If False, set the model to eval mode.
- Return type:
Self
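The override described above can be sketched as follows (a minimal sketch in plain PyTorch, assuming an is_frozen attribute like the constructor argument; not the library's actual code):

```python
import torch

class FrozenAwareModule(torch.nn.Module):
    # Hypothetical sketch: a module that refuses to enter train mode
    # while frozen, mirroring the documented train() override.
    def __init__(self, is_frozen: bool = False):
        super().__init__()
        self.is_frozen = is_frozen
        self.layer = torch.nn.Linear(2, 2)

    def train(self, mode: bool = True):
        # A frozen model always stays in eval mode; returns self like
        # nn.Module.train does.
        return super().train(mode and not self.is_frozen)

m = FrozenAwareModule(is_frozen=True)
m.train()
assert not m.training  # still in eval mode despite the train() call
```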
- to(device, *args, **kwargs)¶
Performs Tensor dtype and/or device conversion, overwriting nn.Module.to method to set the _device attribute.
- Parameters:
device (str | torch.device | int | None) – The desired device of the tensor. Can be a string (e.g. “cuda:0”) or “cpu”.
- Return type:
Self
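A to() override that records the target device, as described above, can be sketched like this (DeviceTrackingModule is a hypothetical name; the library's actual override may differ):

```python
import torch

class DeviceTrackingModule(torch.nn.Module):
    # Hypothetical sketch of a to() override that remembers the target
    # device in a _device attribute before delegating to nn.Module.to.
    def __init__(self):
        super().__init__()
        self._device = torch.device("cpu")
        self.layer = torch.nn.Linear(2, 2)

    def to(self, device, *args, **kwargs):
        # Accepts a string (e.g. "cuda:0"), a torch.device, or an int,
        # matching the documented signature.
        self._device = torch.device(device)
        return super().to(device, *args, **kwargs)

m = DeviceTrackingModule().to("cpu")
assert m._device == torch.device("cpu")
```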