noether.core.schemas.trainers¶
Classes¶
- CheckpointConfig
- BaseTrainerConfig
Module Contents¶
- class noether.core.schemas.trainers.CheckpointConfig(/, **data)¶
Bases: pydantic.BaseModel
- Parameters:
data (Any)
- class noether.core.schemas.trainers.BaseTrainerConfig(/, **data)¶
Bases: pydantic.BaseModel
- Parameters:
data (Any)
- max_epochs: int | None = None¶
The maximum number of epochs to train for. Mutually exclusive with max_updates and max_samples. If set to 0, training will be skipped and all callbacks will be invoked once (useful for evaluation-only runs).
- max_updates: int | None = None¶
The maximum number of updates to train for. Mutually exclusive with max_epochs and max_samples. If set to 0, training will be skipped and all callbacks will be invoked once (useful for evaluation-only runs).
- max_samples: int | None = None¶
The maximum number of samples to train for. Mutually exclusive with max_epochs and max_updates. If set to 0, training will be skipped and all callbacks will be invoked once (useful for evaluation-only runs).
- add_default_callbacks: bool | None = None¶
Whether to add default callbacks. Default callbacks log things like simple dataset statistics or the current value of the learning rate if it is scheduled.
- add_trainer_callbacks: bool | None = None¶
Whether to add trainer specific callbacks (e.g., a callback to log the training accuracy for a classification task).
- effective_batch_size: int = None¶
The effective batch size used for optimization, i.e. the “global batch size”: the number of samples processed before an update step is taken. In multi-GPU setups, the per-device (“local”) batch size is effective_batch_size / number of devices. If gradient accumulation is used, the forward-pass batch size is additionally divided by the number of gradient accumulation steps (see the sketches after this class).
- precision: Literal['float32', 'fp32', 'float16', 'fp16', 'bfloat16', 'bf16'] = None¶
The precision to use for training (e.g., “float32”). Mixed precision training (e.g., “float16” or “bfloat16”) can be used to speed up training and reduce memory usage on supported hardware (e.g., NVIDIA GPUs).
- callbacks: list[noether.core.schemas.callbacks.CallbacksConfig] | None = None¶
The callbacks to use for training.
- initializer: noether.core.schemas.initializers.InitializerConfig | None = None¶
The initializer to use for training. Mainly used for resuming training via ResumeInitializer.
- track_every_n_epochs: int | None = None¶
The interval, in epochs, at which metrics are periodically tracked.
- track_every_n_updates: int | None = None¶
The interval, in updates, at which metrics are periodically tracked.
- track_every_n_samples: int | None = None¶
The interval, in samples, at which metrics are periodically tracked.
- max_batch_size: int | None = None¶
The maximum batch size to use for model forward pass in training. If the effective_batch_size is larger than max_batch_size, gradient accumulation will be used to simulate the larger batch size. For example, if effective_batch_size=8 and max_batch_size=2, 4 gradient accumulation steps will be taken before each optimizer step.
- skip_nan_loss: bool = None¶
Whether to skip updates whose loss is NaN. These can occasionally occur due to unlucky numerical coincidences. If true, NaN losses are skipped without terminating training, up to 100 consecutive NaN losses.
- disable_gradient_accumulation: bool = None¶
Whether to disable gradient accumulation. Gradient accumulation is sometimes used to simulate larger batch sizes, but can lead to worse generalization.
- use_torch_compile: bool = None¶
Whether to use torch.compile to compile the model for faster training.
- forward_properties: list[str] | None = []¶
Properties (i.e., keys of the input batch dict) that are passed to the model as inputs during the forward pass.
- target_properties: list[str] | None = []¶
Properties (i.e., keys of the input batch dict) that are used as targets for the model during the forward pass.
- model_config¶
Configuration for the Pydantic model; should be a dictionary conforming to pydantic.ConfigDict.
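To make the fields above concrete, here is a minimal sketch of constructing a BaseTrainerConfig directly as a Pydantic model. It only uses fields documented on this page; the import path follows the module name above, the chosen values are illustrative, and additional validators in your installed version may impose further constraints.

```python
# Illustrative sketch only: field names follow the documentation above, but the
# import path and the chosen values are assumptions, not a verified recipe.
from noether.core.schemas.trainers import BaseTrainerConfig

config = BaseTrainerConfig(
    max_epochs=10,             # training budget; max_updates/max_samples stay unset
    effective_batch_size=256,  # "global" batch size per optimizer step
    max_batch_size=64,         # forward-pass cap; on one device this implies 4 accumulation steps
    precision="bf16",          # mixed precision on supported hardware
    skip_nan_loss=True,        # tolerate occasional NaN losses
    use_torch_compile=False,
)

# BaseTrainerConfig is a pydantic.BaseModel, so the usual model helpers apply.
print(config.model_dump(exclude_none=True))
```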
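The batch-size relationships described under effective_batch_size and max_batch_size can also be written out as plain arithmetic. The helper below is hypothetical and only mirrors the documented relationships; it is not the library's internal implementation.

```python
import math

def derive_batch_sizes(effective_batch_size: int, num_devices: int, max_batch_size: int | None):
    """Hypothetical helper illustrating the documented batch-size relationships."""
    # Local ("per-device") batch size: the global batch is split across devices.
    local_batch_size = effective_batch_size // num_devices
    # If the local batch exceeds the forward-pass cap, gradient accumulation simulates it.
    if max_batch_size is not None and local_batch_size > max_batch_size:
        accumulation_steps = math.ceil(local_batch_size / max_batch_size)
    else:
        accumulation_steps = 1
    forward_batch_size = local_batch_size // accumulation_steps
    return local_batch_size, forward_batch_size, accumulation_steps

# Example from the max_batch_size description: effective_batch_size=8, max_batch_size=2,
# single device -> 4 accumulation steps with a forward-pass batch size of 2.
print(derive_batch_sizes(8, num_devices=1, max_batch_size=2))     # (8, 2, 4)
print(derive_batch_sizes(256, num_devices=4, max_batch_size=32))  # (64, 32, 2)
```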