noether.core.schemas.optimizers

Attributes

AnyOptimizerConfig

Classes

ParamGroupModifierConfig

Configuration for a parameter group modifier, used by both the LrScaleByNameModifier and the WeightDecayByNameModifier.

MuonSecondaryOptimizerConfig

Configuration of the secondary optimizer in MuonComposite.

OptimizerConfig

Base configuration for optimizers.

AdamOptimizerConfig

Configuration for Adam-family optimizers (AdamW, Lion).

SGDOptimizerConfig

Configuration for SGD.

MuonOptimizerConfig

Configuration for MuonComposite.

Module Contents

class noether.core.schemas.optimizers.ParamGroupModifierConfig(/, **data)

Bases: pydantic.BaseModel

Configuration for a parameter group modifier, used by both the LrScaleByNameModifier and the WeightDecayByNameModifier.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

kind: str | None = None

The class path of the parameter group modifier. Either noether.core.optimizer.param_group_modifiers.LrScaleByNameModifier or noether.core.optimizer.param_group_modifiers.WeightDecayByNameModifier.

scale: float | None = None

The scaling factor for the learning rate. Must be greater than 0.0. Only for the LrScaleByNameModifier.

value: float | None = None

The weight decay value. With 0.0 the parameter is excluded from the weight decay. Only for the WeightDecayByNameModifier.

name: str

The name of the parameter within the model. E.g., ‘backbone.cls_token’.

check_scale_or_value_exclusive()

Validates that either ‘scale’ or ‘value’ is provided, but not both. This is a model-level validator that runs after individual field validation.

Return type:

Self
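
A minimal sketch of constructing both modifier variants, assuming the class paths documented above; ‘backbone.cls_token’ comes from the field documentation, while ‘backbone.pos_embed’ is a hypothetical parameter name:

```python
from noether.core.schemas.optimizers import ParamGroupModifierConfig

# Scale the class token's learning rate. 'scale' is only meaningful for the
# LrScaleByNameModifier; passing both 'scale' and 'value' fails the
# check_scale_or_value_exclusive model validator.
cls_token_lr = ParamGroupModifierConfig(
    kind="noether.core.optimizer.param_group_modifiers.LrScaleByNameModifier",
    name="backbone.cls_token",
    scale=0.1,
)

# Exclude a parameter from weight decay by setting its value to 0.0.
# 'value' is only meaningful for the WeightDecayByNameModifier.
no_decay = ParamGroupModifierConfig(
    kind="noether.core.optimizer.param_group_modifiers.WeightDecayByNameModifier",
    name="backbone.pos_embed",
    value=0.0,
)
```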

class noether.core.schemas.optimizers.MuonSecondaryOptimizerConfig(/, **data)

Bases: pydantic.BaseModel

Configuration of the secondary optimizer in MuonComposite.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

model_config

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

kind: str | None = None

The class path of the torch optimizer to use. E.g., ‘torch.optim.AdamW’ or ‘noether.core.optimizer.Lion’.

lr: float | None = None

The learning rate for the optimizer. Falls back to the primary lr if not set.

weight_decay: float | None = None

The weight decay. Falls back to the primary weight_decay if not set.

momentum: float | None = None

Momentum factor for optimizers like SGD.

betas: tuple[float, float] | None = None

Beta coefficients for Adam-style optimizers.
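
A sketch of a secondary optimizer configuration; the choice of ‘torch.optim.AdamW’ and the hyperparameter values here are illustrative assumptions, not defaults:

```python
from noether.core.schemas.optimizers import MuonSecondaryOptimizerConfig

secondary = MuonSecondaryOptimizerConfig(
    kind="torch.optim.AdamW",
    lr=3e-4,            # omit to fall back to the primary lr
    betas=(0.9, 0.95),  # Adam-style betas; momentum would apply to SGD-like kinds
)
```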

class noether.core.schemas.optimizers.OptimizerConfig(/, **data)

Bases: pydantic.BaseModel

Base configuration for optimizers.

Holds fields common to all optimizers plus the wrapper-level options. Optimizer-specific fields live on the dedicated subclasses.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

model_config

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

kind: str | None = None

The class path of the torch optimizer to use. E.g., ‘torch.optim.AdamW’.

lr: float | None = None

The learning rate for the optimizer.

weight_decay: float | None = None

The weight decay.

clip_grad_value: float | None = None

The maximum value for gradient clipping.

clip_grad_norm: float | None = None

The maximum norm for gradient clipping.

param_group_modifiers_config: list[ParamGroupModifierConfig] | None = None

List of parameter group modifiers to apply. These can modify the learning rate or weight decay for specific parameters.

exclude_bias_from_weight_decay: bool = True

If true, excludes the bias parameters (i.e., parameters that end with ‘.bias’) from the weight decay. Default true.

exclude_normalization_params_from_weight_decay: bool = True

If true, excludes the weights of normalization layers from the weight decay. This is implemented by excluding all 1D tensors from the weight decay. Default true.

weight_decay_schedule: noether.core.schemas.schedules.AnyScheduleConfig | None = None

schedule_config: noether.core.schemas.schedules.AnyScheduleConfig | None = None

return_optim_wrapper_args()

Return type:

dict
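
A sketch of the wrapper-level options, assuming the base class can be instantiated directly; the field names are as documented above, but the exact contents of the returned dict are not specified here:

```python
from noether.core.schemas.optimizers import OptimizerConfig

config = OptimizerConfig(
    kind="torch.optim.AdamW",
    lr=1e-3,
    weight_decay=0.05,
    clip_grad_norm=1.0,  # clip by norm; clip_grad_value instead clips by value
    exclude_bias_from_weight_decay=True,
)

# Wrapper-level arguments (e.g., clipping and schedule options) are
# collected separately from the fields passed to the torch optimizer.
wrapper_args = config.return_optim_wrapper_args()
```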

class noether.core.schemas.optimizers.AdamOptimizerConfig(/, **data)

Bases: OptimizerConfig

Configuration for Adam-family optimizers (AdamW, Lion).

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

kind: Literal['torch.optim.AdamW', 'noether.core.optimizer.Lion'] = 'torch.optim.AdamW'

The class path of the torch optimizer to use. Either ‘torch.optim.AdamW’ or ‘noether.core.optimizer.Lion’.

betas: tuple[float, float] | None = None

Beta coefficients for Adam-style optimizers.
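
A sketch of an Adam-family configuration with illustrative hyperparameters; swapping kind to ‘noether.core.optimizer.Lion’ selects Lion instead:

```python
from noether.core.schemas.optimizers import AdamOptimizerConfig

adamw = AdamOptimizerConfig(
    kind="torch.optim.AdamW",  # the default; Lion is the only other allowed kind
    lr=1e-3,
    weight_decay=0.05,
    betas=(0.9, 0.999),
)
```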

class noether.core.schemas.optimizers.SGDOptimizerConfig(/, **data)

Bases: OptimizerConfig

Configuration for SGD.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

kind: Literal['torch.optim.SGD'] = 'torch.optim.SGD'

The class path of the torch optimizer to use. Fixed to ‘torch.optim.SGD’.

momentum: float | None = None

Momentum factor.
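
A sketch with illustrative hyperparameters; kind defaults to (and only accepts) ‘torch.optim.SGD’, so it can be omitted:

```python
from noether.core.schemas.optimizers import SGDOptimizerConfig

sgd = SGDOptimizerConfig(
    lr=0.1,
    momentum=0.9,
    weight_decay=1e-4,
)
```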

class noether.core.schemas.optimizers.MuonOptimizerConfig(/, **data)

Bases: OptimizerConfig

Configuration for MuonComposite.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

kind: Literal['noether.core.optimizer.MuonComposite'] = 'noether.core.optimizer.MuonComposite'

The class path of the optimizer to use. Fixed to ‘noether.core.optimizer.MuonComposite’.

momentum: float | None = None

Momentum factor for the Muon optimizer.

secondary: MuonSecondaryOptimizerConfig | None = None

Configuration of the secondary optimizer in MuonComposite.

nesterov: bool | None = None

Enable Nesterov momentum in Muon. None uses Muon’s default (True).

ns_steps: int | None = None

Number of Newton-Schulz iteration steps. None uses Muon’s default (5).

adjust_lr_fn: Literal['original', 'match_rms_adamw'] | None = None

Per-matrix LR adjustment strategy. None uses Muon’s default ("original").
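
A sketch of a full MuonComposite configuration; the hyperparameter values and the choice of ‘torch.optim.AdamW’ as the secondary kind are illustrative assumptions:

```python
from noether.core.schemas.optimizers import (
    MuonOptimizerConfig,
    MuonSecondaryOptimizerConfig,
)

muon = MuonOptimizerConfig(
    lr=0.02,
    momentum=0.95,
    nesterov=None,   # None keeps Muon's default (True)
    ns_steps=None,   # None keeps Muon's default (5 Newton-Schulz steps)
    adjust_lr_fn="match_rms_adamw",
    # Secondary optimizer for the parameters Muon does not update itself;
    # which parameters those are is not specified by this schema.
    secondary=MuonSecondaryOptimizerConfig(kind="torch.optim.AdamW", lr=3e-4),
)
```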

noether.core.schemas.optimizers.AnyOptimizerConfig
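
Presumably a type alias for the union of the concrete optimizer configurations above (AdamOptimizerConfig, SGDOptimizerConfig, MuonOptimizerConfig), discriminated by their Literal kind fields, mirroring AnyScheduleConfig in noether.core.schemas.schedules.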