noether.core.optimizer.param_group_modifiers¶
Classes¶
- ParamGroupModifierBase: Generic implementation to change properties of optimizer parameter groups.
- LrScaleByNameModifier: Scales the learning rate of a certain parameter.
- WeightDecayByNameModifier: Changes the weight decay value for a single parameter.
Package Contents¶
- class noether.core.optimizer.param_group_modifiers.ParamGroupModifierBase¶
Generic implementation to change properties of optimizer parameter groups.
- abstractmethod get_properties(model, name, param)¶
Returns the modified properties for a given model parameter. This method is called with all items of model.named_parameters() to compose the parameter groups for the whole model.
- Parameters:
model (torch.nn.Module) – Model from which the parameter originates. Used to extract properties (e.g., the number of layers for a layerwise learning rate decay).
name (str) – Name of the parameter as stored inside the model.
param (torch.Tensor) – The parameter tensor.
- Return type:
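To illustrate how `get_properties` is meant to be used, here is a minimal, dependency-free sketch of the pattern: a subclass returns per-parameter property overrides, and a composition loop merges them over all named parameters. The `compose_param_groups` helper, the `FreezeBiasModifier` subclass, and the `"lr"` key are assumptions for this example, not part of the library.

```python
from abc import ABC, abstractmethod

class ParamGroupModifierSketch(ABC):
    """Illustrative stand-in for ParamGroupModifierBase."""

    @abstractmethod
    def get_properties(self, model, name, param):
        """Return a dict of property overrides for one parameter."""

class FreezeBiasModifier(ParamGroupModifierSketch):
    # Example subclass: give every bias parameter a learning rate of 0.
    def get_properties(self, model, name, param):
        return {"lr": 0.0} if name.endswith("bias") else {}

def compose_param_groups(named_parameters, modifiers, base_lr=1e-3):
    # Hypothetical composition loop: call every modifier with each item of
    # model.named_parameters() and merge the returned property overrides.
    groups = []
    for name, param in named_parameters:
        props = {"lr": base_lr}
        for modifier in modifiers:
            props.update(modifier.get_properties(None, name, param))
        groups.append({"params": [param], "name": name, **props})
    return groups
```

In the real library the composition logic lives inside the optimizer setup; the sketch only shows why `get_properties` receives the model, the parameter name, and the tensor.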
- class noether.core.optimizer.param_group_modifiers.LrScaleByNameModifier(param_group_modifier_config)¶
Bases: noether.core.optimizer.param_group_modifiers.base.ParamGroupModifierBase
Scales the learning rate of a certain parameter.
- Parameters:
param_group_modifier_config (noether.core.schemas.optimizers.ParamGroupModifierConfig)
- scale¶
- name¶
- param_was_found = False¶
- get_properties(model, name, param)¶
This method is called with all items of model.named_parameters() to compose the parameter groups for the whole model. If the desired parameter name is found, it returns a modifier that scales down the learning rate.
- Parameters:
model (torch.nn.Module) – Model from which the parameter originates. Used to extract properties (e.g., the number of layers for a layerwise learning rate decay).
name (str) – Name of the parameter as stored inside the model.
param (torch.Tensor) – The parameter tensor.
- Return type:
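The match-by-name behavior described above can be sketched as follows. The `FakeModifierConfig` dataclass and the `"lr_scale"` key are assumptions; the real `ParamGroupModifierConfig` schema lives in noether.core.schemas.optimizers and may differ.

```python
from dataclasses import dataclass

@dataclass
class FakeModifierConfig:
    # Hypothetical stand-in for ParamGroupModifierConfig.
    name: str
    scale: float

class LrScaleByNameSketch:
    """Mirrors the documented behavior: when the parameter name matches,
    return a learning-rate scale and remember the match."""

    def __init__(self, config):
        self.name = config.name
        self.scale = config.scale
        self.param_was_found = False

    def get_properties(self, model, name, param):
        if name == self.name:
            self.param_was_found = True
            return {"lr_scale": self.scale}  # assumed property key
        return {}
```

A typical use would be scaling down the learning rate of a pretrained positional embedding while the rest of the model trains at the base rate.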
- class noether.core.optimizer.param_group_modifiers.WeightDecayByNameModifier(param_group_modifier_config)¶
Bases: noether.core.optimizer.param_group_modifiers.base.ParamGroupModifierBase
Changes the weight decay value for a single parameter. Use-cases:
- ViT: exclude CLS token parameters
- Transformer learned positional embeddings
- Learnable query tokens for cross attention (“PerceiverPooling”)
- Parameters:
param_group_modifier_config (noether.core.schemas.optimizers.ParamGroupModifierConfig)
- name¶
- value¶
- param_was_found = False¶
- get_properties(model, name, param)¶
This method is called with all items of model.named_parameters() to compose the parameter groups for the whole model. If the desired parameter name is found, it returns a modifier that sets the weight decay.
- Parameters:
model (torch.nn.Module) – Model from which the parameter originates. Used to extract properties (e.g., the number of layers for a layerwise learning rate decay).
name (str) – Name of the parameter as stored inside the model.
param (torch.Tensor) – The parameter tensor.
- Return type:
- was_applied_successfully()¶
Check if the parameter was found within the model.
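The `param_was_found` / `was_applied_successfully` pairing described for this class can be sketched like this; the constructor signature and the `"weight_decay"` key are assumptions for illustration, not the library's actual API.

```python
class WeightDecayByNameSketch:
    """Hypothetical sketch mirroring WeightDecayByNameModifier's documented
    behavior: set the weight decay for one named parameter and track whether
    that parameter was ever seen."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.param_was_found = False

    def get_properties(self, model, name, param):
        if name == self.name:
            self.param_was_found = True
            return {"weight_decay": self.value}  # assumed property key
        return {}

    def was_applied_successfully(self):
        # Sanity check after composing the groups: a typo in the configured
        # parameter name would otherwise fail silently.
        return self.param_was_found
```

Checking `was_applied_successfully()` after building the parameter groups is the natural place to catch a misspelled parameter name, e.g. `cls_token` vs. `cls_tokens`.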