noether.core.schemas.schema

Attributes

Classes

ConfigSchema

Root configuration schema for all experiments in Noether.

Functions

master_port_from_env()

Gets the master port from the environment variable if available.

default_accelerator()

Sets the accelerator if it is not already set.

Module Contents

noether.core.schemas.schema.ACCELERATOR_TYPES
noether.core.schemas.schema.master_port_from_env()

Gets the master port from the environment variable if available.

Return type:

int
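The description above suggests a helper along these lines. This is a minimal sketch, not the actual implementation; the 29500 fallback is an assumption (a common PyTorch default), not a value taken from the source.

```python
import os

# Hypothetical fallback port; the real default is not documented here.
DEFAULT_MASTER_PORT = 29500

def master_port_from_env(default: int = DEFAULT_MASTER_PORT) -> int:
    """Return MASTER_PORT from the environment if set, else a fallback."""
    value = os.environ.get("MASTER_PORT")
    if value is not None:
        return int(value)
    return default
```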

noether.core.schemas.schema.default_accelerator()

Sets the accelerator if it is not already set.

Return type:

ACCELERATOR_TYPES
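The selection order (GPU > MPS > CPU) can be sketched as follows. The availability checks are placeholders for real framework calls such as torch.cuda.is_available(); this sketch does not assume them.

```python
from __future__ import annotations

def _gpu_available() -> bool:
    # Placeholder for a real check, e.g. torch.cuda.is_available().
    return False

def _mps_available() -> bool:
    # Placeholder for a real check, e.g. torch.backends.mps.is_available().
    return False

def default_accelerator(requested: str | None = None) -> str:
    """Return the requested accelerator, or the best available one
    in the order GPU > MPS > CPU when none is set."""
    if requested is not None:
        return requested
    if _gpu_available():
        return "gpu"
    if _mps_available():
        return "mps"
    return "cpu"
```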

class noether.core.schemas.schema.ConfigSchema(/, **data)

Bases: pydantic.BaseModel

Root configuration schema for all experiments in Noether.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

name: str | None = None

Name of the experiment.

accelerator: ACCELERATOR_TYPES = None

Type of accelerator to use. By default the system chooses the best available accelerator, in the order GPU > MPS > CPU.

stage_name: str | None = None

Name of the current stage, e.g., train, finetune, or test. When None, the run_id directory is used as the output directory; otherwise, run_id/stage_name is used.

dataset_kind: str | None = None

Kind of dataset to use, i.e., its class path.

dataset_root: str | None = None

Root directory of the dataset.

resume_run_id: str | None = None

Run ID to resume from. If None, start a new run. This can be used to resume training from the last checkpoint of a previous run when training was interrupted/failed.

resume_stage_name: str | None = None

Stage name to resume from. If None, resume from the default stage.

resume_checkpoint: str | None = None

Path to checkpoint to resume from. If None, the ‘latest’ checkpoint will be used.

seed: int = None

Random seed for reproducibility.

dataset_statistics: dict[str, list[float | int]] | None = None

Pre-computed dataset statistics, e.g., mean and std for normalization. Since some tensors are multi-dimensional, the statistics are stored as lists.

dataset_normalizer: dict[str, list[noether.core.schemas.normalizers.AnyNormalizer]] | None = None

List of normalizers to apply to the dataset. The key is the data source name.

tracker: noether.core.schemas.trackers.AnyTracker | None = None

Configuration for experiment tracking. If None, no tracking is used. If “disabled”, tracking is explicitly disabled. WandB is currently the only supported tracker.

run_id: str | None = None

Unique identifier for the run. If None, a new ID will be generated.

devices: str | None = None

Comma-separated list of device IDs to use. If None, all available devices will be used.
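A parser for this field might look like the following sketch (parse_devices is a hypothetical helper name, not from the source):

```python
from __future__ import annotations

def parse_devices(devices: str | None) -> list[int] | None:
    """Parse a comma-separated device string such as "0,1,3" into IDs.

    None means "use all available devices" and is passed through.
    """
    if devices is None:
        return None
    return [int(part) for part in devices.split(",") if part.strip()]
```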

num_workers: int | None = None

Number of worker threads for data loading. If None, (#CPUs / #GPUs - 1) workers will be used.
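One plausible reading of that heuristic, as a sketch (default_num_workers is a hypothetical name; the integer division and the clamp to at least one worker are assumptions):

```python
import os

def default_num_workers(num_gpus: int) -> int:
    """Heuristic from the field description: (#CPUs / #GPUs - 1) workers,
    read as integer division, clamped to at least 1 (an assumption)."""
    num_cpus = os.cpu_count() or 1
    return max(1, num_cpus // max(1, num_gpus) - 1)
```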

cudnn_benchmark: bool = True

Whether to enable cudnn benchmark mode for this run.

cudnn_deterministic: bool = False

Whether to enable cudnn deterministic mode for this run.

datasets: dict[str, noether.core.schemas.dataset.DatasetBaseConfig] = None

Configuration for datasets. Each key identifies a dataset and its value is the configuration for that dataset; see DatasetBaseConfig for available options. The key "train" is reserved for the training dataset; if it is not provided, the first dataset is used as the training dataset by default. Other keys are arbitrary and can identify datasets for different stages (e.g., "train", "val", "test") or different datasets for the same stage (e.g., "train_cfd", "train_wind_turbine").
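The "train"-key fallback described above can be sketched like this (resolve_train_dataset is a hypothetical helper, and plain values stand in for DatasetBaseConfig instances):

```python
def resolve_train_dataset(datasets: dict):
    """Return the training dataset config: the reserved "train" key if
    present, otherwise the first configured dataset."""
    if "train" in datasets:
        return datasets["train"]
    # Fall back to the first entry (dicts preserve insertion order).
    return next(iter(datasets.values()))
```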

model: noether.core.schemas.models.ModelBaseConfig = None

Configuration for the model. See ModelBaseConfig for available options.

trainer: noether.core.schemas.trainers.BaseTrainerConfig = None

Configuration for the trainer. See BaseTrainerConfig for available options.

debug: bool = False

If True, enables debug mode: more verbose logging, no WandB logging, and output written to a debug directory.

store_code_in_output: bool = True

If True, store a copy of the current code in the output directory for reproducibility.

output_path: pathlib.Path

Path to output directory.

master_port: int = None

Port for distributed master node. If None, will be set from environment variable MASTER_PORT if available.

slurm: noether.core.schemas.slurm.SlurmConfig | None = None

Configuration for SLURM job submission.

classmethod empty_dict_is_none(v)

Pre-processes tracker input before validation.

Parameters:

v (Any)

Return type:

Any
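Its likely behavior, sketched as a plain function (the actual method is a pydantic pre-validator on ConfigSchema; mapping an empty dict to None is inferred from the method name and is an assumption):

```python
from typing import Any

def empty_dict_is_none(v: Any) -> Any:
    """Map an empty dict (e.g. from an empty YAML mapping) to None so the
    tracker field validates as "no tracking"; pass everything else through."""
    if isinstance(v, dict) and not v:
        return None
    return v
```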

classmethod validate_output_path(value)

Validates that the output path is valid.

Parameters:

value (pathlib.Path)

Return type:

pathlib.Path

serialize_output_path(value)

Serializes the output path.

Parameters:

value (Any)

Return type:

Any

classmethod get_env_master_port(value)

Sets master_port from environment variable if available.

Parameters:

value (Any)

Return type:

Any

property config_schema_kind: str

The fully qualified import path for the configuration class.

Return type:

str
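A property like this typically derives the path from the class itself; a minimal free-function sketch of the idea:

```python
def config_schema_kind(obj: object) -> str:
    """Fully qualified import path of obj's class, e.g.
    "noether.core.schemas.schema.ConfigSchema" for a ConfigSchema."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"
```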