noether.core.schemas.slurm¶
Classes¶
SlurmConfig: Configuration for SLURM job submission via submitit.
Module Contents¶
- class noether.core.schemas.slurm.SlurmConfig(/, **data)¶
Bases: pydantic.BaseModel
Configuration for SLURM job submission via submitit.
Field names mirror the keyword arguments accepted by submitit.AutoExecutor.update_parameters(). All fields are optional. Apart from folder, they default to None, meaning the cluster default is used.
Note
Job stdout/stderr is owned by submitit and written to <folder>/<job_id>_log.out and <folder>/<job_id>_log.err. Use the folder field to control where these files land. SLURM --output/--error directives are intentionally not exposed; pass them via slurm_additional_parameters if you really need to override submitit’s defaults (this disables the job.stdout() helpers).
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- model_config¶
Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.
- folder: str = 'submitit_logs'¶
Directory where submitit writes the job script, pickled task, and stdout/stderr logs. Per-job files are named <job_id>_log.out etc. inside this directory. This is also used as the default output_path for training runs (see ConfigSchema.output_path).
Supports %u (current username) interpolation, e.g. /home/%u/logs/experiment. SLURM job-time patterns like %j are not supported because submitit needs the directory to exist before submission.
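The %u rule described for folder can be sketched as follows. This is a minimal illustration, not the library's implementation: expand_folder and its username argument are hypothetical names, and rejecting %j with a ValueError is an assumption about how an unsupported pattern might be handled.

```python
import getpass
from pathlib import PurePosixPath
from typing import Optional


def expand_folder(folder: str, username: Optional[str] = None) -> PurePosixPath:
    """Replace %u with a username (hypothetical helper, not part of noether)."""
    # Assumption: job-time patterns are rejected up front, because the
    # directory has to exist before the job is submitted.
    if "%j" in folder:
        raise ValueError("%j is not supported in folder")
    user = username if username is not None else getpass.getuser()
    return PurePosixPath(folder.replace("%u", user))


print(expand_folder("/home/%u/logs/experiment", username="alice"))
# prints /home/alice/logs/experiment
```

Passing the username explicitly keeps the sketch deterministic; when it is omitted, the current login name is used.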
- gpus_per_node: int | str | None = None¶
GPUs per node. Accepts a count or type:count (e.g. "a100:4").
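The two accepted forms of gpus_per_node can be told apart as in the sketch below. split_gpu_spec is a hypothetical helper written for illustration; the source does not say whether the library parses this value or passes it through unchanged.

```python
from typing import Optional, Tuple, Union


def split_gpu_spec(spec: Union[int, str]) -> Tuple[Optional[str], int]:
    """Split a GPU spec into (gpu_type, count); gpu_type is None for a bare count."""
    if isinstance(spec, int):
        return None, spec
    # Split on the last colon so that "a100:4" becomes ("a100", "4").
    gpu_type, sep, count = spec.rpartition(":")
    if not sep:
        # No colon present: the string is just a count such as "2".
        return None, int(spec)
    return gpu_type, int(count)


print(split_gpu_spec("a100:4"))  # prints ('a100', 4)
print(split_gpu_spec(4))         # prints (None, 4)
```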
- slurm_array_parallelism: int | None = None¶
Maximum number of array tasks running concurrently (SLURM %N in --array).
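To make the %N suffix concrete, the sketch below builds the value of an sbatch --array option for a zero-based range of tasks. array_spec and num_tasks are hypothetical names; submitit assembles the real directive itself.

```python
from typing import Optional


def array_spec(num_tasks: int, parallelism: Optional[int] = None) -> str:
    """Build an --array value such as "0-9%4" (illustrative helper)."""
    if num_tasks < 1:
        raise ValueError("num_tasks must be at least 1")
    spec = f"0-{num_tasks - 1}"
    if parallelism is not None:
        # %N caps how many array tasks run at the same time.
        spec += f"%{parallelism}"
    return spec


print(array_spec(10, parallelism=4))  # prints 0-9%4
print(array_spec(10))                 # prints 0-9
```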
- slurm_setup: list[str] | None = None¶
Shell commands run inside the job before the main command, e.g.
["source .venv/bin/activate"].
- slurm_additional_parameters: dict[str, Any] | None = None¶
Escape hatch for SLURM directives not exposed as first-class fields, e.g. {"nice": 0, "reservation": "my_res", "chdir": "/work"}. Keys are passed as --key=value to sbatch.
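The key-to-flag mapping described for slurm_additional_parameters can be pictured as below. to_sbatch_flags is a hypothetical helper that only illustrates the --key=value convention; it is not how submitit renders the directives internally.

```python
from typing import Any, Dict, List


def to_sbatch_flags(params: Dict[str, Any]) -> List[str]:
    """Render each entry as a --key=value flag, preserving insertion order."""
    return [f"--{key}={value}" for key, value in params.items()]


flags = to_sbatch_flags({"nice": 0, "reservation": "my_res", "chdir": "/work"})
print(flags)
# prints ['--nice=0', '--reservation=my_res', '--chdir=/work']
```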
- to_executor_kwargs()¶
Return (folder, update_parameters_kwargs) for submitit.AutoExecutor.
Generic fields are passed under their bare name; everything else keeps its slurm_ prefix so submitit routes it to the slurm executor.
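The shape of the return value can be sketched with a stand-in class. SlurmConfigSketch is hypothetical: it is a plain dataclass carrying only a few of the documented fields, and dropping fields that are still None is an assumption based on the statement that None means the cluster default is used.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional, Tuple, Union


@dataclass
class SlurmConfigSketch:
    """Hypothetical stand-in for SlurmConfig with a subset of its fields."""

    folder: str = "submitit_logs"
    gpus_per_node: Union[int, str, None] = None
    slurm_array_parallelism: Optional[int] = None
    slurm_additional_parameters: Optional[Dict[str, Any]] = None

    def to_executor_kwargs(self) -> Tuple[str, Dict[str, Any]]:
        fields = asdict(self)
        # folder is returned separately from the update_parameters kwargs.
        folder = fields.pop("folder")
        # Assumption: unset (None) fields are omitted so the cluster default applies.
        # Names are passed through unchanged: generic fields are bare,
        # SLURM-specific fields keep their slurm_ prefix.
        kwargs = {name: value for name, value in fields.items() if value is not None}
        return folder, kwargs


config = SlurmConfigSketch(gpus_per_node=4, slurm_array_parallelism=8)
folder, kwargs = config.to_executor_kwargs()
print(folder)  # prints submitit_logs
print(kwargs)  # prints {'gpus_per_node': 4, 'slurm_array_parallelism': 8}
```

With submitit installed, the pair would typically be consumed as submitit.AutoExecutor(folder=folder) followed by executor.update_parameters(**kwargs).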