noether.inference
Submodules
Classes
Run | Handle to a trained run.
InferenceRunner | Runs an inference experiment using @hydra.main as entry point.
Functions
evaluate | Run evaluation against a training run directory.
Package Contents
- noether.inference.evaluate(run_dir, *, resume_checkpoint='latest', stage_name=None, callbacks=None, config_overrides=None, device='cuda', disable_tracker=False)
Run evaluation against a training run directory.
Programmatic equivalent of:
noether-eval run_dir=<run_dir> resume_checkpoint=<...> ...
Loads <run_dir>/hp_resolved.yaml via Hyperparameters.load_resolved(), wires the resume_* fields so checkpoints are read from the training run, optionally replaces the trainer callback list, and dispatches through InferenceRunner.main() (single-process, no Hydra/CLI involvement).
- Parameters:
run_dir (str | pathlib.Path) – Training run output directory, i.e. the one that contains hp_resolved.yaml. Typically <output_path>/<run_id>[/<stage_name>].
resume_checkpoint (str) – Checkpoint tag to load. Examples: "latest", "best_model.<metric>", "E100" (epoch 100), "U2500" (update 2500), "S40000" (sample 40000).
stage_name (str | None) – Optional sub-stage name for this eval run’s outputs. Logs / wandb / saved metrics land under <run_dir>/<stage_name>/, separate from the training outputs. Leave None to write alongside the training run.
callbacks (list[noether.core.schemas.callbacks.CallBackBaseConfig] | None) – If provided, replaces config.trainer.callbacks for the eval run. Pass the exact callbacks that should execute (e.g. a single sampling/rollout callback); nothing from the training config’s callback list is kept.
config_overrides (collections.abc.Callable[[noether.core.schemas.schema.ConfigSchema], None] | None) – Optional callable invoked on the loaded ConfigSchema after resume / callbacks wiring and before dispatch. Use it to toggle dataset settings that the training config baked out.
device (str) – Device string passed to the trainer (default "cuda"). For multi-GPU eval use the noether-eval CLI; this function is single-process.
disable_tracker (bool) – If True, drop the saved tracker config so eval doesn’t create a new wandb run.
- Raises:
FileNotFoundError – If run_dir doesn’t contain hp_resolved.yaml.
- Return type:
None
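The checkpoint tags listed for resume_checkpoint follow a small naming scheme. As a rough illustration, the sketch below classifies a tag using only the forms documented above. `describe_checkpoint_tag` is a hypothetical helper, not part of noether, and noether's own tag handling may accept forms that this sketch rejects.

```python
import re

# Prefix letters taken from the documented examples: E=epoch, U=update, S=sample.
_UNITS = {"E": "epoch", "U": "update", "S": "sample"}
_COUNTER_TAG = re.compile(r"([EUS])(\d+)")


def describe_checkpoint_tag(tag):
    """Split a tag into (kind, detail). Hypothetical helper, not noether API."""
    if tag == "latest":
        return ("latest", None)
    if tag.startswith("best_model."):
        # Everything after the prefix is the metric name.
        return ("best_model", tag[len("best_model."):])
    match = _COUNTER_TAG.fullmatch(tag)
    if match:
        return (_UNITS[match.group(1)], int(match.group(2)))
    raise ValueError(f"unrecognized checkpoint tag: {tag!r}")


for tag in ["latest", "best_model.loss.test.total", "E100", "U2500", "S40000"]:
    print(tag, "->", describe_checkpoint_tag(tag))
```

A check like this can catch a mistyped tag before a long evaluation sweep starts, under the assumption that only the documented tag forms are in use.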
Example:
from noether.inference import evaluate
from my_recipe.callbacks import SamplingCallbackConfig

for steps in [1, 2, 4, 8, 16]:
    evaluate(
        run_dir="outputs/abupt_diffusion/30035_2026-05-11_spk1e",
        resume_checkpoint="best_model.loss.test.total",
        stage_name=f"eval_steps{steps:02d}",
        callbacks=[SamplingCallbackConfig(every_n_epochs=1, sampling_steps=steps)],
    )
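The config_overrides hook can be sketched as a small factory that returns the kind of callable evaluate() expects (it receives the loaded config and returns nothing). This is a minimal illustration, not part of noether: `make_dataset_override`, the `root` field, and both paths are hypothetical, and a SimpleNamespace stands in for the loaded ConfigSchema so the snippet runs without a training run.

```python
from types import SimpleNamespace


def make_dataset_override(split, **fields):
    """Return a callable that sets `fields` on config.datasets[split].

    Hypothetical helper; the field names passed in must exist on the
    real dataset config for the override to be meaningful.
    """
    def override(config):
        dataset_cfg = config.datasets[split]
        for name, value in fields.items():
            setattr(dataset_cfg, name, value)

    return override


# Stand-in for the loaded ConfigSchema. Assumption: `datasets` maps a split
# name to a dataset config object, as config.datasets[split] suggests.
fake_config = SimpleNamespace(datasets={"test": SimpleNamespace(root="/old/data")})

override = make_dataset_override("test", root="/new/data")
override(fake_config)
print(fake_config.datasets["test"].root)
```

With a real run, the same callable would be passed as `config_overrides=override` to evaluate(), optionally together with `device="cpu"` and `disable_tracker=True` for a quick local check.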
- class noether.inference.Run(run_dir)
Handle to a trained run.
Two construction modes, picked by which constructor you use:
- Run(run_dir): full run directory. Reads hp_resolved.yaml and validates it against ConfigSchema. All accessors below are available.
- Run.from_checkpoint(path): just a single ..._model.th file. Reads the embedded model config and normalizer payload. model() and normalizers() work; dataset(), config, and statistics raise.
Mutate config between construction and the lazy methods to override training-time values (typically dataset roots when the run was produced on a different machine). Only meaningful in run-dir mode.
- Parameters:
run_dir (pathlib.Path | str) – Path to the training run output directory (the one that contains hp_resolved.yaml and a checkpoints/ subdirectory). Typically output_path/run_id or output_path/run_id/stage_name.
- run_dir
Resolved absolute path to the run directory in run-dir mode; None in checkpoint-only mode.
- checkpoint_path
Resolved absolute path to the .th file in checkpoint-only mode; None in run-dir mode.
- Raises:
FileNotFoundError – If run_dir does not exist or doesn’t contain hp_resolved.yaml.
Example
import torch

from noether.inference import Run

# Bring-your-own-data flow: apply the trained model to a custom input dict,
# then denormalize the predictions.
run = Run.from_checkpoint("/outputs/.../ab_upt_cp=last_model.th")
model = run.model(device="cuda")
norms = run.normalizers()

# my_inputs: dict of model inputs prepared by the caller (not defined here).
with torch.inference_mode():
    pred = model(**my_inputs)

pred_phys = norms["surface_pressure"].inverse(pred["surface_pressure"])
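Since the two construction modes are selected by which constructor is called, a caller that accepts an arbitrary path has to decide between them. The sketch below shows one way to do that under stated assumptions: `pick_run_mode` and `open_run` are hypothetical helpers, the checks only mirror the conditions described above (a directory containing hp_resolved.yaml, or a single .th file), and the demo paths are throwaway temporary files.

```python
import tempfile
from pathlib import Path


def pick_run_mode(path):
    """Classify `path` as 'run_dir' or 'checkpoint' (hypothetical helper)."""
    p = Path(path)
    if p.is_dir():
        if not (p / "hp_resolved.yaml").is_file():
            raise FileNotFoundError(f"{p} does not contain hp_resolved.yaml")
        return "run_dir"
    if p.is_file() and p.suffix == ".th":
        return "checkpoint"
    raise FileNotFoundError(f"{p} is neither a run directory nor a .th file")


def open_run(path):
    """Build a Run with the constructor that matches `path`."""
    from noether.inference import Run  # lazy import; requires noether

    if pick_run_mode(path) == "run_dir":
        return Run(path)
    return Run.from_checkpoint(path)


# Demo on temporary paths, so no real training run is needed.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp) / "run"
    run_dir.mkdir()
    (run_dir / "hp_resolved.yaml").write_text("{}")
    ckpt = Path(tmp) / "demo_model.th"
    ckpt.write_bytes(b"")
    modes = [pick_run_mode(run_dir), pick_run_mode(ckpt)]

print(modes)
```

Keeping the classification separate from the construction makes the path logic testable without importing noether.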
- run_dir: pathlib.Path | None
- checkpoint_path: pathlib.Path | None = None
- classmethod from_checkpoint(checkpoint_path)
Build a Run from a single ..._model.th file.
Reads the model config (CheckpointKeys.MODEL_CONFIG), the discriminator kind (CheckpointKeys.CONFIG_KIND), and, if present, the per-field normalizer payload (CheckpointKeys.NORMALIZER_CONFIGS / CheckpointKeys.NORMALIZER_STATISTICS) that CheckpointWriter embeds in every checkpoint.
The model class itself must still be importable in the current process: the kind string points at a class, not at its implementation. If the checkpoint references a recipe-specific model, make sure that recipe is installed (or on sys.path) before calling.
- Parameters:
checkpoint_path (pathlib.Path | str) – Path to a ..._model.th file written by noether.
- Returns:
A Run in checkpoint-only mode. model() and normalizers() are usable; dataset() and config raise.
- Raises:
FileNotFoundError – If the checkpoint file does not exist.
KeyError – If the checkpoint is missing any of state_dict, model_config, or config_kind (older checkpoints predate the embedded config; fall back to Run(run_dir)).
- Return type:
Run
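The KeyError fallback mentioned for older checkpoints can be sketched as a small wrapper. This is an illustration under assumptions, not noether code: `load_run` is a hypothetical helper, the two constructors are injected so the snippet runs without noether installed, and the stand-ins and paths below only mimic the documented behaviour.

```python
def load_run(checkpoint_path, run_dir, *, from_checkpoint, from_run_dir):
    """Try checkpoint-only loading first, then fall back to the run directory.

    `from_checkpoint` and `from_run_dir` are injected for testability; with
    noether installed they would be Run.from_checkpoint and Run.
    """
    try:
        return from_checkpoint(checkpoint_path)
    except KeyError:
        # Documented case: older checkpoints lack the embedded config keys.
        return from_run_dir(run_dir)


# Stand-ins that mimic the documented behaviour (not the real constructors).
def fake_from_checkpoint(path):
    raise KeyError("model_config")


def fake_run(run_dir):
    return ("run_dir_mode", run_dir)


result = load_run(
    "old_model.th",      # hypothetical checkpoint path
    "outputs/my_run",    # hypothetical run directory
    from_checkpoint=fake_from_checkpoint,
    from_run_dir=fake_run,
)
print(result)
```

Only KeyError is caught here, so a missing file still surfaces as FileNotFoundError instead of silently switching modes.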
- property is_checkpoint_only: bool
True if this Run was built via from_checkpoint() (no run dir, no resolved config).
- Return type:
bool
- property config: noether.core.schemas.schema.ConfigSchema
Validated ConfigSchema loaded from hp_resolved.yaml.
Safe to mutate before calling dataset() / model() / normalizers().
- Raises:
RuntimeError – If this Run was built via from_checkpoint(); no run directory means no resolved config.
- Return type:
noether.core.schemas.schema.ConfigSchema
- property statistics: dict[str, list[float | int]]
Training-time dataset statistics (config.dataset_statistics or {}).
Convenience accessor for the stat values the training run computed, typically per-field means/stds used by the trainer’s pipeline. Returns an empty dict if the run didn’t compute any stats.
Note: this is separate from the dataset class’s static STATS_FILE, which normalizers() reads in run-dir mode.
- normalizers(split='test')
Build the trained run’s field normalizers without instantiating its dataset.
In run-dir mode, reads the dataset class’s STATS_FILE (looked up from config.datasets[split].kind) and constructs each normalizer from config.datasets[split].dataset_normalizers. The data root is never touched.
In checkpoint-only mode, reads the per-field preprocessor configs and resolved statistics that CheckpointWriter embeds in every checkpoint (NORMALIZER_CONFIGS / NORMALIZER_STATISTICS). The split argument is ignored: only the writer-side split (typically test) was captured.
- Parameters:
split (str) – Dataset key to source the normalizer configs from. Splits typically share normalizers; the argument is provided for parity with dataset(). Ignored in checkpoint-only mode.
- Returns:
Dict mapping field name (e.g. "surface_pressure") to a ComposePreProcess. Empty dict if no normalizers are available for this split.
- Raises:
KeyError – In run-dir mode, if split is not in self.config.datasets. In checkpoint-only mode, if the checkpoint predates the embedded normalizer keys.
- Return type:
dict[str, noether.data.preprocessors.compose.ComposePreProcess]
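Because normalizers() returns a plain dict keyed by field name, denormalizing a whole prediction dict can be sketched as a loop over fields. The snippet below is a minimal illustration: `ShiftScale` is a stand-in class, not noether's ComposePreProcess, and shares only the `inverse` method name used in the Run example; `denormalize` and the numbers are hypothetical.

```python
class ShiftScale:
    """Stand-in normalizer: forward is (x - mean) / std, inverse undoes it.

    Illustrative only; the real objects are ComposePreProcess instances.
    """

    def __init__(self, mean, std):
        self.mean = mean
        self.std = std

    def inverse(self, value):
        return value * self.std + self.mean


def denormalize(predictions, normalizers):
    """Invert each predicted field that has a normalizer; pass others through."""
    return {
        name: normalizers[name].inverse(value) if name in normalizers else value
        for name, value in predictions.items()
    }


norms = {"surface_pressure": ShiftScale(mean=10.0, std=2.0)}
pred = {"surface_pressure": 1.5, "aux": 0.25}
pred_phys = denormalize(pred, norms)
print(pred_phys)
```

Fields without an entry are returned unchanged, which also covers the case where normalizers() yields an empty dict.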
- dataset(split='test')
Instantiate the dataset for split.
Wires up the collator (dataset.pipeline) the same way the trainer does, so the dataset can be plugged into a torch.utils.data.DataLoader for batched forward passes.
- Parameters:
split (str) – Dataset key (e.g. "train", "val", "test").
- Raises:
RuntimeError – In checkpoint-only mode (the checkpoint doesn’t know about the original dataset configuration).
KeyError – If split is not in self.config.datasets.
- Return type:
- model(*, checkpoint='latest', device='cpu')
Instantiate the model and load checkpoint weights.
Unlike the training/eval flow, this does not set up an optimizer, apply initializers, or attach the model to a trainer. It just builds the model, loads the state dict, moves it to device, and puts it in eval mode.
- Parameters:
checkpoint (str) – Checkpoint tag (run-dir mode only). Defaults to "latest". Other examples: "E10", "best_model.loss.test.total". Ignored in checkpoint-only mode, because the file was already fixed at from_checkpoint() time.
device (str | torch.device) – Torch device (or string) to move the model to.
- Returns:
The model in eval mode with weights loaded.
- Raises:
FileNotFoundError – If the checkpoint file does not exist (run-dir mode).
KeyError – If the checkpoint is missing state_dict.
RuntimeError – If loading the state dict did not actually change the model weights (sanity check against silently missing or mismatched keys).
- Return type:
- class noether.inference.InferenceRunner
Bases: noether.training.runners.hydra_runner.HydraRunner
Runs an inference experiment using @hydra.main as entry point.