noether.inference.evaluate
Programmatic eval API — Python-side equivalent of noether-eval.
The CLI does its work in noether.inference.cli.main_inference: it
loads <run_dir>/hp_resolved.yaml as the Hydra base config, injects
resume_* overrides, and dispatches through InferenceRunner. This
module exposes the same flow as a normal function so Python callers (e.g.
notebooks, sweep scripts) don’t have to shell out.
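For contrast, shelling out from Python might look like the sketch below. `build_eval_command` is a hypothetical helper, not part of noether; it uses only the two overrides shown on this page (`run_dir` and `resume_checkpoint`) and only assembles the argument list without running it.

```python
from pathlib import Path


def build_eval_command(run_dir, resume_checkpoint="latest"):
    """Hypothetical helper: assemble a noether-eval argument list."""
    return [
        "noether-eval",
        # as_posix() keeps forward slashes regardless of platform.
        f"run_dir={Path(run_dir).as_posix()}",
        f"resume_checkpoint={resume_checkpoint}",
    ]


cmd = build_eval_command("outputs/my_run", "E100")
print(" ".join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would start a separate process; calling `evaluate()` directly avoids that step and keeps results in the calling interpreter's session.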
Attributes
logger
Functions
evaluate | Run evaluation against a training run directory.
Module Contents
- noether.inference.evaluate.logger
- noether.inference.evaluate.evaluate(run_dir, *, resume_checkpoint='latest', stage_name=None, callbacks=None, device='cuda', disable_tracker=False)
Run evaluation against a training run directory.
Programmatic equivalent of:
noether-eval run_dir=<run_dir> resume_checkpoint=<...> ...
Loads <run_dir>/hp_resolved.yaml via Hyperparameters.load_resolved(), wires the resume_* fields so checkpoints are read from the training run, optionally replaces the trainer callback list, and dispatches through InferenceRunner.main() (single-process, no Hydra/CLI involvement).
- Parameters:
  - run_dir (str | pathlib.Path) – Training run output directory: the one that contains hp_resolved.yaml. Typically <output_path>/<run_id>[/<stage_name>].
  - resume_checkpoint (str) – Checkpoint tag to load. Examples: "latest", "best_model.<metric>", "E100" (epoch 100), "U2500" (update 2500), "S40000" (sample 40000).
  - stage_name (str | None) – Optional sub-stage name for this eval run’s outputs. Logs / wandb / saved metrics land under <run_dir>/<stage_name>/, separate from the training outputs. Leave None to write alongside the training run.
  - callbacks (list[noether.core.schemas.callbacks.CallBackBaseConfig] | None) – If provided, replaces config.trainer.callbacks for the eval run. Pass the exact callbacks that should execute (e.g. a single sampling/rollout callback); nothing from the training config’s callback list is kept.
  - device (str) – Device string passed to the trainer (default "cuda"). For multi-GPU eval use the noether-eval CLI; this function is single-process.
  - disable_tracker (bool) – If True, drop the saved tracker config so eval doesn’t create a new wandb run.
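The checkpoint tag shapes listed for resume_checkpoint can be told apart with a small classifier. This is only a sketch that mirrors the examples given above; `describe_checkpoint_tag` is a hypothetical helper, and how noether itself parses these tags is not shown on this page.

```python
import re

# Unit prefixes taken from the examples in the parameter description.
_UNITS = {"E": "epoch", "U": "update", "S": "sample"}
_COUNTER_TAG = re.compile(r"([EUS])(\d+)")
_BEST_PREFIX = "best_model."


def describe_checkpoint_tag(tag):
    """Hypothetical helper: classify a checkpoint tag string.

    Returns a (kind, value) tuple and raises ValueError for other shapes.
    """
    if tag == "latest":
        return ("latest", None)
    if tag.startswith(_BEST_PREFIX) and len(tag) > len(_BEST_PREFIX):
        # Everything after the prefix is treated as the metric name.
        return ("best_model", tag[len(_BEST_PREFIX):])
    match = _COUNTER_TAG.fullmatch(tag)
    if match:
        return (_UNITS[match.group(1)], int(match.group(2)))
    raise ValueError(f"unrecognized checkpoint tag: {tag!r}")


for tag in ["latest", "best_model.loss.test.total", "E100", "U2500", "S40000"]:
    print(tag, "->", describe_checkpoint_tag(tag))
```

A check like this can catch a mistyped tag in a sweep script before any model loading starts.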
- Raises:
  FileNotFoundError – if run_dir doesn’t contain hp_resolved.yaml.
- Return type:
  None
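Because a run directory without hp_resolved.yaml leads to FileNotFoundError, a script that loops over many run directories may prefer to filter them first. A minimal sketch, assuming nothing beyond the file name documented above; `runnable_run_dirs` is a hypothetical helper, not part of noether, and the demonstration uses throwaway directories.

```python
import tempfile
from pathlib import Path


def runnable_run_dirs(candidates):
    """Hypothetical helper: keep only directories holding hp_resolved.yaml."""
    return [Path(c) for c in candidates if (Path(c) / "hp_resolved.yaml").is_file()]


# Demonstration on temporary directories: one with the file, one without.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "run_a"
    bad = Path(tmp) / "run_b"
    good.mkdir()
    bad.mkdir()
    (good / "hp_resolved.yaml").write_text("placeholder: true\n")
    kept = runnable_run_dirs([good, bad])
    print([p.name for p in kept])  # prints ['run_a']
```

Wrapping the call in `try` / `except FileNotFoundError` is the other obvious option; filtering up front simply reports all unusable directories before any evaluation begins.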
Example:
    from noether.inference import evaluate
    from my_recipe.callbacks import SamplingCallbackConfig

    # Sweep the number of sampling steps; each run writes to its own stage directory.
    for steps in [1, 2, 4, 8, 16]:
        evaluate(
            run_dir="outputs/abupt_diffusion/30035_2026-05-11_spk1e",
            resume_checkpoint="best_model.loss.test.total",
            stage_name=f"eval_steps{steps:02d}",
            callbacks=[SamplingCallbackConfig(every_n_epochs=1, sampling_steps=steps)],
        )
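The f-string in the example zero-pads the step count, so the stage directories sort in sweep order. The sketch below only computes the names and paths using the standard library; it creates nothing on disk and does not call noether. The directory layout follows the stage_name description (`<run_dir>/<stage_name>/`).

```python
from pathlib import PurePosixPath

run_dir = PurePosixPath("outputs/abupt_diffusion/30035_2026-05-11_spk1e")

# Same format spec as the example: two digits, zero-padded.
stage_names = [f"eval_steps{steps:02d}" for steps in [1, 2, 4, 8, 16]]
print(stage_names)
# ['eval_steps01', 'eval_steps02', 'eval_steps04', 'eval_steps08', 'eval_steps16']

# Each eval run is documented to write under <run_dir>/<stage_name>/.
output_dirs = [run_dir / name for name in stage_names]
print(output_dirs[0])
# outputs/abupt_diffusion/30035_2026-05-11_spk1e/eval_steps01
```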