noether.data.zarr_store.manifest

Manifest schema for the chunked/sharded Zarr CFD store.

The manifest is a small JSON sidecar written next to the per-sample Zarr groups. It records the global field layout once (which canonical field lives in which array, with which dtype and channel width), plus the per-sample chunk grid (point count, chunk size, shard size, number of chunks) for every domain.

The dataloader uses the manifest to:

  • pick random chunk indices per (sample, domain) without opening every array’s metadata first, and

  • map the per-field arrays back onto individual named fields.
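
As a rough illustration of the first point, the sketch below draws chunk indices from manifest data alone, without touching any Zarr metadata. It uses a plain dict as a hypothetical stand-in for the parsed samples mapping; the sample id, the numbers, and the pick_chunks helper are invented for the example and are not part of the module.

```python
import random

# Hypothetical stand-in for the "samples" part of a parsed manifest:
# sample id -> domain name -> chunk grid. All values are made up.
samples = {
    "run_0001": {
        "volume": {"n_points": 1000, "chunk_points": 256, "n_chunks": 4},
        "surface": {"n_points": 300, "chunk_points": 256, "n_chunks": 2},
    },
}


def pick_chunks(samples, sample_id, domain, k, rng):
    """Draw up to k distinct chunk indices using only manifest data."""
    n_chunks = samples[sample_id][domain]["n_chunks"]
    # Never ask for more chunks than the domain actually has.
    return sorted(rng.sample(range(n_chunks), min(k, n_chunks)))


rng = random.Random(0)
picked = pick_chunks(samples, "run_0001", "volume", 2, rng)
```

Because n_chunks comes from the manifest, the only I/O left for the loader is reading the selected chunks themselves.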

Classes

ArrayLayout

Layout of a single per-field Zarr array.

DomainLayout

Per-field arrays of one domain.

DomainSample

Per-sample chunk grid for one domain.

SampleEntry

Manifest entry for a single sample.

StoreManifest

Top-level manifest for a converted Zarr store.

Module Contents

class noether.data.zarr_store.manifest.ArrayLayout(/, **data)

Bases: pydantic.BaseModel

Layout of a single per-field Zarr array.

Every field is its own array (<domain>/<name>), so fields can be read independently. The channel axis is never chunked. Each array is packed into a single whole-array shard unless the writer’s shard_points cap splits it (see DomainSample.shard_points), so the per-sample object count normally stays at one object per field while chunks remain individually range-readable.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

array_name: str

Zarr path of the array within the per-sample group, e.g. "volume/velocity".

field: str

Canonical field name served by this array, e.g. "volume_velocity".

dtype: str

On-disk dtype of the array, e.g. "float32" or "float16".

dim: int

Channel width (1 for scalars, 3 for vectors).
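
Since the channel axis is never chunked, one chunk spans chunk_points rows times dim channels. The sketch below estimates the uncompressed size of such a chunk from the dtype and dim fields. ArrayLayoutSketch is a stdlib stand-in that mirrors the four fields above, not the pydantic model itself; the ITEMSIZE table, the chunk_nbytes helper, and the chunk size of 65536 points are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayLayoutSketch:
    """Stand-in mirroring the documented ArrayLayout fields."""

    array_name: str
    field: str
    dtype: str
    dim: int


# Bytes per element for the dtypes used in this example (assumed table).
ITEMSIZE = {"float16": 2, "float32": 4, "float64": 8}


def chunk_nbytes(layout, chunk_points):
    """Uncompressed bytes in one chunk: rows x channels x bytes per element."""
    return chunk_points * layout.dim * ITEMSIZE[layout.dtype]


velocity = ArrayLayoutSketch("volume/velocity", "volume_velocity", "float16", 3)
nbytes = chunk_nbytes(velocity, 65536)  # 65536 * 3 * 2 = 393216 bytes
```

The on-disk size will usually be smaller once the compressor has run.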

class noether.data.zarr_store.manifest.DomainLayout(/, **data)

Bases: pydantic.BaseModel

Per-field arrays of one domain.

All arrays of a domain share the point axis, the shuffle permutation and the chunk grid, so chunk c addresses the same physical points in every field.

Parameters:

data (Any)

position: str

Canonical name of the domain’s coordinate field (e.g. "volume_position").

arrays: dict[str, ArrayLayout]

Mapping canonical_field -> array layout (includes the position array).
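
The shared chunk grid means one row range can be applied to every field of a domain. The sketch below shows this with toy in-memory lists standing in for the Zarr arrays. The chunk_rows helper and the field values are invented for the example, and clipping the last chunk to n_points is an assumption about how a reader would treat a partial final chunk.

```python
def chunk_rows(c, chunk_points, n_points):
    """Row range covered by chunk c, clipped to the real point count."""
    start = c * chunk_points
    stop = min(start + chunk_points, n_points)
    return slice(start, stop)


# Toy stand-ins for two arrays of the same domain (10 points each).
fields = {
    "volume_position": [(float(i), 0.0, 0.0) for i in range(10)],
    "volume_pressure": [float(i) * 10 for i in range(10)],
}

# Chunk 2 with 4 points per chunk covers rows 8 and 9 in every field.
rows = chunk_rows(2, 4, 10)
batch = {name: array[rows] for name, array in fields.items()}
```

Row i of batch["volume_position"] and row i of batch["volume_pressure"] then refer to the same physical point.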

class noether.data.zarr_store.manifest.DomainSample(/, **data)

Bases: pydantic.BaseModel

Per-sample chunk grid for one domain.

Parameters:

data (Any)

n_points: int

Number of points (rows) for this sample/domain.

chunk_points: int

Chunk size along the point axis.

shard_points: int

Shard size along the point axis, always a whole number of chunks. It equals the full array length (n_chunks * chunk_points) unless the writer’s shard_points cap split the array into multiple shards.

n_chunks: int

Number of chunks along the point axis (ceil(n_points / chunk_points)).
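
The relationships between these four fields can be worked through in a few lines. The sketch below derives n_chunks and shard_points from n_points and chunk_points. The chunk_grid helper and its shard_cap_points argument are hypothetical, and rounding the cap down to a whole number of chunks is an assumption; the writer may apply its cap differently.

```python
def chunk_grid(n_points, chunk_points, shard_cap_points=None):
    """Derive the per-domain chunk grid from the point count and chunk size."""
    # Integer form of ceil(n_points / chunk_points).
    n_chunks = (n_points + chunk_points - 1) // chunk_points
    full_length = n_chunks * chunk_points
    if shard_cap_points is None or shard_cap_points >= full_length:
        # No effective cap: one shard covers the whole array.
        shard_points = full_length
    else:
        # Assumed rule: round the cap down to whole chunks, at least one chunk.
        shard_points = max(1, shard_cap_points // chunk_points) * chunk_points
    return {
        "n_points": n_points,
        "chunk_points": chunk_points,
        "shard_points": shard_points,
        "n_chunks": n_chunks,
    }


uncapped = chunk_grid(1000, 256)                      # 4 chunks, shard of 1024 points
capped = chunk_grid(1000, 256, shard_cap_points=600)  # shard of 512 points (2 chunks)
```

With 1000 points and 256 points per chunk, the last chunk holds only 1000 - 3 * 256 = 232 real points.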

class noether.data.zarr_store.manifest.SampleEntry(/, **data)

Bases: pydantic.BaseModel

Manifest entry for a single sample.

Parameters:

data (Any)

relpath: str

Path of the sample’s Zarr group relative to the store root.

domains: dict[str, DomainSample]

Per-domain chunk grids.
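
Combining relpath with ArrayLayout.array_name gives the full location of one field array. The sketch below joins the three parts with POSIX separators so it works for local paths and URL-style roots alike. The array_path helper, the bucket name, and the relpath value are made up for the example.

```python
import posixpath


def array_path(store_root, relpath, array_name):
    """Join store root, sample group path, and in-group array path."""
    return posixpath.join(store_root, relpath, array_name)


# Hypothetical store root and sample path; "volume/velocity" is the
# array_name example from ArrayLayout.
path = array_path("s3://bucket/store", "samples/run_0001.zarr", "volume/velocity")
```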

class noether.data.zarr_store.manifest.StoreManifest(/, **data)

Bases: pydantic.BaseModel

Top-level manifest for a converted Zarr store.

Parameters:

data (Any)

dataset_name: str

format_version: int = 1

shuffle_seed: int

Base seed used to derive the per-sample point shuffle permutations.

coords_dtype: str = 'float32'

values_dtype: str = 'float16'

compressor: str = 'blosc-zstd'

domains: dict[str, DomainLayout]

Global column layout, shared by every sample.

samples: dict[str, SampleEntry] = None

Per-sample chunk grids keyed by sample id.

MANIFEST_NAME: ClassVar[str] = 'manifest.json'

save(store_root)

Write the manifest to <store_root>/manifest.json (local path or fsspec URL).

Parameters:

store_root (str | pathlib.Path)

Return type:

str

classmethod load(store_root)

Load the manifest from <store_root>/manifest.json (local path or fsspec URL).

Parameters:

store_root (str | pathlib.Path)

Return type:

StoreManifest
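
To make the overall shape concrete, the sketch below builds a manifest-like dict that follows the field names documented above and round-trips it through manifest.json in a temporary directory. It uses only json and pathlib as a stand-in and is not the implementation of save or load. The dataset name, sample id, relpath, the "volume/position" array name, and all numbers are invented for the example.

```python
import json
import tempfile
from pathlib import Path

# Manifest-shaped dict following the documented field names; values are made up.
manifest = {
    "dataset_name": "example_cfd",
    "format_version": 1,
    "shuffle_seed": 0,
    "coords_dtype": "float32",
    "values_dtype": "float16",
    "compressor": "blosc-zstd",
    "domains": {
        "volume": {
            "position": "volume_position",
            "arrays": {
                "volume_position": {
                    "array_name": "volume/position",  # assumed path
                    "field": "volume_position",
                    "dtype": "float32",
                    "dim": 3,
                },
                "volume_velocity": {
                    "array_name": "volume/velocity",
                    "field": "volume_velocity",
                    "dtype": "float16",
                    "dim": 3,
                },
            },
        },
    },
    "samples": {
        "run_0001": {
            "relpath": "run_0001",
            "domains": {
                "volume": {
                    "n_points": 1000,
                    "chunk_points": 256,
                    "shard_points": 1024,
                    "n_chunks": 4,
                },
            },
        },
    },
}

# Round trip through <store_root>/manifest.json using plain JSON.
with tempfile.TemporaryDirectory() as store_root:
    manifest_file = Path(store_root) / "manifest.json"
    manifest_file.write_text(json.dumps(manifest, indent=2))
    loaded = json.loads(manifest_file.read_text())
```

In real use, StoreManifest.save(store_root) and StoreManifest.load(store_root) would take the place of the JSON calls and add pydantic validation on top.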