noether.core.schemas.dataset¶
Attributes¶
Classes¶
- PipelineConfig: Internal base class for all registry-based configs.
- DatasetBaseConfig: Internal base class for all registry-based configs.
- StandardDatasetConfig: Base config for datasets with fixed splits.
- DatasetSplitIDs: Base class for dataset split ID validation with overlap checking.
- FieldDimSpec: A specification for a group of named data fields and their dimensions.
- DomainDataSpec: Data specification for a single domain (e.g., surface, volume, wake).
- ModelDataSpecs: Base data specification for models that operate on arbitrary named domains.
Module Contents¶
- class noether.core.schemas.dataset.DatasetWrapperConfig(/, **data)¶
Bases: pydantic.BaseModel
- Parameters:
data (Any)
- class noether.core.schemas.dataset.RepeatWrapperConfig(/, **data)¶
Bases: DatasetWrapperConfig
- Parameters:
data (Any)
- class noether.core.schemas.dataset.ShuffleWrapperConfig(/, **data)¶
Bases: DatasetWrapperConfig
- Parameters:
data (Any)
- class noether.core.schemas.dataset.SubsetWrapperConfig(/, **data)¶
Bases: DatasetWrapperConfig
- Parameters:
data (Any)
- indices: collections.abc.Sequence | None = None¶
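To illustrate what a subset wrapper does, here is a minimal stand-alone sketch (not the library's implementation): a wrapper that restricts an underlying sequence-style dataset to a fixed list of indices, falling back to the full index range when `indices` is None, mirroring the field's default above.

```python
class SubsetDataset:
    """Hypothetical sketch of a subset wrapper over an indexable dataset."""

    def __init__(self, dataset, indices=None):
        self.dataset = dataset
        # None means "no subsetting": expose every index of the base dataset.
        self.indices = list(indices) if indices is not None else list(range(len(dataset)))

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        # Translate the subset index into an index of the wrapped dataset.
        return self.dataset[self.indices[i]]


data = ["a", "b", "c", "d"]
subset = SubsetDataset(data, indices=[3, 1])
print([subset[i] for i in range(len(subset))])  # ['d', 'b']
```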
- class noether.core.schemas.dataset.PipelineConfig(/, **data)¶
Bases: noether.core.schemas.lib._RegistryBase
Internal base class for all registry-based configs.
Provides auto-registration via __init_subclass__. Not meant to be used directly - use specific config base classes instead.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- noether.core.schemas.dataset.DatasetWrappers¶
- noether.core.schemas.dataset.TPipelineConfig¶
- class noether.core.schemas.dataset.DatasetBaseConfig[TPipelineConfig: PipelineConfig](/, **data)¶
Bases: noether.core.schemas.lib._RegistryBase
Internal base class for all registry-based configs.
- Parameters:
data (Any)
- pipeline: Annotated[TPipelineConfig | None, Discriminated(PipelineConfig)] = None¶
Config of the pipeline to use for the dataset.
- dataset_normalizers: dict[str, list[Annotated[Any, Discriminated(NormalizerConfig)]] | Annotated[Any, Discriminated(NormalizerConfig)]] | None = None¶
Normalizers to apply to the dataset, keyed by data source name; each value is a single normalizer config or a list of them.
- included_properties: set[str] | None = None¶
Set of properties (i.e., getitem_* methods that are called) of this dataset that will be loaded. If not set, all properties are loaded.
- excluded_properties: set[str] | None = None¶
Set of properties of this dataset that will NOT be loaded, even if they are present in the included set.
- model_config¶
Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
- class noether.core.schemas.dataset.StandardDatasetConfig(/, **data)¶
Bases: DatasetBaseConfig, abc.ABC
Base config for datasets with fixed splits.
- Parameters:
data (Any)
- split: Literal['train', 'val', 'test']¶
Which split of the dataset to use. Must be one of “train”, “val”, or “test”.
- class noether.core.schemas.dataset.DatasetSplitIDs(/, **data)¶
Bases: pydantic.BaseModel, abc.ABC
Base class for dataset split ID validation with overlap checking.
This base class provides:
1. Automatic validation that train/val/test splits don't have overlapping IDs
2. Optional size validation for datasets that have expected split sizes
Subclasses can optionally define class variables for size validation:
- EXPECTED_TRAIN_SIZE: Expected number of training samples
- EXPECTED_VAL_SIZE: Expected number of validation samples
- EXPECTED_TEST_SIZE: Expected number of test samples
- DATASET_NAME: Name of the dataset for error messages
If these are not defined, only overlap checking will be performed.
- Parameters:
data (Any)
- validate_splits()¶
Validate splits and check for overlaps.
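The overlap check described above amounts to pairwise set intersection over the three ID collections. A minimal sketch under that assumption (the real validator is a pydantic model validator; the function name and error message here are illustrative):

```python
def check_split_overlap(train, val, test) -> None:
    """Raise ValueError if any sample ID appears in more than one split.

    Illustrative sketch of train/val/test overlap validation via
    pairwise set intersection.
    """
    splits = {"train": set(train), "val": set(val), "test": set(test)}
    names = list(splits)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = splits[a] & splits[b]
            if shared:
                raise ValueError(f"splits {a!r} and {b!r} share IDs: {sorted(shared)}")


check_split_overlap([1, 2, 3], [4, 5], [6])        # passes silently
# check_split_overlap([1, 2], [2, 3], [4])         # would raise: 'train'/'val' share ID 2
```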
- class noether.core.schemas.dataset.FieldDimSpec¶
Bases: pydantic.RootModel[collections.OrderedDict[str, int]]
A specification for a group of named data fields and their dimensions.
- property field_slices: dict[str, slice]¶
Calculates slice indices for each field in concatenation order.
- keys()¶
- values()¶
- items()¶
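The field_slices property above maps each field to a contiguous slice along the concatenated feature axis, in insertion order. A minimal stand-alone sketch of that computation (not the library's implementation):

```python
from collections import OrderedDict


def field_slices(dims: "OrderedDict[str, int]") -> dict[str, slice]:
    """Assign each field a contiguous slice [offset, offset + dim) in
    concatenation order, advancing the offset by each field's dimension."""
    slices: dict[str, slice] = {}
    offset = 0
    for name, dim in dims.items():
        slices[name] = slice(offset, offset + dim)
        offset += dim
    return slices


spec = OrderedDict([("pressure", 1), ("velocity", 3)])
print(field_slices(spec))  # {'pressure': slice(0, 1, None), 'velocity': slice(1, 4, None)}
```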
- class noether.core.schemas.dataset.DomainDataSpec(/, **data)¶
Bases: pydantic.BaseModel
Data specification for a single domain (e.g., surface, volume, wake).
- Parameters:
data (Any)
- output_dims: FieldDimSpec¶
Output fields and their dimensions for this domain, e.g. {“pressure”: 1, “velocity”: 3}.
- feature_dim: FieldDimSpec | None = None¶
Input feature fields and their dimensions for this domain.
- class noether.core.schemas.dataset.ModelDataSpecs(/, **data)¶
Bases: pydantic.BaseModel
Base data specification for models that operate on arbitrary named domains.
This is the minimal interface that model configs need from data specifications: position dimensions, available conditioning, and per-domain data descriptions.
- Parameters:
data (Any)
- conditioning_dims: FieldDimSpec | None = None¶
Available conditioning features and their dimensions.
- domains: dict[str, DomainDataSpec] = None¶
Per-domain data specifications keyed by domain name.
- property total_output_dim: int¶
Calculates the total output dimension across all domains.
- Return type:
int
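The total_output_dim calculation above is a sum of every field dimension across every domain's output spec. A stand-alone sketch with plain dicts in place of FieldDimSpec (the domain and field names here are illustrative, not required by the API):

```python
def total_output_dim(domains: dict[str, dict[str, int]]) -> int:
    """Sum every field's dimension across all domains' output specs."""
    return sum(sum(dims.values()) for dims in domains.values())


# Hypothetical two-domain spec: a scalar pressure on the surface,
# plus pressure and a 3-component velocity in the volume.
specs = {
    "surface": {"pressure": 1},
    "volume": {"pressure": 1, "velocity": 3},
}
print(total_output_dim(specs))  # 5
```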