noether.core.schemas.dataset¶
Attributes¶
- DatasetWrappers

Classes¶
- DatasetWrapperConfig
- RepeatWrapperConfig
- ShuffleWrapperConfig
- SubsetWrapperConfig
- DatasetBaseConfig
- DatasetSplitIDs: Base class for dataset split ID validation with overlap checking.
- FieldDimSpec: A specification for a group of named data fields and their dimensions.
- AeroDataSpecs: Defines the complete data specification for a surrogate model.
Module Contents¶
- class noether.core.schemas.dataset.DatasetWrapperConfig(/, **data)¶
Bases: pydantic.BaseModel
- Parameters:
data (Any)
- class noether.core.schemas.dataset.RepeatWrapperConfig(/, **data)¶
Bases: DatasetWrapperConfig
- Parameters:
data (Any)
- class noether.core.schemas.dataset.ShuffleWrapperConfig(/, **data)¶
Bases: DatasetWrapperConfig
- Parameters:
data (Any)
- class noether.core.schemas.dataset.SubsetWrapperConfig(/, **data)¶
Bases: DatasetWrapperConfig
- Parameters:
data (Any)
- indices: collections.abc.Sequence | None = None¶
- noether.core.schemas.dataset.DatasetWrappers¶
- class noether.core.schemas.dataset.DatasetBaseConfig(/, **data)¶
Bases: pydantic.BaseModel
- Parameters:
data (Any)
- root: str | None = None¶
Root directory of the dataset. If None, data is not loaded from disk but is instead generated in memory.
- split: Literal['train', 'val', 'test']¶
- dataset_normalizers: dict[str, list[noether.core.schemas.normalizers.AnyNormalizer]] | None = None¶
Normalizers to apply to the dataset. The key is the data source name; the value is the list of normalizers applied to that source.
- included_properties: set[str] | None = None¶
Set of properties of this dataset that will be loaded. If not set, all properties are loaded.
- excluded_properties: set[str] | None = None¶
Set of properties of this dataset that will NOT be loaded, even if they are present in the included set.
- model_config¶
Configuration for the model, should be a dictionary conforming to pydantic.ConfigDict.
- class noether.core.schemas.dataset.DatasetSplitIDs(/, **data)¶
Bases: pydantic.BaseModel, abc.ABC

Base class for dataset split ID validation with overlap checking.

This base class provides:
1. Automatic validation that train/val/test splits don’t have overlapping IDs
2. Optional size validation for datasets that have expected split sizes

Subclasses can optionally define class variables for size validation:
- EXPECTED_TRAIN_SIZE: Expected number of training samples
- EXPECTED_VAL_SIZE: Expected number of validation samples
- EXPECTED_TEST_SIZE: Expected number of test samples
- DATASET_NAME: Name of the dataset for error messages

If these are not defined, only overlap checking will be performed.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- validate_splits()¶
Validate splits and check for overlaps.
- class noether.core.schemas.dataset.FieldDimSpec¶
Bases: pydantic.RootModel[dict[str, int]]

A specification for a group of named data fields and their dimensions.
- property field_slices: dict[str, slice]¶
Calculates slice indices for each field in concatenation order.
- keys()¶
- values()¶
- items()¶
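The field_slices property maps each field name to its slice in a concatenated feature vector, in the order the fields are declared. A hedged sketch of that computation, using a plain dict in place of the pydantic RootModel:

```python
# Sketch of per-field slice computation in concatenation order
# (illustrative; the real FieldDimSpec wraps a dict[str, int] as a RootModel).
def field_slices(dims: dict[str, int]) -> dict[str, slice]:
    """Map each named field to its slice in the concatenated vector."""
    slices: dict[str, slice] = {}
    offset = 0
    for name, dim in dims.items():  # dicts preserve insertion order
        slices[name] = slice(offset, offset + dim)
        offset += dim
    return slices
```

For example, `field_slices({"pressure": 1, "velocity": 3})` yields `{"pressure": slice(0, 1), "velocity": slice(1, 4)}` (the field names here are hypothetical).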
- class noether.core.schemas.dataset.AeroDataSpecs(/, **data)¶
Bases: pydantic.BaseModel

Defines the complete data specification for a surrogate model.
Create a new model by parsing and validating input data from keyword arguments.
Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
- Parameters:
data (Any)
- surface_feature_dim: FieldDimSpec | None = None¶
- volume_feature_dim: FieldDimSpec | None = None¶
- surface_output_dims: FieldDimSpec¶
- volume_output_dims: FieldDimSpec | None = None¶
- conditioning_dims: FieldDimSpec | None = None¶
- property surface_feature_dim_total: int¶
Calculates the total surface feature dimension.
- Return type:
int
- property total_output_dim: int¶
Calculates the total output dimension by summing surface and volume output dimensions.
- Return type:
int