noether.data.datasets.cfd¶

Classes¶

`AhmedMLDataset`	Dataset implementation for AhmedML CFD simulations.
`AhmedMLDefaultSplitIDs`	Default split IDs for AhmedML dataset with validation.
`DrivAerMLDataset`	Dataset implementation for DrivAerML CFD simulations.
`DrivAerMLDefaultSplitIDs`	Default split IDs for DrivAerML dataset with validation.
`DrivAerNetDataset`	Dataset implementation for the DrivAerNet and DrivAerNet++ datasets.
`EmmiWingDataset`	Dataset implementation for aerodynamic datasets with volume and surface fields.
`EmmiWingHFDataset`	Emmi-Wing dataset loaded from the HuggingFace subset.
`ShapeNetCarDataset`	Dataset implementation for ShapeNet Car CFD simulations.
`ShapeNetCarDefaultSplitIDs`	Default split IDs for ShapeNet Car dataset with validation.
`SimshiftHeatsinkDataset`	Dataset for the SIMSHIFT Heatsink CFD benchmark.

Package Contents¶

class noether.data.datasets.cfd.AhmedMLDataset(dataset_config)¶

Bases: noether.data.datasets.cfd.caeml.dataset.CAEMLDataset

Dataset implementation for AhmedML CFD simulations.

Parameters:: dataset_config (noether.data.base.dataset.StandardDatasetConfig) – Configuration for the dataset.

Initialize the AhmedML dataset.

STATS_FILE: str = ''¶

property get_dataset_splits: noether.data.base.dataset.DatasetSplitIDs¶

Return type:: noether.data.base.dataset.DatasetSplitIDs

class noether.data.datasets.cfd.AhmedMLDefaultSplitIDs(/, **data)¶

Bases: noether.data.base.dataset.DatasetSplitIDs

Default split IDs for AhmedML dataset with validation.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

EXPECTED_TRAIN_SIZE = 400¶

EXPECTED_VAL_SIZE = 50¶

EXPECTED_TEST_SIZE = 50¶

DATASET_NAME = 'AhmedML'¶

train: list[int] = [1, 2, 3, 5, 6, 7, 8, 9, 10, 13, 14, 15, 16, 17, 18, 21, 23, 25, 27, 28, 30, 31, 32, 33, 34, 35,...¶

val: list[int] = [24, 26, 29, 38, 41, 55, 59, 104, 108, 124, 133, 142, 158, 173, 180, 188, 196, 197, 199, 205,...¶

test: list[int] = [4, 11, 12, 19, 20, 22, 56, 109, 127, 150, 165, 177, 187, 191, 203, 208, 215, 228, 234, 241,...¶

static create_split()¶

A helper function that creates a random split of the dataset. The default indices were created with seed=42.
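The idea of a seeded random split can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `create_random_split` is hypothetical, the run IDs are assumed to be the integers 1 to 500, and the output will not reproduce the default indices listed above, since the shuffling procedure used to create them is not documented here.

```python
import random


def create_random_split(ids, n_train, n_val, n_test, seed=42):
    """Shuffle ids with a fixed seed and cut them into three sorted, disjoint lists.

    Hypothetical helper for illustration only.
    """
    ids = list(ids)
    if n_train + n_val + n_test != len(ids):
        raise ValueError("split sizes must add up to the number of IDs")
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    rng.shuffle(ids)
    train = sorted(ids[:n_train])
    val = sorted(ids[n_train:n_train + n_val])
    test = sorted(ids[n_train + n_val:])
    return train, val, test


# Assumption: run IDs are 1..500, matching the expected sizes 400 + 50 + 50.
train, val, test = create_random_split(range(1, 501), 400, 50, 50)
```

Fixing the seed makes the split reproducible across runs, which is why stored default indices and a seeded generator are interchangeable as long as the procedure itself does not change.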

class noether.data.datasets.cfd.DrivAerMLDataset(dataset_config)¶

Bases: noether.data.datasets.cfd.caeml.dataset.CAEMLDataset

Dataset implementation for DrivAerML CFD simulations.

Parameters:: dataset_config (noether.data.base.dataset.StandardDatasetConfig) – Configuration for the dataset.

Initialize the DrivAerML dataset.

STATS_FILE: str = ''¶

property get_dataset_splits: noether.data.base.dataset.DatasetSplitIDs¶

Return type:: noether.data.base.dataset.DatasetSplitIDs

class noether.data.datasets.cfd.DrivAerMLDefaultSplitIDs(/, **data)¶

Bases: noether.data.base.dataset.DatasetSplitIDs

Default split IDs for DrivAerML dataset with validation.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

EXPECTED_TRAIN_SIZE = 400¶

EXPECTED_VAL_SIZE = 34¶

EXPECTED_TEST_SIZE = 50¶

EXPECTED_HIDDEN_TEST_SIZE = 16¶

DATASET_NAME = 'DrivAerML'¶

train: list[int] = [1, 2, 3, 5, 6, 7, 8, 9, 10, 13, 14, 15, 16, 17, 18, 21, 23, 25, 27, 28, 30, 31, 32, 33, 34, 35,...¶

val: list[int] = [4, 22, 56, 109, 150, 165, 177, 191, 228, 234, 241, 247, 252, 253, 260, 271, 275, 298, 303, 311,...¶

test: list[int] = [11, 12, 19, 20, 24, 26, 29, 41, 55, 59, 108, 124, 127, 133, 142, 158, 173, 180, 187, 188, 197,...¶

hidden_test: list[int] = [167, 211, 218, 221, 248, 282, 291, 295, 316, 325, 329, 364, 370, 376, 403, 473]¶

static create_split()¶

A helper function that creates a random split of the dataset. The default indices were created with seed=42.

class noether.data.datasets.cfd.DrivAerNetDataset(dataset_config)¶

Bases: noether.data.datasets.cfd.dataset.AeroDataset

Dataset implementation for the DrivAerNet and DrivAerNet++ datasets.

Parameters:

dataset_config (noether.data.base.dataset.StandardDatasetConfig) – Configuration for the dataset. See DatasetBaseConfig for available options.
filemap – FileMap object defining the mapping of data properties to filenames. See FileMap for details.

STATS_FILE: str = ''¶

FILEMAP¶

source_root¶

design_ids¶

class noether.data.datasets.cfd.EmmiWingDataset(dataset_config)¶

Bases: noether.data.datasets.cfd.dataset.AeroDataset

Dataset implementation for aerodynamic datasets with volume and surface fields. This unified dataset class provides a common interface for aerodynamics datasets with volume and surface fields. Dataset behavior, such as the choice of dataset and the train/val/test split IDs, is configured through constructor parameters, which makes it easy to extend to new datasets.

Parameters:

dataset_config (noether.data.base.dataset.StandardDatasetConfig) – Configuration for the dataset. See DatasetBaseConfig for available options.
filemap – FileMap object defining the mapping of data properties to filenames. See FileMap for details.

STATS_FILE: str = ''¶

split¶

source_root¶

sample_info(idx)¶

Get information about a sample such as its local path, run name, etc.

Parameters:: idx (int)
Return type:: dict[str, str | int | None]

property get_dataset_splits: noether.data.base.dataset.DatasetSplitIDs¶

Return type:: noether.data.base.dataset.DatasetSplitIDs

property supported_splits: set[str]¶

Return type:: set[str]

getitem_geometry_design_parameters(idx)¶

Retrieves geometry design parameters as a single tensor.

Returns:: Geometry design parameters tensor of shape (1, num_geometry_parameters)
Return type:: torch.Tensor
Parameters:: idx (int)

getitem_inflow_design_parameters(idx)¶

Retrieves inflow design parameters as a single tensor.

Returns:: Inflow design parameters tensor of shape (1, num_inflow_parameters)
Return type:: torch.Tensor
Parameters:: idx (int)

class noether.data.datasets.cfd.EmmiWingHFDataset(dataset_config)¶

Bases: noether.data.datasets.cfd.emmi_wing.dataset.EmmiWingDataset

Emmi-Wing dataset loaded from the HuggingFace subset.

Uses the 248-case evaluation scan subset with its own train/val/test splits. The dataset can be auto-downloaded from HuggingFace using download().

Parameters:

dataset_config (noether.data.base.dataset.StandardDatasetConfig) – Configuration for the dataset. See DatasetBaseConfig for available options.
filemap – FileMap object defining the mapping of data properties to filenames. See FileMap for details.

property get_dataset_splits: noether.data.base.dataset.DatasetSplitIDs¶

Return type:: noether.data.base.dataset.DatasetSplitIDs

property supported_splits: set[str]¶

Return type:: set[str]

static download(local_dir)¶

Download and extract the HF subset to a local directory.

Downloads scans.zip from HuggingFace, extracts the nested run_N.zip archives into <local_dir>/run_N/ directories, and cleans up the zip files.

Parameters:: local_dir (str) – Destination directory.
Returns:: Path to the extracted dataset root.
Return type:: str
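The extraction step described above (an outer archive containing nested run_N.zip archives, each unpacked into its own run_N/ directory, followed by cleanup) can be sketched with the standard library alone. This is an illustration under assumptions, not the library's code: the function name `extract_nested_runs` is hypothetical, the outer archive is assumed to be on disk already, and the HuggingFace download itself is left out.

```python
import zipfile
from pathlib import Path


def extract_nested_runs(outer_zip, local_dir):
    """Unpack run_N.zip members of outer_zip into <local_dir>/run_N/ and delete the zips.

    Hypothetical helper for illustration only.
    """
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(outer_zip) as outer:
        outer.extractall(local_dir)
    # sorted() materializes the listing before any new files are written.
    for inner_path in sorted(local_dir.rglob("run_*.zip")):
        target = local_dir / inner_path.stem  # run_3.zip -> <local_dir>/run_3/
        target.mkdir(exist_ok=True)
        with zipfile.ZipFile(inner_path) as inner:
            inner.extractall(target)
        inner_path.unlink()  # remove the inner archive once extracted
    Path(outer_zip).unlink()  # remove the outer archive
    return str(local_dir)
```

Deleting each inner archive right after extraction keeps peak disk usage close to the size of the extracted data plus one archive.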

class noether.data.datasets.cfd.ShapeNetCarDataset(dataset_config)¶

Bases: noether.data.datasets.cfd.dataset.AeroDataset

Dataset implementation for ShapeNet Car CFD simulations.

This dataset provides access to:

- Surface properties: positions, pressure, normals
- Volume properties: positions, velocity, normals, signed distance field (SDF)

The dataset is split by parameter configuration:

- Test: param0 (100 samples)
- Validation: no validation split defined
- Train: param1-8 (789 samples)

Download link to the raw dataset: http://www.nobuyuki-umetani.com/publication/mlcfd_data.zip

Expected directory structure:

root/
    preprocessed/
        param0/
            <simulation_id>/
                surface_points.pt
                surface_pressure.pt
                surface_normals.pt
                volume_velocity.pt
                volume_points.pt
                volume_sdf.pt
                volume_normals.pt
        param1/
            …
        …
        param8/
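Given this layout, the simulation directories can be enumerated and grouped by split with pathlib. The sketch below is illustrative only: `list_simulations` is a hypothetical helper, not part of the package, and it assumes that every subdirectory of a paramK/ folder is one simulation. The IDs it returns use the same `paramK/<simulation_id>` form as the default split lists.

```python
from pathlib import Path


def list_simulations(root):
    """Return {'test': [...], 'train': [...]} of 'paramK/<simulation_id>' strings.

    Hypothetical helper: param0 goes to test, param1-8 go to train.
    """
    preprocessed = Path(root) / "preprocessed"
    splits = {"test": [], "train": []}
    for param_dir in sorted(preprocessed.glob("param[0-8]")):
        split = "test" if param_dir.name == "param0" else "train"
        # Assumption: each subdirectory of paramK/ is a single simulation.
        for sim_dir in sorted(p for p in param_dir.iterdir() if p.is_dir()):
            splits[split].append(f"{param_dir.name}/{sim_dir.name}")
    return splits
```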

Initialize the ShapeNet Car dataset.

Parameters:

dataset_config (noether.data.base.dataset.StandardDatasetConfig) – Configuration for the dataset.

Raises:

ValueError – If configuration is invalid or split is unknown
FileNotFoundError – If data directory does not exist

STATS_FILE: str = ''¶

split¶

source_root: pathlib.Path¶

property get_dataset_splits: noether.data.base.dataset.DatasetSplitIDs¶

Return type:: noether.data.base.dataset.DatasetSplitIDs

sample_info(idx)¶

Get information about a sample such as its local path, run name, etc.

Parameters:: idx (int)
Return type:: dict[str, str | int | None]

class noether.data.datasets.cfd.ShapeNetCarDefaultSplitIDs(/, **data)¶

Bases: noether.data.base.dataset.DatasetSplitIDs

Default split IDs for ShapeNet Car dataset with validation.

Following the Transolver paper convention:

param0 is used for the test/validation set
param1-8 are used for the training set

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:: data (Any)

EXPECTED_TRAIN_SIZE = 789¶

EXPECTED_VAL_SIZE = 0¶

EXPECTED_TEST_SIZE = 100¶

DATASET_NAME = 'ShapeNet-Car'¶

train: list[str] = ['param1/1dc58be25e1b6e5675cad724c63e222e', 'param1/1dc757e77f3cfad0253c03b7df20edd5',...¶

val: list[str] = []¶

test: list[str] = ['param0/100715345ee54d7ae38b52b4ee9d36a3', 'param0/10247b51a42b41603ffe0e5069bf1eb5',...¶

class noether.data.datasets.cfd.SimshiftHeatsinkDataset(dataset_config)¶

Bases: noether.data.Dataset

Dataset for the SIMSHIFT Heatsink CFD benchmark.

The SIMSHIFT Heatsink dataset contains conjugate heat transfer simulations of heatsink geometries with varying fin configurations. Data is stored in HDF5 format with mesh coordinates and element-level physical fields (velocity, temperature, pressure).

The dataset supports source/target domain splits at different difficulty levels for unsupervised domain adaptation experiments.

When root is not provided, the dataset is downloaded from HuggingFace Hub and read directly from the zip archive (no extraction needed).
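Reading a file straight out of a zip archive, without extracting it to disk, can be done with the standard library. The sketch below shows only the general technique and makes no claim about how this dataset class does it internally: `read_member_bytes` and the member name passed to it are hypothetical. The returned in-memory buffer is seekable, so it can be handed to any reader that accepts file-like objects.

```python
import io
import zipfile


def read_member_bytes(archive_path, member_name):
    """Return a seekable in-memory file object holding one member of a zip archive.

    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(archive_path) as archive:
        with archive.open(member_name) as member:
            # Copy into memory: the buffer stays usable after the archive is closed.
            return io.BytesIO(member.read())
```

Copying into memory trades RAM for simplicity; for very large members, keeping the archive open and streaming from `archive.open(...)` may be preferable.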

Reference: https://arxiv.org/abs/2506.12007

Parameters:: dataset_config (noether.data.datasets.cfd.simshift_heatsink.config.SimshiftHeatsinkConfig) – Configuration for the dataset. See DatasetBaseConfig for available options including dataset normalizers.