noether.data.datasets.cfd

Classes

AhmedMLDataset

Dataset implementation for AhmedML CFD simulations.

AhmedMLDefaultSplitIDs

Default split IDs for AhmedML dataset with validation.

DrivAerMLDataset

Dataset implementation for DrivAerML CFD simulations.

DrivAerMLDefaultSplitIDs

Default split IDs for DrivAerML dataset with validation.

DrivAerNetDataset

Dataset implementation for the DrivAerNet and DrivAerNet++ datasets.

EmmiWingDataset

Dataset implementation for aerodynamic datasets with volume and surface fields.

EmmiWingHFDataset

Emmi-Wing dataset loaded from the HuggingFace subset.

ShapeNetCarDataset

Dataset implementation for ShapeNet Car CFD simulations.

ShapeNetCarDefaultSplitIDs

Default split IDs for ShapeNet Car dataset with validation.

SimshiftHeatsinkDataset

Dataset for the SIMSHIFT Heatsink CFD benchmark.

Package Contents

class noether.data.datasets.cfd.AhmedMLDataset(dataset_config)

Bases: noether.data.datasets.cfd.caeml.dataset.CAEMLDataset

Dataset implementation for AhmedML CFD simulations.

Parameters:

dataset_config (noether.core.schemas.dataset.StandardDatasetConfig) – Configuration for the dataset.

Initialize the AhmedML dataset.

STATS_FILE: str = ''
property get_dataset_splits: noether.core.schemas.dataset.DatasetSplitIDs
Return type:

noether.core.schemas.dataset.DatasetSplitIDs

class noether.data.datasets.cfd.AhmedMLDefaultSplitIDs(/, **data)

Bases: noether.core.schemas.dataset.DatasetSplitIDs

Default split IDs for AhmedML dataset with validation.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

EXPECTED_TRAIN_SIZE = 400
EXPECTED_VAL_SIZE = 50
EXPECTED_TEST_SIZE = 50
DATASET_NAME = 'AhmedML'
train: set[int]
val: set[int]
test: set[int]
static create_split()

A helper function to create a random split of the dataset. The default indices were created with seed=42.
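
As a rough illustration of what a seeded random split with these sizes looks like, the sketch below partitions a range of run IDs into 400/50/50 subsets. It is not the library's create_split implementation: the function name make_random_split, the ID range 1 to 500, and the use of random.Random are all assumptions, so it will not reproduce the shipped default indices.

```python
import random


def make_random_split(ids, train_size, val_size, test_size, seed=42):
    """Hypothetical helper: shuffle IDs with a fixed seed and cut them into three sets."""
    ids = sorted(ids)
    if train_size + val_size + test_size != len(ids):
        raise ValueError("split sizes must add up to the number of IDs")
    rng = random.Random(seed)  # local RNG, so the global random state is untouched
    rng.shuffle(ids)
    train = set(ids[:train_size])
    val = set(ids[train_size:train_size + val_size])
    test = set(ids[train_size + val_size:])
    return train, val, test


# Assumed ID range; the real dataset may number its runs differently.
train, val, test = make_random_split(range(1, 501), 400, 50, 50)
print(len(train), len(val), len(test))
```

Because the generator is seeded, calling the helper twice with the same arguments yields the same three sets, which is the property that makes a default split reproducible.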

class noether.data.datasets.cfd.DrivAerMLDataset(dataset_config)

Bases: noether.data.datasets.cfd.caeml.dataset.CAEMLDataset

Dataset implementation for DrivAerML CFD simulations.

Parameters:

dataset_config (noether.core.schemas.dataset.StandardDatasetConfig) – Configuration for the dataset.

Initialize the DrivAerML dataset.

STATS_FILE: str = ''
property get_dataset_splits: noether.core.schemas.dataset.DatasetSplitIDs
Return type:

noether.core.schemas.dataset.DatasetSplitIDs

class noether.data.datasets.cfd.DrivAerMLDefaultSplitIDs(/, **data)

Bases: noether.core.schemas.dataset.DatasetSplitIDs

Default split IDs for DrivAerML dataset with validation.

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

EXPECTED_TRAIN_SIZE = 400
EXPECTED_VAL_SIZE = 34
EXPECTED_TEST_SIZE = 50
EXPECTED_HIDDEN_TEST_SIZE = 16
DATASET_NAME = 'DrivAerML'
train: set[int]
val: set[int]
test: set[int]
hidden_test: set[int]
static create_split()

A helper function to create a random split of the dataset. The default indices were created with seed=42.
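
The EXPECTED_* constants suggest that a split can be checked against fixed sizes. The sketch below shows one way such a check could look, using a plain dataclass instead of the pydantic model. The class name SplitIDsSketch, the check method, and the disjointness test are assumptions for illustration; the source does not say which validations the real class performs.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class SplitIDsSketch:
    """Hypothetical stand-in for a split-ID container; not the pydantic model itself."""

    train: set
    val: set
    test: set
    hidden_test: set = field(default_factory=set)

    # Expected sizes copied from the DrivAerML constants listed above.
    EXPECTED = {"train": 400, "val": 34, "test": 50, "hidden_test": 16}

    def check(self):
        """Raise ValueError if a split has the wrong size or two splits overlap."""
        splits = {name: getattr(self, name) for name in self.EXPECTED}
        for name, ids in splits.items():
            if len(ids) != self.EXPECTED[name]:
                raise ValueError(f"{name}: expected {self.EXPECTED[name]} IDs, got {len(ids)}")
        for (a, ids_a), (b, ids_b) in combinations(splits.items(), 2):
            if ids_a & ids_b:
                raise ValueError(f"{a} and {b} share IDs")


# Placeholder IDs chosen only so that the sizes match; they are not the real split.
ids = list(range(500))
splits = SplitIDsSketch(set(ids[:400]), set(ids[400:434]), set(ids[434:484]), set(ids[484:]))
splits.check()  # passes silently when sizes match and splits are disjoint
```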

class noether.data.datasets.cfd.DrivAerNetDataset(dataset_config)

Bases: noether.data.datasets.cfd.dataset.AeroDataset

Dataset implementation for the DrivAerNet and DrivAerNet++ datasets.

STATS_FILE: str = ''
FILEMAP
source_root
design_ids
class noether.data.datasets.cfd.EmmiWingDataset(dataset_config)

Bases: noether.data.datasets.cfd.dataset.AeroDataset

Dataset implementation for aerodynamic datasets with volume and surface fields. This unified class provides a common interface for aerodynamics datasets with volume and surface fields. Dataset behavior, such as the dataset choice and the train/val/test split IDs, is configured through constructor parameters, which makes it easy to extend to new datasets.

STATS_FILE: str = ''
split
source_root
sample_info(idx)

Get information about a sample such as its local path, run name, etc.

Parameters:

idx (int)

Return type:

dict[str, str | int | None]

property get_dataset_splits: noether.core.schemas.dataset.DatasetSplitIDs
Return type:

noether.core.schemas.dataset.DatasetSplitIDs

property supported_splits: set[str]
Return type:

set[str]

getitem_geometry_design_parameters(idx)

Retrieves geometry design parameters as a single tensor.

Returns:

Geometry design parameters tensor of shape (1, num_geometry_parameters)

Return type:

torch.Tensor

Parameters:

idx (int)

getitem_inflow_design_parameters(idx)

Retrieves inflow design parameters as a single tensor.

Returns:

Inflow design parameters tensor of shape (1, num_inflow_parameters)

Return type:

torch.Tensor

Parameters:

idx (int)

class noether.data.datasets.cfd.EmmiWingHFDataset(dataset_config)

Bases: noether.data.datasets.cfd.emmi_wing.dataset.EmmiWingDataset

Emmi-Wing dataset loaded from the HuggingFace subset.

Uses the 248-case evaluation scan subset with its own train/val/test splits. The dataset can be auto-downloaded from HuggingFace using download().

property get_dataset_splits: noether.core.schemas.dataset.DatasetSplitIDs
Return type:

noether.core.schemas.dataset.DatasetSplitIDs

property supported_splits: set[str]
Return type:

set[str]

static download(local_dir)

Download and extract the HF subset to a local directory.

Downloads scans.zip from HuggingFace, extracts the nested run_N.zip archives into <local_dir>/run_N/ directories, and cleans up the zip files.

Parameters:

local_dir (str) – Destination directory.

Returns:

Path to the extracted dataset root.

Return type:

str
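
The nested-archive step described for download() can be sketched with the standard zipfile module. This is only an illustration under stated assumptions: the helper name extract_nested_runs is hypothetical, the HuggingFace download itself is left out, and the inner archives are assumed to hold their files at the top level. The real method may differ in all of these respects.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def extract_nested_runs(outer_zip, local_dir):
    """Hypothetical helper: unpack run_N.zip members of outer_zip into local_dir/run_N/."""
    local_dir = Path(local_dir)
    with zipfile.ZipFile(outer_zip) as outer:
        outer.extractall(local_dir)
    # sorted() materializes the matches before any archive is deleted.
    for inner_path in sorted(local_dir.rglob("run_*.zip")):
        target = local_dir / inner_path.stem  # run_3.zip -> <local_dir>/run_3/
        with zipfile.ZipFile(inner_path) as inner:
            inner.extractall(target)
        inner_path.unlink()  # clean up the nested archive after extraction
    return str(local_dir)


with tempfile.TemporaryDirectory() as tmp:
    # Build a small stand-in for scans.zip that holds one nested run archive.
    inner_bytes = io.BytesIO()
    with zipfile.ZipFile(inner_bytes, "w") as inner:
        inner.writestr("fields.txt", "placeholder")
    outer_path = Path(tmp) / "scans.zip"
    with zipfile.ZipFile(outer_path, "w") as outer:
        outer.writestr("run_1.zip", inner_bytes.getvalue())

    root = extract_nested_runs(outer_path, Path(tmp) / "data")
    extracted = sorted(
        p.relative_to(root).as_posix() for p in Path(root).rglob("*") if p.is_file()
    )
print(extracted)
```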

class noether.data.datasets.cfd.ShapeNetCarDataset(dataset_config)

Bases: noether.data.datasets.cfd.dataset.AeroDataset

Dataset implementation for ShapeNet Car CFD simulations.

This dataset provides access to:
  • Surface properties: positions, pressure, normals
  • Volume properties: positions, velocity, normals, signed distance field (SDF)

The dataset is split by parameter configurations:
  • Test: param0 (100 samples)
  • Validation: no validation split defined
  • Train: param1-8 (789 samples)

Download link to the raw dataset: http://www.nobuyuki-umetani.com/publication/mlcfd_data.zip

Expected directory structure:

root/
    preprocessed/
        param0/
            <simulation_id>/
                surface_points.pt
                surface_pressure.pt
                surface_normals.pt
                volume_velocity.pt
                volume_points.pt
                volume_sdf.pt
                volume_normals.pt
        param1/
        …
        param8/
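
To make the layout and the param-based split concrete, the sketch below walks a directory tree of this shape and assigns param0 to the test split and every other param folder to the training split. The helper name split_by_param_folder and the "param/simulation" ID format are assumptions for illustration, not the dataset's actual loading code.

```python
import tempfile
from pathlib import Path


def split_by_param_folder(root):
    """Hypothetical helper: collect simulation directories and split them by param folder."""
    preprocessed = Path(root) / "preprocessed"
    splits = {"train": [], "test": []}
    for param_dir in sorted(preprocessed.glob("param*")):
        # param0 is the test split; param1-8 form the training split.
        target = "test" if param_dir.name == "param0" else "train"
        for sim_dir in sorted(p for p in param_dir.iterdir() if p.is_dir()):
            splits[target].append(f"{param_dir.name}/{sim_dir.name}")
    return splits


# Build a tiny fake tree to exercise the helper; the simulation IDs are made up.
with tempfile.TemporaryDirectory() as tmp:
    for param, sim in [("param0", "sim_a"), ("param1", "sim_b"), ("param8", "sim_c")]:
        (Path(tmp) / "preprocessed" / param / sim).mkdir(parents=True)
    demo_splits = split_by_param_folder(tmp)
print(demo_splits)
```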

Initialize the ShapeNet Car dataset.

Parameters:

dataset_config (noether.core.schemas.dataset.StandardDatasetConfig) – Configuration for the dataset.

STATS_FILE: str = ''
split
source_root: pathlib.Path
property get_dataset_splits: noether.core.schemas.dataset.DatasetSplitIDs
Return type:

noether.core.schemas.dataset.DatasetSplitIDs

sample_info(idx)

Get information about a sample such as its local path, run name, etc.

Parameters:

idx (int)

Return type:

dict[str, str | int | None]

class noether.data.datasets.cfd.ShapeNetCarDefaultSplitIDs(/, **data)

Bases: noether.core.schemas.dataset.DatasetSplitIDs

Default split IDs for ShapeNet Car dataset with validation.

Following the Transolver paper convention:
  • param0 is used for test/validation set

  • param1-8 are used for training set

Create a new model by parsing and validating input data from keyword arguments.

Raises pydantic_core.ValidationError if the input data cannot be validated to form a valid model.

self is explicitly positional-only to allow self as a field name.

Parameters:

data (Any)

EXPECTED_TRAIN_SIZE = 789
EXPECTED_VAL_SIZE = 0
EXPECTED_TEST_SIZE = 100
DATASET_NAME = 'ShapeNet-Car'
train: set[str]
val: set[str]
test: set[str]
class noether.data.datasets.cfd.SimshiftHeatsinkDataset(dataset_config)

Bases: noether.data.Dataset

Dataset for the SIMSHIFT Heatsink CFD benchmark.

The SIMSHIFT Heatsink dataset contains conjugate heat transfer simulations of heatsink geometries with varying fin configurations. Data is stored in HDF5 format with mesh coordinates and element-level physical fields (velocity, temperature, pressure).

The dataset supports source/target domain splits at different difficulty levels for unsupervised domain adaptation experiments.

When root is not provided, the dataset is downloaded from the HuggingFace Hub and read directly from the zip archive (no extraction needed).

Reference: https://arxiv.org/abs/2506.12007

Parameters:

dataset_config (noether.data.datasets.cfd.simshift_heatsink.config.SimshiftHeatsinkConfig) – Configuration for the dataset. See DatasetBaseConfig for available options including dataset normalizers.

STATS_FILE: str = ''
difficulty
domain
split
pre_getitem(idx)

Load all fields for sample idx from its HDF5 file.

The returned dict is forwarded as kwargs to every getitem_* method.

Parameters:

idx (int)

Return type:

dict[str, torch.Tensor]
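
The hand-off from pre_getitem to the getitem_* methods can be pictured with a small stand-alone class. This is a sketch only: TinyDataset and its __getitem__ dispatch loop are hypothetical, plain lists stand in for tensors, and how the real base class discovers and calls the getitem_* methods is not stated in this reference.

```python
class TinyDataset:
    """Hypothetical illustration of the pre_getitem -> getitem_* hand-off."""

    def __init__(self, samples):
        self._samples = samples

    def pre_getitem(self, idx):
        # Load everything for one sample once; here it is just a dict lookup.
        return dict(self._samples[idx])

    def getitem_volume_position(self, idx, *, position, **_):
        # Picks out its own field; unrelated fields are absorbed by **_.
        return position

    def getitem_volume_pressure(self, idx, *, pressure, **_):
        return pressure

    def __getitem__(self, idx):
        shared = self.pre_getitem(idx)
        out = {}
        for name in dir(self):
            if name.startswith("getitem_"):
                # Every getitem_* method receives the same shared kwargs.
                out[name[len("getitem_"):]] = getattr(self, name)(idx, **shared)
        return out


data = TinyDataset(
    [{"position": [0.0, 1.0, 2.0], "pressure": [101.3], "velocity": [1.0, 0.0, 0.0]}]
)
sample = data[0]
print(sample)
```

The keyword-only parameter plus the **_ catch-all is what lets each method name the one field it needs while ignoring the rest of the loaded dict, so the file is read once per sample rather than once per field.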

getitem_volume_position(idx, *, position, **_)

Element centre coordinates of the volume mesh (num_elements, 3).

Return type:

torch.Tensor

getitem_volume_velocity(idx, *, velocity, **_)

Velocity field at element centres (num_elements, 3).

Return type:

torch.Tensor

getitem_volume_temperature(idx, *, temperature, **_)

Temperature field at element centres (num_elements, 1).

Return type:

torch.Tensor

getitem_volume_pressure(idx, *, pressure, **_)

Pressure (p_rgh) field at element centres (num_elements, 1).

Return type:

torch.Tensor

getitem_simulation_parameters(idx)

Geometry design parameters conditioning vector (num_params,).

Parameters:

idx (int)

Return type:

torch.Tensor

sample_info(idx)

Get information about a sample such as its path, sample ID, etc.

Parameters:

idx (int)

Return type:

dict[str, str | int | None]