noether.data.datasets.cfd.shapenet_car.preprocessing

Preprocessing script for the ShapeNet car CFD dataset.

This script processes raw VTK files containing CFD simulation data and extracts surface and volume data for machine learning applications.

Attributes

Functions

load_unstructured_grid_data(file_path)

Load unstructured grid data from a VTK file.

unstructured_grid_data_to_poly_data(unstructured_grid_data)

Convert unstructured grid data to poly data by extracting the surface.

get_sdf(target_points, boundary_points)

Calculate signed distance field and normal directions from target points to boundary.

get_normal(unstructured_grid_data)

Compute normalized surface normals from unstructured grid data.

get_simulation_relative_paths(root)

Get relative paths to all valid simulation directories in the dataset.

load_simulation_data(root, simulation_relative_path)

Load and process data for a single simulation.

save_preprocessed_data(save_path, surface_pressure, ...)

Save preprocessed simulation data to disk.

process_single_simulation(root, ...)

Process a single simulation: load data, validate, and save.

main(root, output_dir[, continue_on_error, dry_run, ...])

Preprocess ShapeNet car dataset.

Module Contents

noether.data.datasets.cfd.shapenet_car.preprocessing.logger
noether.data.datasets.cfd.shapenet_car.preprocessing.EXPECTED_SIMULATION_COUNT = 889
noether.data.datasets.cfd.shapenet_car.preprocessing.NUM_PARAM_FOLDERS = 9
noether.data.datasets.cfd.shapenet_car.preprocessing.EPSILON = 1e-08
noether.data.datasets.cfd.shapenet_car.preprocessing.PRESSURE_SURFACE_FILE = 'quadpress_smpl.vtk'
noether.data.datasets.cfd.shapenet_car.preprocessing.VELOCITY_VOLUME_FILE = 'hexvelo_smpl.vtk'
noether.data.datasets.cfd.shapenet_car.preprocessing.PRESSURE_ARRAY_FILE = 'press.npy'
noether.data.datasets.cfd.shapenet_car.preprocessing.EXPECTED_CELL_TYPE = 'quad'
noether.data.datasets.cfd.shapenet_car.preprocessing.EXCLUDED_SIMULATIONS
noether.data.datasets.cfd.shapenet_car.preprocessing.load_unstructured_grid_data(file_path)

Load unstructured grid data from a VTK file.

Parameters:

file_path (pathlib.Path) – Path to the VTK file containing unstructured grid data

Returns:

vtkUnstructuredGrid object containing the loaded data

Return type:

vtk.vtkUnstructuredGrid

noether.data.datasets.cfd.shapenet_car.preprocessing.unstructured_grid_data_to_poly_data(unstructured_grid_data)

Convert unstructured grid data to poly data by extracting the surface.

Parameters:

unstructured_grid_data (vtk.vtkUnstructuredGrid) – vtkUnstructuredGrid object to convert

Returns:

vtkPolyData object representing the surface

Return type:

vtk.vtkPolyData

Note

The surface_filter is kept alive to maintain VTK object references.

noether.data.datasets.cfd.shapenet_car.preprocessing.get_sdf(target_points, boundary_points)

Calculate signed distance field and normal directions from target points to boundary.

Parameters:
  • target_points (numpy.typing.NDArray[numpy.float64]) – (N, 3) array of points where SDF is computed

  • boundary_points (numpy.typing.NDArray[numpy.float64]) – (M, 3) array of boundary surface points

Returns:

Tuple of (distances, directions) where

  • distances: (N,) array of distances to nearest boundary point

  • directions: (N, 3) array of normalized direction vectors to nearest boundary point

Return type:

tuple
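The nearest-boundary lookup described for get_sdf can be illustrated with a small brute-force sketch. This is not the module's implementation: the function name nearest_boundary_sketch is hypothetical, the use of EPSILON to guard the normalization is an assumption, and the sketch returns unsigned distances because the documentation does not say how a sign would be assigned.

```python
import numpy as np

EPSILON = 1e-8  # same value as the module constant; its role here is an assumption


def nearest_boundary_sketch(target_points, boundary_points):
    """Return (distances, directions) from each target to its nearest boundary point."""
    target_points = np.asarray(target_points, dtype=np.float64)
    boundary_points = np.asarray(boundary_points, dtype=np.float64)

    # Offsets from every target to every boundary point: shape (N, M, 3).
    offsets = boundary_points[None, :, :] - target_points[:, None, :]
    pairwise = np.linalg.norm(offsets, axis=-1)  # (N, M) distance matrix

    # Index of the closest boundary point for each target.
    nearest = np.argmin(pairwise, axis=1)
    rows = np.arange(len(target_points))
    distances = pairwise[rows, nearest]  # (N,)

    # Unit vectors toward the nearest boundary point; EPSILON avoids 0/0
    # for targets that lie exactly on the boundary.
    directions = offsets[rows, nearest] / (distances[:, None] + EPSILON)
    return distances, directions
```

The brute-force distance matrix needs O(N·M) memory, so for full volume point clouds a spatial index such as a KD-tree would likely be a better fit.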

noether.data.datasets.cfd.shapenet_car.preprocessing.get_normal(unstructured_grid_data)

Compute normalized surface normals from unstructured grid data.

Parameters:

unstructured_grid_data (vtk.vtkUnstructuredGrid) – vtkUnstructuredGrid object containing the mesh

Returns:

(N, 3) array of normalized normal vectors at each point

Raises:

RuntimeError – If NaN values are detected in the computed normals

Return type:

numpy.typing.NDArray[numpy.float64]

Note

This function modifies the input unstructured_grid_data by setting cell normals.
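The normalization and NaN check that get_normal documents can be sketched in NumPy alone. The helper below is hypothetical: it starts from an array of raw normal vectors rather than a VTK object, and the way EPSILON enters the division is an assumption, not something the documentation confirms.

```python
import numpy as np

EPSILON = 1e-8  # same value as the module constant; its role here is an assumption


def normalize_normals_sketch(raw_normals):
    """Scale each (x, y, z) row to unit length and reject NaN results."""
    raw_normals = np.asarray(raw_normals, dtype=np.float64)

    # Row-wise Euclidean length, kept as (N, 1) so it broadcasts over columns.
    lengths = np.linalg.norm(raw_normals, axis=1, keepdims=True)
    unit_normals = raw_normals / (lengths + EPSILON)

    # Mirror the documented failure mode: NaNs in the output are an error.
    if np.isnan(unit_normals).any():
        raise RuntimeError("NaN values detected in the computed normals")
    return unit_normals
```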

noether.data.datasets.cfd.shapenet_car.preprocessing.get_simulation_relative_paths(root)

Get relative paths to all valid simulation directories in the dataset.

The dataset is organized into 9 parameter folders (param0-param8), each containing multiple simulation subdirectories. Some simulations are excluded due to missing required files.

Parameters:

root (pathlib.Path) – Path to the root directory containing param folders

Returns:

List of Path objects representing relative paths to valid simulations

Raises:
Return type:

list[pathlib.Path]
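The directory layout described for get_simulation_relative_paths (param0 to param8, each holding simulation subdirectories) suggests a traversal along the following lines. This is only a sketch: find_simulations_sketch and REQUIRED_FILES are hypothetical names, and checking for the three file names at discovery time is an assumption. The actual module may rely on its EXCLUDED_SIMULATIONS constant instead, whose contents are not shown here.

```python
from pathlib import Path

NUM_PARAM_FOLDERS = 9
# File names taken from the module constants; requiring all three is an assumption.
REQUIRED_FILES = ("quadpress_smpl.vtk", "hexvelo_smpl.vtk", "press.npy")


def find_simulations_sketch(root: Path) -> list[Path]:
    """Collect simulation directories (relative to root) that hold every required file."""
    relative_paths = []
    for index in range(NUM_PARAM_FOLDERS):
        param_dir = root / f"param{index}"
        if not param_dir.is_dir():
            continue  # this sketch tolerates missing param folders
        for simulation_dir in sorted(param_dir.iterdir()):
            if not simulation_dir.is_dir():
                continue
            # Skip simulations with any required file missing.
            if all((simulation_dir / name).is_file() for name in REQUIRED_FILES):
                relative_paths.append(simulation_dir.relative_to(root))
    return relative_paths
```

Returning paths relative to root keeps them usable as the simulation_relative_path argument of the loading functions, which take root separately.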

noether.data.datasets.cfd.shapenet_car.preprocessing.load_simulation_data(root, simulation_relative_path)

Load and process data for a single simulation.

Parameters:
  • root (pathlib.Path) – Root directory containing raw data

  • simulation_relative_path (pathlib.Path) – Relative path to the simulation directory

Returns:

Tuple of (surface_pressure, surface_position, surface_normals, mask, exterior_points, exterior_velocity, exterior_sdf, exterior_normals)

Raises:
Return type:

tuple[numpy.typing.NDArray, numpy.typing.NDArray, numpy.typing.NDArray, numpy.typing.NDArray, numpy.typing.NDArray, numpy.typing.NDArray, numpy.typing.NDArray, numpy.typing.NDArray]

noether.data.datasets.cfd.shapenet_car.preprocessing.save_preprocessed_data(save_path, surface_pressure, surface_position, surface_normals, mask, exterior_points, exterior_velocity, exterior_sdf, exterior_normals)

Save preprocessed simulation data to disk.

Parameters:
  • save_path (pathlib.Path) – Directory where preprocessed data will be saved

  • surface_pressure (numpy.typing.NDArray) – Surface pressure values

  • surface_position (numpy.typing.NDArray) – Surface point positions

  • surface_normals (numpy.typing.NDArray) – Surface normal vectors

  • mask (numpy.typing.NDArray) – Boolean mask for valid surface points

  • exterior_points (numpy.typing.NDArray) – Exterior (volume) point positions

  • exterior_velocity (numpy.typing.NDArray) – Velocity at exterior points

  • exterior_sdf (numpy.typing.NDArray) – Signed distance field at exterior points

  • exterior_normals (numpy.typing.NDArray) – Normal vectors at exterior points

Return type:

None

noether.data.datasets.cfd.shapenet_car.preprocessing.process_single_simulation(root, simulation_relative_path, output_dir)

Process a single simulation: load data, validate, and save.

Parameters:
  • root (pathlib.Path) – Root directory containing raw data

  • simulation_relative_path (pathlib.Path) – Relative path to the simulation

  • output_dir (pathlib.Path) – Output directory for preprocessed data

Raises:

Various exceptions from data loading and validation

Return type:

None

noether.data.datasets.cfd.shapenet_car.preprocessing.main(root, output_dir, continue_on_error=False, dry_run=False, overwrite=False)

Preprocess ShapeNet car dataset.

Parameters:
  • root (pathlib.Path) – Path to the root directory containing the raw data

  • output_dir (pathlib.Path) – Path to the output directory where preprocessed data will be saved

  • continue_on_error (bool) – If True, continue processing on errors instead of stopping

  • dry_run (bool) – If True, only validate data without saving

  • overwrite (bool) – If True, allow overwriting existing output directory

Returns:

Dictionary with statistics: {'total': total simulations, 'success': successfully processed, 'failed': failed to process}

Return type:

dict

Raises:
noether.data.datasets.cfd.shapenet_car.preprocessing.parser
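For the main function documented above, the interaction between continue_on_error and the returned statistics can be sketched as a generic batch loop. The names run_batch_sketch and process are hypothetical, and counting 'total' as the number of queued items is an assumption about how the real function defines it.

```python
from typing import Callable, Iterable


def run_batch_sketch(
    items: Iterable[str],
    process: Callable[[str], None],
    continue_on_error: bool = False,
) -> dict[str, int]:
    """Run process on every item and tally outcomes in a statistics dictionary."""
    items = list(items)
    stats = {"total": len(items), "success": 0, "failed": 0}
    for item in items:
        try:
            process(item)
        except Exception:
            stats["failed"] += 1
            if not continue_on_error:
                raise  # stop at the first failure unless told otherwise
        else:
            stats["success"] += 1
    return stats
```

With continue_on_error=True the loop visits every item and reports failures in the result. With the default, the first exception propagates to the caller and no statistics are returned.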