noether.data.pipeline.sample_processors

Classes

ConcatTensorSampleProcessor

Concatenates multiple tensors into a single tensor.

DefaultTensorSampleProcessor

Creates a tensor of a specified size, filled with a fixed default value.

DropOutliersSampleProcessor

Drops all outliers from a key in the input sample.

DuplicateKeysSampleProcessor

Utility processor that simply duplicates the dictionary keys in a batch.

MomentNormalizationSampleProcessor

Normalizes a value with its mean and standard deviation (i.e., its moments).

PointSamplingSampleProcessor

Randomly subsamples points from a tensor.

PositionNormalizationSampleProcessor

Pre-processes data on a sample-level to normalize positions.

RenameKeysSampleProcessor

Sample processor that simply renames the dictionary keys in a batch.

ReplaceKeySampleProcessor

Sample processor that replaces a key with one or multiple other keys.

SupernodeSamplingSampleProcessor

Randomly samples supernodes from a pointcloud.

Package Contents

class noether.data.pipeline.sample_processors.ConcatTensorSampleProcessor(items, target_key, dim=0)

Bases: noether.data.pipeline.SampleProcessor

Concatenates multiple tensors into a single tensor.

# dummy example
import torch
processor = ConcatTensorSampleProcessor(items=["image_part1", "image_part2"], target_key="full_image", dim=0)
input_sample = {
    "image_part1": torch.randn(3, 224, 224),
    "image_part2": torch.randn(3, 224, 224),
}
output_sample = processor(input_sample)
# output_sample['full_image'] will be a tensor of shape (6, 224, 224)
Parameters:
  • items (list[str]) – A list of keys in the input_sample dict whose tensors should be concatenated.

  • target_key (str) – The key in the sample dict where the concatenated tensor will be stored.

  • dim (int) – The dimension along which to concatenate the tensors. Defaults to 0.

items
target_key
dim = 0
class noether.data.pipeline.sample_processors.DefaultTensorSampleProcessor(item_key_name, feature_dim, size=None, matching_item_key=None, default_value=0.0)

Bases: noether.data.pipeline.SampleProcessor

Creates a tensor of a specified size, filled with a fixed default value.

# dummy example
processor = DefaultTensorSampleProcessor(
    item_key_name="default_tensor",
    feature_dim=128,
    size=10,
    default_value=0.5,
)
input_sample = {}
output_sample = processor(input_sample)
# output_sample['default_tensor'] will be a tensor of shape (10, 128) filled with 0.5
Parameters:
  • item_key_name (str) – key of the created default tensor in the output sample dict.

  • default_value (float) – value to fill the created default tensor with.

  • feature_dim (int) – size of the feature dimension of the created default tensor.

  • size (int | None) – size of the first dimension of the created default tensor.

  • matching_item_key (str | None) – key of an existing tensor in the input sample dict to match the size of the first dimension.

item_key_name
feature_dim
size = None
matching_item_key = None
default_value = 0.0
class noether.data.pipeline.sample_processors.DropOutliersSampleProcessor(item, affected_items=None, min_value=None, max_value=None, min_quantile=None, max_quantile=None)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Drops all outliers from a key in the input sample.

# dummy example
import torch
processor = DropOutliersSampleProcessor(
    item="measurement",
    affected_items={"related_measurement1", "related_measurement2"},
    min_value=0.0,
    max_value=100.0,
)

input_sample = {
    "measurement": torch.tensor([[10.0], [200.0], [-5.0], [50.0]]),
    "related_measurement1": torch.tensor([[1.0], [2.0], [3.0], [4.0]]),
    "related_measurement2": torch.tensor([[5.0], [6.0], [7.0], [8.0]]),
}
output_sample = processor(input_sample)
# output_sample['measurement'] will be tensor([[10.0], [50.0]])
# output_sample['related_measurement1'] will be tensor([[1.0], [4.0]])
# output_sample['related_measurement2'] will be tensor([[5.0], [8.0]])
Parameters:
  • item (str) – The item to drop outliers from.

  • affected_items (set[str] | None) – Set of item keys that are also affected by the outlier removal (rows are dropped at the same indices). Defaults to None.

  • min_value (float | None) – Drop outliers below min_value. Defaults to None.

  • max_value (float | None) – Drop outliers above max_value. Defaults to None.

  • min_quantile (float | None) – Drop outliers in/below min_quantile. Defaults to None.

  • max_quantile (float | None) – Drop outliers in/above max_quantile. Defaults to None.

item
affected_items = None
min_value = None
max_value = None
min_quantile = None
max_quantile = None
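The example above uses the value-based bounds; the quantile-based variant can be sketched in pure Python under the assumed semantics that rows whose value falls in/below min_quantile or in/above max_quantile are dropped. The helper names below are illustrative and not part of the library:

```python
# Sketch of quantile-based outlier dropping (assumed semantics, not the
# library's implementation): keep only rows whose value lies strictly
# between the min_quantile and max_quantile of the value distribution.

def quantile(sorted_vals, q):
    # Linear-interpolation quantile over pre-sorted values.
    idx = q * (len(sorted_vals) - 1)
    lo, hi = int(idx), min(int(idx) + 1, len(sorted_vals) - 1)
    frac = idx - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def drop_quantile_outliers(values, min_quantile=None, max_quantile=None):
    s = sorted(values)
    lo = quantile(s, min_quantile) if min_quantile is not None else float("-inf")
    hi = quantile(s, max_quantile) if max_quantile is not None else float("inf")
    return [v for v in values if lo < v < hi]

values = [10.0, 200.0, -5.0, 50.0, 20.0, 30.0]
kept = drop_quantile_outliers(values, min_quantile=0.1, max_quantile=0.9)
# -5.0 and 200.0 fall outside the 10th/90th percentile band and are dropped
```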
class noether.data.pipeline.sample_processors.DuplicateKeysSampleProcessor(key_map)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Utility processor that simply duplicates the dictionary keys in a batch.

Duplicates keys in the batch if they are in the key_map. Creates a new dictionary whose keys are duplicated but uses references to the values of the old dict. This avoids copying the data and at the same time does not modify this function’s input.

# dummy example
processor = DuplicateKeysSampleProcessor(key_map={"original_key": "duplicated_key"})

input_sample = {
    "original_key": tensor_data,
}

output_sample = processor(input_sample)
# output_sample['original_key'] will be tensor_data
# output_sample['duplicated_key'] will also be tensor_data
Parameters:

key_map (dict[str, str]) – Dict with source keys as keys and target keys as values. The source keys are duplicated in the samples and the target keys are created. The values of the source keys are used for the target keys.

key_map
class noether.data.pipeline.sample_processors.MomentNormalizationSampleProcessor(item, mean=None, std=None, logmean=None, logstd=None, logscale=False)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Normalizes a value with its mean and standard deviation (i.e., its moments).

# dummy example
import torch
processor = MomentNormalizationSampleProcessor(
    item="measurement",
    mean=[10.0],
    std=[2.0],
    logscale=False,
)
input_sample = {
    "measurement": torch.tensor([[12.0], [14.0], [8.0]]),
    "other_item": torch.tensor([[1.0], [2.0], [3.0]]),
}
output_sample = processor(input_sample)
# output_sample['measurement'] will be tensor([[1.0], [2.0], [-1.0]])
# output_sample['other_item'] will be unchanged.
Parameters:
  • item (str) – The item (i.e., key in the input sample dictionary) to normalize.

  • mean (collections.abc.Sequence[float] | None) – The mean of the value. Mandatory if logscale=False.

  • std (collections.abc.Sequence[float] | None) – The standard deviation of the value. Mandatory if logscale=False.

  • logmean (collections.abc.Sequence[float] | None) – The mean of the value in logscale. Mandatory if logscale=True.

  • logstd (collections.abc.Sequence[float] | None) – The standard deviation of the value in logscale. Mandatory if logscale=True.

  • logscale (bool) – Whether to convert the value to logscale before normalization.

item
mean_tensor = None
std_tensor = None
logmean_tensor = None
logstd_tensor = None
logscale = False
inverse(key, value)

Inverts the normalization from the __call__ method of a single item in the batch.

Parameters:
  • key (str) – The name of the item.

  • value (torch.Tensor) – The value of the item.

Returns:

The same name and the denormalized value.

Return type:

(key, value)
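Assuming the normalization applied in __call__ is the usual affine transform, the round trip through inverse can be sketched in plain Python. The function names are illustrative; the library may differ in detail, e.g., in how multi-channel features are handled:

```python
import math

# Sketch of moment normalization and its inverse (assumed affine
# semantics): normalize subtracts the mean and divides by the standard
# deviation; the inverse multiplies by std and adds the mean back.

def normalize(x, mean, std):
    return (x - mean) / std

def denormalize(x, mean, std):
    return x * std + mean

mean, std = 10.0, 2.0
raw = [12.0, 14.0, 8.0]
normed = [normalize(v, mean, std) for v in raw]
restored = [denormalize(v, mean, std) for v in normed]

# logscale variant: the value is normalized in log space, so the inverse
# applies the affine inverse first and then exponentiates.
def log_denormalize(x, logmean, logstd):
    return math.exp(x * logstd + logmean)
```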

class noether.data.pipeline.sample_processors.PointSamplingSampleProcessor(items, num_points, seed=None)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Randomly subsamples points from a tensor.

# dummy example
import torch
processor = PointSamplingSampleProcessor(
    items={"input_position", "output_position"},
    num_points=1024,
    seed=42,
)
input_sample = {
    "input_position": torch.randn(5000, 3),
    "output_position": torch.randn(5000, 3),
    "input_features": torch.randn(5000, 6),
}
output_sample = processor(input_sample)
# output_sample['input_position'] will be a tensor of shape (1024, 3)
# output_sample['output_position'] will be a tensor of shape (1024, 3)
# output_sample['input_features'] will be unchanged.
# If input_features is also added to items, it will be of shape (1024, 6)
Parameters:
  • items (set[str]) – Which pointcloud items should be subsampled (e.g., input_position, output_position, …). If multiple items are present, the subsampling will use identical indices for all items (e.g., to downsample output_position and output_pressure with the same subsampling).

  • num_points (int) – Number of points to sample.

  • seed (int | None) – Random seed for deterministic sampling for evaluation. Default None (i.e., no seed). If not None, requires sample index to be present in batch.

items
num_points
seed = None
class noether.data.pipeline.sample_processors.PositionNormalizationSampleProcessor(items, raw_pos_min, raw_pos_max, scale=1000)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Pre-processes data on a sample-level to normalize positions.

Should only be used when multiple items should be normalized with the same normalization. If only one item should be normalized, consider using the preprocessor PositionNormalizer instead.

Parameters:
  • items (set[str]) – The position items to normalize. I.e., keys of the input_sample dictionary that should be normalized.

  • raw_pos_min (collections.abc.Sequence[float]) – The minimum position in the source domain.

  • raw_pos_max (collections.abc.Sequence[float]) – The maximum position in the source domain.

  • scale (int | float) – The maximum value of the normalized positions. Defaults to 1000.

items
scale = 1000
raw_pos_min_tensor
raw_pos_max_tensor
raw_size
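Since this class has no usage example above, here is a pure-Python sketch under the assumption that positions are min-max mapped per axis from [raw_pos_min, raw_pos_max] onto [0, scale]; the exact target range used by the library may differ:

```python
# Sketch of per-axis position normalization (assumed min-max mapping;
# illustrative only, not the library's implementation). Each coordinate d
# is mapped from [raw_pos_min[d], raw_pos_max[d]] onto [0, scale].

def normalize_positions(points, raw_pos_min, raw_pos_max, scale=1000):
    return [
        [
            (p[d] - raw_pos_min[d]) / (raw_pos_max[d] - raw_pos_min[d]) * scale
            for d in range(len(p))
        ]
        for p in points
    ]

points = [[0.0, -1.0, 2.0], [1.0, 1.0, 4.0]]
out = normalize_positions(
    points, raw_pos_min=[0.0, -1.0, 2.0], raw_pos_max=[1.0, 1.0, 4.0]
)
# the domain corners map to 0 and scale on every axis
```

Applying the same bounds to every item in items is what keeps multiple pointclouds (e.g., input and output positions) in a shared coordinate frame.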
class noether.data.pipeline.sample_processors.RenameKeysSampleProcessor(key_map)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Sample processor that simply renames the dictionary keys in a batch.

Rename keys in the batch if they are in the key_map and keep old keys otherwise. Creates a new dictionary whose keys are renamed but uses references to the values of the old dict. This avoids copying the data and at the same time does not modify this function’s input.

# dummy example
processor = RenameKeysSampleProcessor(key_map={"old_key1": "new_key1", "old_key2": "new_key2"})
input_sample = {
    "old_key1": some_tensor1,
    "old_key2": some_tensor2,
    "unchanged_key": some_tensor3,
}

output_sample = processor(input_sample)
# output_sample will be: {
#     'new_key1': some_tensor1,
#     'new_key2': some_tensor2,
#     'unchanged_key': some_tensor3,
# }
Parameters:

key_map (dict[str, str]) – Dict with source keys as keys and target keys as values. The source keys are renamed to the target keys.

key_map
class noether.data.pipeline.sample_processors.ReplaceKeySampleProcessor(source_key, target_keys)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Sample processor that replaces a key with one or multiple other keys.

Replaces a key in the batch with one or multiple other keys. Creates a new dictionary whose keys are duplicated but uses references to the values of the old dict. This avoids copying the data and at the same time does not modify this function’s input.

# dummy example
processor = ReplaceKeySampleProcessor(source_key="source", target_keys={"target1", "target2"})
input_sample = {
    "source": some_tensor,
    "unchanged_key": some_other_tensor,
}
output_sample = processor(input_sample)
# output_sample will be: {
#     'target1': some_tensor,
#     'target2': some_tensor,
#     'unchanged_key': some_other_tensor,
# }
Parameters:
  • source_key (str) – Key in the input_sample to be replaced.

  • target_keys (set[str]) – Set of keys that replace source_key; each target key receives the value of source_key.

source_key
target_keys
class noether.data.pipeline.sample_processors.SupernodeSamplingSampleProcessor(item, num_supernodes, supernode_idx_key='supernode_idx', items_at_supernodes=None, seed=None)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Randomly samples supernodes from a pointcloud.

Parameters:
  • item (str) – Which key in the input_sample (i.e., pointcloud item) is used to sample supernodes.

  • num_supernodes (int) – How many supernodes to sample.

  • items_at_supernodes (set[str] | None) – Selects items at the supernodes (e.g., pressure at supernodes). Defaults to None. These items are sampled accordingly and added to the output supernodes.

  • seed (int | None) – Random seed for deterministic sampling for evaluation. Default None (i.e., no seed). If not None, requires sample index to be present in batch.

  • supernode_idx_key (str)

item
num_supernodes
supernode_idx_key = 'supernode_idx'
items_at_supernodes = None
seed = None
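A pure-Python sketch of the assumed sampling semantics: supernode indices are drawn without replacement from the points of item, and any items_at_supernodes are gathered at the same indices. The supernode_idx key matches the documented default, while the *_at_supernodes output naming below is purely illustrative:

```python
import random

# Sketch of supernode sampling (illustrative, not the library's
# implementation): draw num_supernodes point indices without replacement
# and gather co-indexed items at those indices.

def sample_supernodes(sample, item, num_supernodes,
                      items_at_supernodes=(), supernode_idx_key="supernode_idx",
                      seed=None):
    rng = random.Random(seed)
    idx = rng.sample(range(len(sample[item])), num_supernodes)
    out = dict(sample)  # shallow copy: the input sample is not modified
    out[supernode_idx_key] = idx
    for key in items_at_supernodes:
        # hypothetical output naming for items selected at the supernodes
        out[key + "_at_supernodes"] = [sample[key][i] for i in idx]
    return out

sample = {
    "input_position": [[float(i), 0.0, 0.0] for i in range(100)],
    "pressure": [[float(i)] for i in range(100)],
}
out = sample_supernodes(sample, "input_position", num_supernodes=8,
                        items_at_supernodes={"pressure"}, seed=0)
```

Passing a fixed seed, as in the documented seed parameter, makes the drawn indices reproducible across evaluation runs.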