noether.data.pipeline.sample_processors¶
Submodules¶
- noether.data.pipeline.sample_processors.concat_tensor
- noether.data.pipeline.sample_processors.default_tensor
- noether.data.pipeline.sample_processors.drop_outliers
- noether.data.pipeline.sample_processors.duplicate_keys
- noether.data.pipeline.sample_processors.moment_normalization
- noether.data.pipeline.sample_processors.point_sampling
- noether.data.pipeline.sample_processors.position_normalization
- noether.data.pipeline.sample_processors.rename_keys
- noether.data.pipeline.sample_processors.replace_key
- noether.data.pipeline.sample_processors.supernode_sampling
Classes¶
- ConcatTensorSampleProcessor – Concatenates multiple tensors into a single tensor.
- DefaultTensorSampleProcessor – Create a tensor with a fixed dummy value, with a specified size.
- DropOutliersSampleProcessor – Drops all outliers from a key in the input sample.
- DuplicateKeysSampleProcessor – Utility processor that simply duplicates the dictionary keys in a batch.
- MomentNormalizationSampleProcessor – Normalizes a value with its mean and standard deviation (i.e., its moments).
- PointSamplingSampleProcessor – Randomly subsamples points from a tensor.
- PositionNormalizationSampleProcessor – Pre-processes data on a sample-level to normalize positions.
- RenameKeysSampleProcessor – Sample processor that simply renames the dictionary keys in a batch.
- ReplaceKeySampleProcessor – Sample processor that replaces the key with multiple other keys.
- SupernodeSamplingSampleProcessor – Randomly samples supernodes from a pointcloud.
Package Contents¶
- class noether.data.pipeline.sample_processors.ConcatTensorSampleProcessor(items, target_key, dim=0)¶
Bases: noether.data.pipeline.SampleProcessor

Concatenates multiple tensors into a single tensor.

```python
# dummy example
processor = ConcatTensorSampleProcessor(items=["image_part1", "image_part2"], target_key="full_image", dim=0)
input_sample = {
    "image_part1": torch.randn(3, 224, 224),
    "image_part2": torch.randn(3, 224, 224),
}
output_sample = processor(input_sample)
# output_sample['full_image'] will be a tensor of shape (6, 224, 224)
```
- Parameters:
- items¶
- target_key¶
- dim = 0¶
- class noether.data.pipeline.sample_processors.DefaultTensorSampleProcessor(item_key_name, feature_dim, size=None, matching_item_key=None, default_value=0.0)¶
Bases: noether.data.pipeline.SampleProcessor

Create a tensor with a fixed dummy value, with a specified size.

```python
# dummy example
processor = DefaultTensorSampleProcessor(
    item_key_name="default_tensor",
    feature_dim=128,
    size=10,
    default_value=0.5,
)
input_sample = {}
output_sample = processor(input_sample)
# output_sample['default_tensor'] will be a tensor of shape (10, 128) filled with 0.5
```
- Parameters:
item_key_name (str) – key of the created default tensor in the output sample dict.
default_value (float) – value to fill the created default tensor with.
feature_dim (int) – size of the feature dimension of the created default tensor.
size (int | None) – size of the first dimension of the created default tensor.
matching_item_key (str | None) – key of an existing tensor in the input sample dict to match the size of the first dimension.
- item_key_name¶
- feature_dim¶
- size = None¶
- matching_item_key = None¶
- default_value = 0.0¶
- class noether.data.pipeline.sample_processors.DropOutliersSampleProcessor(item, affected_items=None, min_value=None, max_value=None, min_quantile=None, max_quantile=None)¶
Bases: noether.data.pipeline.sample_processor.SampleProcessor

Drops all outliers from a key in the input sample.

```python
# dummy example
processor = DropOutliersSampleProcessor(
    item="measurement",
    affected_items={"related_measurement1", "related_measurement2"},
    min_value=0.0,
    max_value=100.0,
)
input_sample = {
    "measurement": torch.tensor([[10.0], [200.0], [-5.0], [50.0]]),
    "related_measurement1": torch.tensor([[1.0], [2.0], [3.0], [4.0]]),
    "related_measurement2": torch.tensor([[5.0], [6.0], [7.0], [8.0]]),
}
output_sample = processor(input_sample)
# output_sample['measurement'] will be tensor([[10.0], [50.0]])
# output_sample['related_measurement1'] will be tensor([[1.0], [4.0]])
# output_sample['related_measurement2'] will be tensor([[5.0], [8.0]])
```
- Parameters:
item (str) – The item to drop outliers from.
affected_items (set[str] | None) – Set of additional item keys that are also affected by the outlier removal. Defaults to None.
min_value (float | None) – Drop outliers below min_value. Defaults to None.
max_value (float | None) – Drop outliers above max_value. Defaults to None.
min_quantile (float | None) – Drop outliers in/below min_quantile. Defaults to None.
max_quantile (float | None) – Drop outliers in/above max_quantile. Defaults to None.
- item¶
- affected_items = None¶
- min_value = None¶
- max_value = None¶
- min_quantile = None¶
- max_quantile = None¶
- class noether.data.pipeline.sample_processors.DuplicateKeysSampleProcessor(key_map)¶
Bases: noether.data.pipeline.sample_processor.SampleProcessor

Utility processor that simply duplicates the dictionary keys in a batch.
Duplicates keys in the batch if they are in the key_map. Creates a new dictionary whose keys are duplicated but uses references to the values of the old dict. This avoids copying the data and at the same time does not modify this function’s input.
```python
# dummy example
processor = DuplicateKeysSampleProcessor(key_map={"original_key": "duplicated_key"})
input_sample = {
    "original_key": tensor_data,
}
output_sample = processor(input_sample)
# output_sample['original_key'] will be tensor_data
# output_sample['duplicated_key'] will also be tensor_data
```
- Parameters:
key_map (dict[str, str]) – Dict with source keys as keys and target keys as values. The source keys are duplicated in the samples and the target keys are created. The values of the source keys are used for the target keys.
- key_map¶
- class noether.data.pipeline.sample_processors.MomentNormalizationSampleProcessor(item, mean=None, std=None, logmean=None, logstd=None, logscale=False)¶
Bases: noether.data.pipeline.sample_processor.SampleProcessor

Normalizes a value with its mean and standard deviation (i.e., its moments).

```python
# dummy example
processor = MomentNormalizationSampleProcessor(
    item="measurement",
    mean=[10.0],
    std=[2.0],
    logscale=False,
)
input_sample = {
    "measurement": torch.tensor([[12.0], [14.0], [8.0]]),
    "other_item": torch.tensor([[1.0], [2.0], [3.0]]),
}
output_sample = processor(input_sample)
# output_sample['measurement'] will be tensor([[1.0], [2.0], [-1.0]])
# output_sample['other_item'] will be unchanged.
```
- Parameters:
item (str) – The item (i.e., key in the input sample dictionary) to normalize.
mean (collections.abc.Sequence[float] | None) – The mean of the value. Mandatory if logscale=False.
std (collections.abc.Sequence[float] | None) – The standard deviation of the value. Mandatory if logscale=False.
logmean (collections.abc.Sequence[float] | None) – The mean of the value in logscale. Mandatory if logscale=True.
logstd (collections.abc.Sequence[float] | None) – The standard deviation of the value in logscale. Mandatory if logscale=True.
logscale (bool) – Whether to convert the value to logscale before normalization.
- item¶
- mean_tensor = None¶
- std_tensor = None¶
- logmean_tensor = None¶
- logstd_tensor = None¶
- logscale = False¶
- inverse(key, value)¶
Inverts the normalization from the __call__ method of a single item in the batch.
- Parameters:
key (str) – The name of the item.
value (torch.Tensor) – The value of the item.
- Returns:
The same name and the denormalized value.
- Return type:
(key, value)
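The round trip between __call__ and inverse can be sketched in plain Python. This is an illustration of the underlying affine math only; the actual processor operates on torch tensors and additionally handles the logscale case, and normalize/denormalize are illustrative helper names, not library functions:

```python
# Sketch of moment normalization and its inverse (illustrative, not library code):
#   normalize:   y = (x - mean) / std
#   denormalize: x = y * std + mean
def normalize(x: float, mean: float, std: float) -> float:
    """Shift by the mean, then scale by the standard deviation."""
    return (x - mean) / std

def denormalize(y: float, mean: float, std: float) -> float:
    """Undo normalize(), as inverse() does conceptually for a single item."""
    return y * std + mean

# matches the values from the class docstring example (mean=10.0, std=2.0)
values = [12.0, 14.0, 8.0]
normalized = [normalize(v, 10.0, 2.0) for v in values]      # [1.0, 2.0, -1.0]
restored = [denormalize(v, 10.0, 2.0) for v in normalized]  # back to values
```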
- class noether.data.pipeline.sample_processors.PointSamplingSampleProcessor(items, num_points, seed=None)¶
Bases: noether.data.pipeline.sample_processor.SampleProcessor

Randomly subsamples points from a tensor.

```python
# dummy example
processor = PointSamplingSampleProcessor(
    items={"input_position", "output_position"},
    num_points=1024,
    seed=42,
)
input_sample = {
    "input_position": torch.randn(5000, 3),
    "output_position": torch.randn(5000, 3),
    "input_features": torch.randn(5000, 6),
}
output_sample = processor(input_sample)
# output_sample['input_position'] will be a tensor of shape (1024, 3)
# output_sample['output_position'] will be a tensor of shape (1024, 3)
# output_sample['input_features'] will be unchanged.
# If input_features is also added to items, it will be of shape (1024, 6)
```
- Parameters:
items (set[str]) – Which pointcloud items should be subsampled (e.g., input_position, output_position, …). If multiple items are present, the subsampling will use identical indices for all items (e.g., to downsample output_position and output_pressure with the same subsampling).
num_points (int) – Number of points to sample.
seed (int | None) – Random seed for deterministic sampling for evaluation. Default None (i.e., no seed). If not None, requires sample index to be present in batch.
- items¶
- num_points¶
- seed = None¶
- class noether.data.pipeline.sample_processors.PositionNormalizationSampleProcessor(items, raw_pos_min, raw_pos_max, scale=1000)¶
Bases: noether.data.pipeline.sample_processor.SampleProcessor

Pre-processes data on a sample-level to normalize positions.

Should only be used when multiple items should be normalized with the same normalization. If only one item should be normalized, consider using the preprocessor PositionNormalizer instead.

- Parameters:
items (set[str]) – The position items to normalize. I.e., keys of the input_sample dictionary that should be normalized.
raw_pos_min (collections.abc.Sequence[float]) – The minimum position in the source domain.
raw_pos_max (collections.abc.Sequence[float]) – The maximum position in the source domain.
scale (int | float) – The maximum value of the position. Defaults to 1000.
- items¶
- scale = 1000¶
- raw_pos_min_tensor¶
- raw_pos_max_tensor¶
- raw_size¶
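No usage example is given for this class, so here is a rough plain-Python sketch of what min-max position normalization could look like, assuming the processor maps each coordinate from [raw_pos_min, raw_pos_max] to [0, scale]; the exact target range and the tensor handling of the real implementation are not shown here, and normalize_position is an illustrative helper, not part of the library:

```python
# Hypothetical sketch: min-max normalization of one position into [0, scale].
def normalize_position(pos, raw_pos_min, raw_pos_max, scale=1000):
    """Map each coordinate from [raw_pos_min, raw_pos_max] to [0, scale]."""
    return [
        (p - lo) / (hi - lo) * scale
        for p, lo, hi in zip(pos, raw_pos_min, raw_pos_max)
    ]

# a point in a [-1, 1]^3 source domain, with the default scale of 1000
print(normalize_position([0.0, 0.0, 1.0], [-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]))
# [500.0, 500.0, 1000.0]
```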
- class noether.data.pipeline.sample_processors.RenameKeysSampleProcessor(key_map)¶
Bases: noether.data.pipeline.sample_processor.SampleProcessor

Sample processor that simply renames the dictionary keys in a batch.
Rename keys in the batch if they are in the key_map and keep old keys otherwise. Creates a new dictionary whose keys are renamed but uses references to the values of the old dict. This avoids copying the data and at the same time does not modify this function’s input.
```python
# dummy example
processor = RenameKeysSampleProcessor(key_map={"old_key1": "new_key1", "old_key2": "new_key2"})
input_sample = {
    "old_key1": some_tensor1,
    "old_key2": some_tensor2,
    "unchanged_key": some_tensor3,
}
output_sample = processor(input_sample)
# output_sample will be: {
#     'new_key1': some_tensor1,
#     'new_key2': some_tensor2,
#     'unchanged_key': some_tensor3,
# }
```
- Parameters:
key_map (dict[str, str]) – Dict with source keys as keys and target keys as values. The source keys are renamed to the target keys.
- key_map¶
- class noether.data.pipeline.sample_processors.ReplaceKeySampleProcessor(source_key, target_keys)¶
Bases: noether.data.pipeline.sample_processor.SampleProcessor

Sample processor that replaces the key with multiple other keys.
Replaces a key in the batch with one or multiple other keys. Creates a new dictionary whose keys are duplicated but uses references to the values of the old dict. This avoids copying the data and at the same time does not modify this function’s input.
```python
# dummy example
processor = ReplaceKeySampleProcessor(source_key="source", target_keys={"target1", "target2"})
input_sample = {
    "source": some_tensor,
    "unchanged_key": some_other_tensor,
}
output_sample = processor(input_sample)
# output_sample will be: {
#     'target1': some_tensor,
#     'target2': some_tensor,
#     'unchanged_key': some_other_tensor,
# }
```
- Parameters:
- source_key¶
- target_keys¶
- class noether.data.pipeline.sample_processors.SupernodeSamplingSampleProcessor(item, num_supernodes, supernode_idx_key='supernode_idx', items_at_supernodes=None, seed=None)¶
Bases: noether.data.pipeline.sample_processor.SampleProcessor

Randomly samples supernodes from a pointcloud.
- Parameters:
item (str) – Which key in the input_sample (i.e., pointcloud item) is used to sample supernodes.
num_supernodes (int) – How many supernodes to sample.
items_at_supernodes (set[str] | None) – Selects items at the supernodes (e.g., pressure at supernodes). Defaults to None. These items are sampled accordingly and added to the output supernodes.
seed (int | None) – Random seed for deterministic sampling for evaluation. Default None (i.e., no seed). If not None, requires sample index to be present in batch.
supernode_idx_key (str) – Key under which the sampled supernode indices are stored in the output sample. Defaults to 'supernode_idx'.
- item¶
- num_supernodes¶
- supernode_idx_key = 'supernode_idx'¶
- items_at_supernodes = None¶
- seed = None¶
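The described behavior (drawing supernode indices without replacement and gathering per-point items at those indices) can be sketched in plain Python. This is a hypothetical illustration, not the library implementation: sample_supernodes, the list-based data, and the "_at_supernodes" key suffix are all assumptions made for the example.

```python
import random

# Hypothetical sketch of supernode sampling over a pointcloud stored as lists.
def sample_supernodes(sample, item, num_supernodes, supernode_idx_key="supernode_idx",
                      items_at_supernodes=None, seed=None):
    rng = random.Random(seed)
    num_points = len(sample[item])
    idx = rng.sample(range(num_points), num_supernodes)  # unique indices, no replacement
    out = dict(sample)  # new dict, but values stay references to the old data
    out[supernode_idx_key] = idx
    # gather the selected items at the sampled supernode positions
    for key in (items_at_supernodes or set()):
        out[key + "_at_supernodes"] = [sample[key][i] for i in idx]
    return out

sample = {
    "position": [[float(i)] * 3 for i in range(100)],
    "pressure": [float(i) for i in range(100)],
}
out = sample_supernodes(sample, "position", 16, items_at_supernodes={"pressure"}, seed=0)
# out["supernode_idx"] holds 16 unique point indices;
# out["pressure_at_supernodes"] holds the 16 matching pressure values.
```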