noether.data.pipeline.sample_processors.drop_outliers

Classes

DropOutliersSampleProcessor

Drops all outliers from a key in the input sample.

Module Contents

class noether.data.pipeline.sample_processors.drop_outliers.DropOutliersSampleProcessor(item, affected_items=None, min_value=None, max_value=None, min_quantile=None, max_quantile=None)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Drops all outliers from a key in the input sample.

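# illustrative example
import torch

from noether.data.pipeline.sample_processors.drop_outliers import DropOutliersSampleProcessor
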
processor = DropOutliersSampleProcessor(
    item="measurement",
    affected_items={"related_measurement1", "related_measurement2"},
    min_value=0.0,
    max_value=100.0,
)

input_sample = {
    "measurement": torch.tensor([[10.0], [200.0], [-5.0], [50.0]]),
    "related_measurement1": torch.tensor([[1.0], [2.0], [3.0], [4.0]]),
    "related_measurement2": torch.tensor([[5.0], [6.0], [7.0], [8.0]]),
}
output_sample = processor(input_sample)
# output_sample['measurement'] will be tensor([[10.0], [50.0]])
# output_sample['related_measurement1'] will be tensor([[1.0], [4.0]])
# output_sample['related_measurement2'] will be tensor([[5.0], [8.0]])

Parameters:
  • item (str) – The item to drop outliers from.

  • affected_items (set[str] | None) – Set of item keys that are also affected by the outlier removal: entries dropped from item are dropped from these items as well. Defaults to None.

  • min_value (float | None) – Drop outliers below min_value. Defaults to None.

  • max_value (float | None) – Drop outliers above max_value. Defaults to None.

  • min_quantile (float | None) – Drop outliers in/below min_quantile. Defaults to None.

  • max_quantile (float | None) – Drop outliers in/above max_quantile. Defaults to None.
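
The way the value and quantile bounds combine into a single keep/drop decision can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper names `quantile` and `drop_outliers` are hypothetical, lists stand in for torch tensors so the sketch has no dependencies, the quantile uses linear interpolation, and a value exactly equal to a bound is assumed to be kept (the real processor may treat boundaries differently).

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def drop_outliers(sample, item, affected_items=None, min_value=None,
                  max_value=None, min_quantile=None, max_quantile=None):
    """Hypothetical stand-in for the processor, operating on lists."""
    values = sample[item]

    # Collect every lower and upper bound that was actually configured.
    lows, highs = [], []
    if min_value is not None:
        lows.append(min_value)
    if min_quantile is not None:
        lows.append(quantile(values, min_quantile))
    if max_value is not None:
        highs.append(max_value)
    if max_quantile is not None:
        highs.append(quantile(values, max_quantile))

    # Assumption: a value equal to a bound is kept, not dropped.
    keep = [
        all(v >= lo for lo in lows) and all(v <= hi for hi in highs)
        for v in values
    ]

    # Apply the same mask to the item and to every affected item;
    # all other keys are passed through unchanged.
    result = dict(sample)
    for key in {item} | set(affected_items or ()):
        result[key] = [x for x, k in zip(sample[key], keep) if k]
    return result


sample = {
    "measurement": [10.0, 200.0, -5.0, 50.0],
    "related_measurement1": [1.0, 2.0, 3.0, 4.0],
    "untouched": [9.0, 9.0, 9.0, 9.0],
}

by_value = drop_outliers(
    sample, "measurement", {"related_measurement1"},
    min_value=0.0, max_value=100.0,
)
# by_value["measurement"] == [10.0, 50.0]
# by_value["related_measurement1"] == [1.0, 4.0]

by_quantile = drop_outliers(
    sample, "measurement", {"related_measurement1"}, max_quantile=0.5,
)
```

The point of the sketch is that a single mask is computed from item and then reused for every key in affected_items, which keeps the filtered items aligned with each other.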

item
affected_items = None
min_value = None
max_value = None
min_quantile = None
max_quantile = None