noether.data.pipeline.sample_processors.duplicate_keys

Classes

DuplicateKeysSampleProcessor

Utility processor that simply duplicates the dictionary keys in a batch.

Module Contents

class noether.data.pipeline.sample_processors.duplicate_keys.DuplicateKeysSampleProcessor(key_map)

Bases: noether.data.pipeline.sample_processor.SampleProcessor

Utility processor that simply duplicates the dictionary keys in a batch.

Duplicates the keys of the input dictionary that appear as source keys in key_map. Returns a new dictionary that contains both the original and the duplicated keys, where every value is a reference to the corresponding value of the input dictionary. This avoids copying the data and leaves the input dictionary unmodified.

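# Example
from noether.data.pipeline.sample_processors.duplicate_keys import DuplicateKeysSampleProcessor

# Placeholder value for illustration; in a real pipeline this would be a tensor.
tensor_data = [1.0, 2.0, 3.0]

processor = DuplicateKeysSampleProcessor(key_map={"original_key": "duplicated_key"})

input_sample = {
    "original_key": tensor_data,
}

output_sample = processor(input_sample)
# output_sample["original_key"] is tensor_data
# output_sample["duplicated_key"] is the same tensor_data object (no copy)

Parameters: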

key_map (dict[str, str]) – Mapping from source keys to target keys. For each pair, the target key is added to the sample and refers to the same value as the source key.
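
The reference semantics described above can be illustrated with a minimal sketch in plain Python. The function `duplicate_keys` is a hypothetical stand-in written for this illustration, not the library's implementation. It also assumes that source keys missing from the input are skipped, which the documentation does not state.

```python
from typing import Any


def duplicate_keys(sample: dict[str, Any], key_map: dict[str, str]) -> dict[str, Any]:
    """Hypothetical stand-in for the processor call, not the library's code."""
    # Shallow copy: a new dict object that holds the same value objects.
    result = dict(sample)
    for source_key, target_key in key_map.items():
        # Assumption: source keys that are absent from the sample are skipped.
        if source_key in sample:
            # Bind the target key to the very same object as the source key.
            result[target_key] = sample[source_key]
    return result


data = [1.0, 2.0, 3.0]  # stand-in for tensor data
sample = {"original_key": data}
output = duplicate_keys(sample, {"original_key": "duplicated_key"})
```

Because both keys point to the same object, an in-place modification made through one key is visible through the other. If the duplicated entry is later transformed in place, an explicit copy of the value may be needed first.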

key_map