evedata.evefile.controllers.preprocessing module

Changing (preprocessing) data during import.

Data as read from an HDF5 dataset often need to be processed in some way. Because the evedata package loads data only on demand, this cannot simply be done during mapping, but needs to be hooked into the DataImporter class. Hence, all the preprocessing steps implemented here inherit from ImporterPreprocessingStep.

Note that in this module, only the (more) generic preprocessing steps are implemented. More specific preprocessing steps may be implemented in other modules of the controllers subpackage as well. One example is the mpskip module.

Overview

The following preprocessing steps have been implemented so far:

  • SelectPositions

    Extract rows of data corresponding to a list of positions.

Module documentation

class evedata.evefile.controllers.preprocessing.SelectPositions

Bases: ImporterPreprocessingStep

Extract rows of data corresponding to a list of positions.

When splitting datasets, only part of the data contained in an HDF5 dataset needs to be retained. Typically, extracting the correct rows relies on a list of known positions.

This preprocessing step returns those rows whose values in the first column match the values provided in position_counts.

position_counts

Position counts of the dataset to be selected.

These position counts are interpreted as values in the first column of the corresponding HDF5 dataset. Typically, this is the “position count”.

Type:

list | numpy.ndarray

Examples

Selecting positions from a given dataset requires a list or array of positions, and of course the corresponding data:

task = SelectPositions()
# Values to match against the first column of the dataset
task.position_counts = [2, 4, 5]
result = task.process(data)

The selected data are returned by process(), as shown above.
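
To illustrate what "rows whose values in the first column match the given positions" means, here is a minimal sketch using plain NumPy. It is not the implementation used by SelectPositions. The helper select_rows(), the field names PosCounter and value, and the assumption that the data arrive as a structured array are all hypothetical and chosen for illustration only:

```python
import numpy as np


def select_rows(data, position_counts):
    """Return the rows whose first-field value occurs in position_counts.

    Hypothetical helper, assuming ``data`` is a NumPy structured array.
    """
    first_field = data.dtype.names[0]
    # Boolean mask: True for every row whose first field is a wanted position
    mask = np.isin(data[first_field], position_counts)
    return data[mask]


# Made-up example data: five rows with a position count and a value
data = np.array(
    [(1, 0.1), (2, 0.2), (3, 0.3), (4, 0.4), (5, 0.5)],
    dtype=[("PosCounter", "<i4"), ("value", "<f8")],
)

selected = select_rows(data, [2, 4, 5])
print(selected["PosCounter"])  # position counts of the retained rows
```

Rows are kept in their original order, and positions that do not occur in the first column are silently ignored in this sketch.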

process(data=None)

Perform the preprocessing step on the data.

The actual task is implemented in the _process() method.

Parameters:

data (any) –

Data loaded from the source.

The actual type of data depends on the source and importer type.

Returns:

data – Processed data.

The actual type of data depends on the source and importer type.

Return type:

any
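
The split between the public process() method and the non-public _process() method described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual evedata code: PreprocessingStepSketch is a hypothetical stand-in for ImporterPreprocessingStep, and DropFirstRow is an invented step that exists only for this example:

```python
class PreprocessingStepSketch:
    """Hypothetical stand-in for ImporterPreprocessingStep."""

    def process(self, data=None):
        # Public entry point: delegates the actual work to _process()
        return self._process(data)

    def _process(self, data=None):
        # Default behaviour in this sketch: return the data unchanged
        return data


class DropFirstRow(PreprocessingStepSketch):
    """Invented step that discards the first row of the data."""

    def _process(self, data=None):
        return data[1:]


step = DropFirstRow()
result = step.process([10, 20, 30])
print(result)
```

With this layout, a new step only needs to override _process(), while callers keep using process() for every step.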