evedata.evefile.controllers.version_mapping module

Mapping eveH5 contents to the data structures of the evedata package.

There are different versions of the schema underlying the eveH5 files. Hence, mapping the contents of an eveH5 file to the data model of the evedata package requires getting the correct mapper for the specific version. This is the typical use case for the factory pattern.

Users of the module will hence typically only obtain a VersionMapperFactory object to get the correct mappers for individual files. Furthermore, “users” basically boils down to the EveFile class. Therefore, users of the evedata package usually do not interact directly with any of the classes provided by this module.

Overview

Being version agnostic with respect to eveH5 and SCML schema versions is a central aspect of the evedata package. This requires facilities that map the actual eveH5 files to the data model provided by the entities technical layer of the evefile subpackage. The File facade obtains the correct VersionMapper object via the VersionMapperFactory, providing an HDF5File resource object to the factory. It is the duty of the factory to obtain the “version” attribute from the HDF5File object (explicitly, by reading the attributes of the root group of the HDF5File object).

../../../_images/evedata.evefile.controllers.version_mapping.svg

Fig. 32 Class hierarchy of the evedata.evefile.controllers.version_mapping module, providing the functionality to map different eveH5 file schemas to the data structure provided by the EveFile class. The factory will be used to get the correct mapper for a given eveH5 file. For each eveH5 schema version, there exists an individual VersionMapperVx class dealing with the version-specific mapping. The idea behind the Mapping class is to provide simple mappings for attributes and the like that need not be hard-coded and can be stored externally, e.g. in YAML files. This would make it easier to account for (simple) changes.

For each eveH5 schema version, there exists an individual VersionMapperVx class dealing with the version-specific mapping. The part of the mapping common to all versions of the eveH5 schema takes place in the VersionMapper parent class, e.g. removing the chain. The idea behind the Mapping class is to provide simple mappings for attributes and the like that can be stored externally, e.g. in YAML files. This would make it easier to account for (simple) changes.
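The version-based dispatch described above can be sketched as follows. The class stubs and the lookup table are hypothetical and purely illustrative; only the dispatch idea mirrors the module. The sketch assumes the eveH5 “version” attribute is a string such as "5.0" or "7":

```python
# Illustrative sketch of the version-based factory dispatch.
# Class stubs and lookup table are hypothetical.

class VersionMapper:
    """Stub for the common base class."""

class VersionMapperV5(VersionMapper):
    """Stub for the v5 mapper."""

class VersionMapperV6(VersionMapperV5):
    """Stub for the v6 mapper."""

class VersionMapperV7(VersionMapperV6):
    """Stub for the v7 mapper."""

_MAPPERS = {5: VersionMapperV5, 6: VersionMapperV6, 7: VersionMapperV7}

def get_mapper(version_attribute):
    """Return a mapper instance matching the major eveH5 schema version."""
    major = int(str(version_attribute).split(".")[0])
    try:
        return _MAPPERS[major]()
    except KeyError:
        raise ValueError(f"No mapper for eveH5 version {version_attribute!r}")
```

Keeping the version-to-class lookup in a plain mapping makes adding a mapper for a future schema version a one-line change.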

Mapping tasks for eveH5 schema up to v7

Given the quite different overall philosophy of the current eveH5 file schema (up to version v7) and the data model provided by the evedata package, there are many tasks for the mappers to carry out.

What follows is a summary of the different aspects, for the time being not broken down by the individual schema versions (up to v7):

  • Map attributes of / and /c1 to the file metadata. ✓

  • Convert monitor datasets from the device group to MonitorData objects. ✓

    • We probably need to create subclasses for the different monitor datasets, at least distinguishing between numeric and non-numeric values.

  • Map /c1/meta/PosCountTimer to TimestampData object. ✓

  • Starting with eveH5 v5: Map /LiveComment to LogMessage objects. ✓

  • Filter all datasets from the main section, with different goals:

    • Map array data to ArrayChannelData objects (HDF5 groups having an attribute DeviceType set to Channel). ✓

      • Distinguish between MCA and scope data (at least). ✗

      • Map additional datasets in main section (and snapshot). ✓

    • Handle MPSKIP channel(s) if present. ✓

    • Map all axis datasets to AxisData objects. ✓

      • How to distinguish between axes with and without encoders? ✗

      • Read channels with RBV and replace axis values with RBV. ✗

        • Most probably, the corresponding channel has the same name (not XML-ID, though!) as the axis, but with suffix _RBV, and can thus be identified.

        • In case of axes with encoders, there may be additional datasets present, e.g., those with suffix _Enc.

        • In this case, instead of NonencodedAxisData, an AxisData object needs to be created. (Currently, only AxisData objects are created, which is a mistake as well…)

      • How to deal with pseudo-axes used as options in channel datasets? Do we need to deal with axes later? ✗

    • Distinguish between single point and area data, and map area data to AreaChannelData objects. (✓)

      • Distinguish between scientific and sample cameras. ✓

      • Which dataset is the “main” dataset for scientific cameras? ✗

        • Starting with eve v1.39, it is TIFF1:chan1; before that, this is less clear, and there might not exist a dataset containing filenames with full paths, but only numbers.

      • Map sample camera datasets. ✓

    • Figure out which single point data have been redefined between scan modules, and split data accordingly. Map the data to SinglePointChannelData, AverageChannelData, and IntervalChannelData, respectively. ✗

      Hint: Getting the shape of an HDF5 dataset is a cheap operation and does not require reading the actual data, as the information is contained in the metadata of the HDF5 dataset. This should allow for additional checking whether a dataset has been redefined.

      If the number of (the sum of) positions differ, the channel has been redefined. However, the average or interval settings may have changed between scan modules as well, and this can only be figured out by actually reading the data. How to handle this situation? Split datasets only upon reading the data, if necessary?

      Take care of normalized channel data and treat them accordingly.

    • Map the additional data for average and interval channel data provided in the respective HDF5 groups to AverageChannelData and IntervalChannelData objects, respectively. ✓

    • Map normalized channel data (and the data provided in the respective HDF5 groups) to NormalizedChannelData. ✓

    • Map all remaining HDF5 datasets that belong to one of the already mapped data objects (i.e., variable options) to their respective attributes. (Should have been done already)

    • Map all HDF5 datasets remaining (if any) to data objects corresponding to their respective data type. (Could there be any?)

    • Add all data objects to the data attribute of the EveFile object. (Has been done during mapping already.)

  • Filter all datasets from the snapshot section, with different goals:

    • Map all HDF5 datasets that belong to one of the data objects in the data attribute of the EveFile object to their respective attributes.

    • Map all HDF5 datasets remaining (if any) to data objects corresponding to their respective data type.

    • Add all data objects to the snapshots attribute of the EveFile object. ✓

Most probably, not all these tasks can be inferred from the contents of an eveH5 file alone. In this case, additional mapping tables, possibly even on a per-measurement-station level, are necessary.
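The shape-based redefinition check from the hint above can be sketched as follows. The function name and its inputs are hypothetical; the point is that the dataset length comes from the HDF5 metadata, not from reading the data:

```python
def looks_redefined(module_position_counts, dataset_shape):
    """Cheap redefinition check: compare the summed per-module position
    counts of a channel against the dataset length taken from the HDF5
    metadata (the shape), without reading any actual data.
    """
    return sum(module_position_counts) != dataset_shape[0]
```

As noted above, a matching count does not rule out that average or interval settings changed between scan modules; that can only be decided after actually reading the data.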

Other tasks not in the realm of the version mappers, but part of the evedata.evefile.controllers subpackage, are:

  • Separating 0D data that have been redefined within a scan (single point, average, interval) – sure about this one? see above

  • Mapping scans using the EPICS MPSKIP feature to record individual values for actual average detectors to AverageChannelData objects.

Todo

In light of the newly added scan modules layer and the necessary mapping of datasets to scan modules: Where and how to check whether creating position (count)s while reading the SCML worked (consistency check), and where to actually distribute the datasets to the scan modules?

Probably the best way is to first map all datasets from main to dataset objects within the mapper, and only afterwards (deep)copy these dataset objects where necessary and distribute them to the scan modules, adding the preprocessing step selecting position counts to the respective importer(s).

When exactly the MPSKIP scans are dealt with needs to be decided. Definitely, the general mapping of datasets needs to be done first, as only this creates and maps the special SkipData dataset necessary to carry out the tasks of the mpskip module.

Questions to address

  • How were the log messages/live comments saved before v5?

  • How to deal with options that are monitored? Check whether they change for a given channel/axis and if so, expand them (“fill”) for each PosCount of the corresponding channel/axis, and otherwise set as scalar attribute?

  • How to deal with the situation that not all actual data read from eveH5 are numeric? Of course, non-numeric data cannot be plotted. But how to distinguish sensibly?
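For the question about monitored options above, one conceivable “fill” strategy, forward-filling monitor values (recorded only at change events) onto the position counts of the corresponding channel or axis, could look like this. Purely illustrative; the package may answer the question differently:

```python
import bisect

def fill_option(changes, positions):
    """Forward-fill monitor values onto position counts.

    changes: (position, value) pairs, sorted by position, one per
    change event. positions: position counts of the channel or axis.
    """
    change_positions = [position for position, _ in changes]
    values = [value for _, value in changes]
    filled = []
    for position in positions:
        # Index of the last change at or before this position.
        index = bisect.bisect_right(change_positions, position) - 1
        filled.append(values[max(index, 0)])
    return filled
```

If all filled values turn out identical, the option could instead be set as a scalar attribute, as suggested above.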

Notes on mapping MCA datasets

MCA data themselves are stored as a single dataset per spectrum in an HDF5 group, and such a group can be uniquely identified by having attributes, with the attribute DeviceType set to Channel. Furthermore, the PV of a given MCA can be inferred from the Access attribute of the HDF5 group.

Why not use the name of the MCA HDF5 group to obtain the PV? The group name typically has chan1 appended directly to the PV name without separator, whereas the Access attribute reveals the full PV with the .VAL field appended.

As all additional options directly follow the EPICS MCA record, and the dataset names can be mapped to the PVs of the MCA record, a direct mapping of datasets in the main and snapshot sections could be carried out. In this case, it seems unnecessary to explicitly check the PV names of the individual datasets, as the datasets all have the PV attributes as their last part. Note that there is a different and variable number of ROI channels and corresponding datasets available (up to 32 according to the EPICS MCA record, but probably fewer than 10 at PTB).
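Deriving the base PV from the Access attribute thus amounts to stripping the trailing record field; a minimal sketch (the PV name in the test is made up):

```python
def pv_from_access(access):
    """Strip the trailing EPICS record field (e.g. ".VAL") from the
    Access attribute to obtain the base PV of the MCA.
    """
    return access.rsplit(".", 1)[0]
```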

How to map the values of the snapshot section to the options of the MCAChannelData and MCAChannelROIData classes? Check whether they have changed, and if not, use the first value? How to deal with the situation where the values in the snapshot dataset have changed? This would most probably mean that the MCA has been used with different settings in different scan modules of the scan and would need to be split into different datasets. However, this is only accessible once the data have been read. Again, two scenarios would be possible: (i) postpone the whole procedure to the data import in the MCAChannelData class, or (ii) load the snapshot data during mapping, as this should usually only be small datasets, and deal with the differing values already here.

Notes on mapping camera datasets

Most probably, camera datasets can be identified by having (at least) two colons in their name. Furthermore, the second-to-last colon-separated part should be one of TIFF1 or cam1 for scientific cameras, or uvc1 for sample cameras.

Once one dataset belonging to a camera has been identified, all related datasets can be identified by the identical part before the first colon. Note that this criterion is not valid for datasets not belonging to cameras.
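These colon-based heuristics can be sketched as follows (the helper name and return values are hypothetical, and the dataset names in the test are made up):

```python
SCIENTIFIC_PARTS = {"TIFF1", "cam1"}
SAMPLE_PARTS = {"uvc1"}

def camera_kind(dataset_name):
    """Classify a dataset name by the colon heuristic described above:
    at least two colons, with the second-to-last colon-separated part
    identifying the camera type.
    """
    parts = dataset_name.split(":")
    if len(parts) < 3:
        return None
    if parts[-2] in SCIENTIFIC_PARTS:
        return "scientific"
    if parts[-2] in SAMPLE_PARTS:
        return "sample"
    return None
```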

Identifying the “main” dataset for a camera is another task, as over time, this has changed as well, from storing image numbers to storing (full) filenames.

How to map the values of the snapshot section to the respective camera classes? The same ideas as for the MCA datasets apply here, too - and probably more generally for all snapshot datasets, at least those where corresponding devices exist in the main section.

Notes on mapping MPSKIP channels

MPSKIP channels are (currently) only present at the SX700 and EUVR stations. This is a special EPICS detector used to record individual values to average over and, at the same time, a series of axis RBVs.

In a typical scan, there are (up to) three channel datasets as well as a series of monitor datasets present. Fortunately, the PV naming scheme of the MPSKIP device is generic; the base name is always MPSKIP:<station><number>. The actual names (as seen in the GUI) are much less consistent, though. The three channel datasets are:

  • MPSKIP:<station><number>chan1

    • The name of this channel is SkipDetektor<station>.

    • The values of this channel would theoretically be the counts, but unfortunately the channel seems to count incorrectly. Hence, the values (and the entire dataset) should be ignored.

  • MPSKIP:<station><number>counterchan1

    • The name of this channel is <station>-Scounter.

    • The values of this channel are the counts, with “1” being repeated if the comparison does not succeed.

    • This channel is not present in all scans and hence cannot be used reliably as the data for the dataset; it should therefore be ignored.

  • MPSKIP:<station><number>skipcountchan1

    • The name of this channel is <station>-Skipcount.

    • The channel is fairly useless, as it only records the number of values to record, and since this is an option of the EPICS MPSKIP device, it will never change during a scan module.

    • Hence, when mapping, the corresponding dataset should be ignored and removed from the list of datasets to be mapped.

There is always a counter dataset Counter-mot present that increments within an average loop in the skip scan module. Although this is an axis, it should be used for the data of the evedata.evefile.entities.data.SkipData dataset, as it is the only reliable dataset for determining the boundaries of each individual average loop.

Crucial parameters currently need to be added manually as monitors and hence reside in the device section of the HDF5 file. These include:

  • MPSKIP:<station><number>detector

    • This contains the PV (neither the name nor the XML-ID!) of the detector channel used to trigger the skip event.

  • MPSKIP:<station><number>limit

    • This contains the lower limit the detector channel value needs to exceed to start the comparison phase.

  • MPSKIP:<station><number>maxdev

    • This contains the maximum deviation two consecutive channel values are allowed to have in the comparison phase. Note, however, that no more than a given maximum number of values is recorded. This maximum is set by an additional counter motor axis in the scan module; hence, the information is not available from the HDF5 file and can only be inferred from the scan description contained in the SCML file.

  • MPSKIP:<station><number>skipcount

    • This is the number of values that should be recorded once the comparison phase has started.

  • MPSKIP:<station><number>reset

    • This is an actual monitor toggling between “execute” and “reset”, used in the scan to stop the averaging process. However, for the data analysis, it is neither necessary nor useful.

    • This monitor should be removed from the list of monitors to be mapped.

Important

With the only exception of the reset monitor (due to it being present in the pre-scan phase), none of these monitors is guaranteed to be present. This means that there are scans where crucial information cannot be inferred from the eveH5 files.

All the information needs to be mapped to the evedata.evefile.entities.data.SkipData and evedata.evefile.entities.metadata.SkipMetadata classes.

An additional dataset in the main section that could be removed from the list is SmCounter-det (SM-Counter). It contains a global count of the scan modules executed, with each individual execution of a scan module incrementing this number by one.

There is an additional complication when dealing with MPSKIP scans that needs to be taken into account in the mpskip module: due to a bug in the EPICS MPSKIP implementation, sometimes (and sometimes quite often) fewer than the minimal number of data points to average over are recorded. The current data processing routines introduce a special fix, creating the missing values such that these additional values do not change the mean.
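The mean-preserving padding can be illustrated as follows; this is a sketch, and the exact strategy of the production routines may differ:

```python
def pad_average_loop(values, target_length):
    """Pad a too-short average loop with its own mean, so that the
    added values do not change the mean of the loop.
    """
    values = list(values)
    mean = sum(values) / len(values)
    return values + [mean] * (target_length - len(values))
```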

Note

It turned out that there are scans containing not only one but several scan modules using the MPSKIP feature. Hence, it seems that not only does the MPSKIP dataset need to be split into as many datasets as there are scan modules with MPSKIP, but the position list of the MPSKIP detector also needs to be read already during version mapping, to obtain the information which positions belong to which scan module. The position lists of the respective scan modules then need to be updated accordingly.

Currently, the only chance of (easily) figuring out borders between scan modules using MPSKIP is to rely on a delta in PosCount greater than 2. This would, however, fail if two nested scan-module blocks with the inner scan module using MPSKIP directly followed each other.
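The PosCount-delta heuristic can be sketched as follows (illustrative only, and inheriting the limitation just described):

```python
def mpskip_module_blocks(position_counts):
    """Split a sorted list of position counts into scan-module blocks
    wherever the gap between consecutive positions exceeds 2.
    """
    blocks, current = [], [position_counts[0]]
    for previous, position in zip(position_counts, position_counts[1:]):
        if position - previous > 2:
            blocks.append(current)
            current = []
        current.append(position)
    blocks.append(current)
    return blocks
```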

As the positions for each of the MPSKIP modules need to be calculated during mapping in the VersionMapper class anyway, the individual MPSKIP datasets should have a SelectPositions preprocessing step added with the respective positions.

Fundamental change of eveH5 schema with v8

It is anticipated that based on the experience with the data model implemented within the evedata package, the schema of the eveH5 files will change dramatically with the new version v8. Overarching design principles of the schema overhaul include:

  • Much more explicit markup of the device types represented by the individual HDF5 datasets.

  • Parameters/options of devices are part of the HDF5 dataset of the respective device.

    • Parameters/options static within a scan module appear as attributes of the HDF5 datasets.

    • Parameters/options that potentially change with each individual recorded data point are represented as additional columns in the HDF5 dataset.

  • Removal of the chain c1, which never was and never will be used.

For details, see the eveH5 Schema overview page, and particularly the section on eveH5 v8.

Taken together, this restructuring of the eveH5 schema most probably means that the mapper for v8 does not have much in common with the mappers for the previous versions, as this is a major change.

Module documentation

class evedata.evefile.controllers.version_mapping.VersionMapperFactory

Bases: object

Factory for obtaining the correct version mapper object.

There are different versions of the schema underlying the eveH5 files. Hence, mapping the contents of an eveH5 file to the data model of the evedata package requires getting the correct mapper for the specific version. This is the typical use case for the factory pattern.

eveh5

Python object representation of an eveH5 file

Type:

evedata.evefile.boundaries.eveh5.HDF5File

Raises:

ValueError – Raised if no eveh5 object is present

Examples

Using the factory is pretty simple. There are actually two ways to set the eveh5 attribute – either explicitly or when calling the get_mapper() method of the factory:

factory = VersionMapperFactory()
factory.eveh5 = eveh5_object
mapper = factory.get_mapper()

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5_object)

In both cases, mapper will contain the correct mapper object, and eveh5_object contains the Python object representation of an eveH5 file.

get_mapper(eveh5=None)

Return the correct mapper for a given eveH5 file.

For convenience, the returned mapper has its VersionMapper.source attribute already set to the eveh5 object the mapper was obtained for.

Parameters:

eveh5 (evedata.evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file

Returns:

mapper – Mapper used to map the eveH5 file contents to evedata structures.

Return type:

VersionMapper

Raises:
class evedata.evefile.controllers.version_mapping.VersionMapper

Bases: object

Mapper for mapping the eveH5 file contents to evedata structures.

This is the base class for all version-dependent mappers. Given that there are different versions of the eveH5 schema, each version gets handled by a distinct mapper subclass.

To get an object of the appropriate class, use the VersionMapperFactory factory.

source

Python object representation of an eveH5 file

Type:

evedata.evefile.boundaries.eveh5.HDF5File

destination

High(er)-level evedata structure representing an eveH5 file

Type:

evedata.evefile.boundaries.evefile.EveFile

datasets2map_in_main

Names of the datasets in the main section not yet mapped.

In order not to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes from the list the names it handled successfully.

Type:

list

datasets2map_in_snapshot

Names of the datasets in the snapshot section not yet mapped.

In order not to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes from the list the names it handled successfully.

Type:

list

datasets2map_in_monitor

Names of the datasets in the monitor section not yet mapped.

Note that the monitor section is usually termed “device”.

In order not to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes from the list the names it handled successfully.

Type:

list

Raises:

ValueError – Raised if either source or destination are not provided

Examples

Although the VersionMapper class is not meant to be used directly, its use is prototypical for all the concrete mappers:

mapper = VersionMapper()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)

map(source=None, destination=None)

Map the eveH5 file contents to evedata structures.

Parameters:
Raises:

ValueError – Raised if either source or destination are not provided

static get_hdf5_dataset_importer(dataset=None, mapping=None)

Get an importer object for HDF5 datasets with properties set.

Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the information where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings (as would be more usual for dictionaries). This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)

Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.

Parameters:
  • dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

  • mapping (dict) –

    Table mapping HDF5 dataset columns to data class attributes.

    Note: The keys in this dictionary are integers, not strings (as would be more usual for dictionaries). This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names.

Returns:

importer – HDF5 dataset importer

Return type:

evedata.evefile.entities.data.HDF5DataImporter

static get_dataset_name(dataset=None)

Get the name of an HDF5 dataset.

The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.

Parameters:

dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

Returns:

name – Name of the HDF5 dataset

Return type:

str

static set_basic_metadata(hdf5_item=None, dataset=None)

Set the basic metadata of a dataset from an HDF5 item.

The metadata attributes id, name, access_mode, and pv are set.

Parameters:
class evedata.evefile.controllers.version_mapping.VersionMapperV5

Bases: VersionMapper

Mapper for mapping eveH5 v5 file contents to evedata structures.

More description comes here…

Important

EveH5 files of version v5 and earlier do not contain a date and time for the end of the measurement. Hence, the corresponding attribute File.metadata.end is set to the UNIX start date (1970-01-01T00:00:00). Thus, with these files, it is not possible to automatically calculate the duration of the measurement.

source

Python object representation of an eveH5 file

Type:

evedata.evefile.boundaries.eveh5.HDF5File

destination

High(er)-level evedata structure representing an eveH5 file

Type:

evedata.evefile.boundaries.evefile.File

Raises:

ValueError – Raised if either source or destination are not provided

Examples

Mapping a given eveH5 file to the evedata structures is the same for each of the mappers:

mapper = VersionMapperV5()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)

static get_dataset_name(dataset=None)

Get the name of an HDF5 dataset.

The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.

Parameters:

dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

Returns:

name – Name of the HDF5 dataset

Return type:

str

static get_hdf5_dataset_importer(dataset=None, mapping=None)

Get an importer object for HDF5 datasets with properties set.

Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the information where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings (as would be more usual for dictionaries). This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)

Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.

Parameters:
  • dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

  • mapping (dict) –

    Table mapping HDF5 dataset columns to data class attributes.

    Note: The keys in this dictionary are integers, not strings (as would be more usual for dictionaries). This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names.

Returns:

importer – HDF5 dataset importer

Return type:

evedata.evefile.entities.data.HDF5DataImporter

map(source=None, destination=None)

Map the eveH5 file contents to evedata structures.

Parameters:
Raises:

ValueError – Raised if either source or destination are not provided

static set_basic_metadata(hdf5_item=None, dataset=None)

Set the basic metadata of a dataset from an HDF5 item.

The metadata attributes id, name, access_mode, and pv are set.

Parameters:
class evedata.evefile.controllers.version_mapping.VersionMapperV6

Bases: VersionMapperV5

Mapper for mapping eveH5 v6 file contents to evedata structures.

The only difference to the previous version v5: times for both the start and, now, even the end of a measurement are available and are mapped as datetime.datetime objects onto the File.metadata.start and File.metadata.end attributes, respectively.

Note

Prior to v6, eveH5 files contained no end date/time of the measurement; hence, no duration of the measurement can be calculated.

source

Python object representation of an eveH5 file

Type:

evedata.evefile.boundaries.eveh5.HDF5File

destination

High(er)-level evedata structure representing an eveH5 file

Type:

evedata.evefile.boundaries.evefile.File

Raises:

ValueError – Raised if either source or destination are not provided

Examples

Mapping a given eveH5 file to the evedata structures is the same for each of the mappers:

mapper = VersionMapperV6()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)

static get_dataset_name(dataset=None)

Get the name of an HDF5 dataset.

The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.

Parameters:

dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

Returns:

name – Name of the HDF5 dataset

Return type:

str

static get_hdf5_dataset_importer(dataset=None, mapping=None)

Get an importer object for HDF5 datasets with properties set.

Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the information where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings (as would be more usual for dictionaries). This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)

Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.

Parameters:
  • dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

  • mapping (dict) –

    Table mapping HDF5 dataset columns to data class attributes.

    Note: The keys in this dictionary are integers, not strings (as would be more usual for dictionaries). This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names.

Returns:

importer – HDF5 dataset importer

Return type:

evedata.evefile.entities.data.HDF5DataImporter

map(source=None, destination=None)

Map the eveH5 file contents to evedata structures.

Parameters:
Raises:

ValueError – Raised if either source or destination are not provided

static set_basic_metadata(hdf5_item=None, dataset=None)

Set the basic metadata of a dataset from an HDF5 item.

The metadata attributes id, name, access_mode, and pv are set.

Parameters:
class evedata.evefile.controllers.version_mapping.VersionMapperV7

Bases: VersionMapperV6

Mapper for mapping eveH5 v7 file contents to evedata structures.

The only difference to the previous version v6: the attribute Simulation has been added at the file root level and is mapped as a Boolean value onto the File.metadata.simulation attribute.

source

Python object representation of an eveH5 file

Type:

evedata.evefile.boundaries.eveh5.HDF5File

destination

High(er)-level evedata structure representing an eveH5 file

Type:

evedata.evefile.boundaries.evefile.File

Raises:

ValueError – Raised if either source or destination are not provided

Examples

Mapping a given eveH5 file to the evedata structures is the same for each of the mappers:

mapper = VersionMapperV7()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)

static get_dataset_name(dataset=None)

Get the name of an HDF5 dataset.

The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.

Parameters:

dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

Returns:

name – Name of the HDF5 dataset

Return type:

str

static get_hdf5_dataset_importer(dataset=None, mapping=None)

Get an importer object for HDF5 datasets with properties set.

Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the information where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings (as would be more usual for dictionaries). This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)

Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.

Parameters:
  • dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.

  • mapping (dict) –

    Table mapping HDF5 dataset columns to data class attributes.

    Note: The keys in this dictionary are integers, not strings (as would be more usual for dictionaries). This allows the keys to be used directly for indexing the tuple returned by numpy.dtype.names.

Returns:

importer – HDF5 dataset importer

Return type:

evedata.evefile.entities.data.HDF5DataImporter

map(source=None, destination=None)

Map the eveH5 file contents to evedata structures.

Parameters:
Raises:

ValueError – Raised if either source or destination are not provided

static set_basic_metadata(hdf5_item=None, dataset=None)

Set the basic metadata of a dataset from an HDF5 item.

The metadata attributes id, name, access_mode, and pv are set.

Parameters: