evedata.evefile.controllers.version_mapping module
Mapping eveH5 contents to the data structures of the evedata package.
There are different versions of the schema underlying the eveH5 files. Hence, mapping the contents of an eveH5 file to the data model of the evedata package requires obtaining the correct mapper for the specific version. This is the typical use case for the factory pattern.
Users of the module hence will typically only obtain a
VersionMapperFactory
object to get the correct mappers for individual
files. Furthermore, “users” basically boils down to the EveFile
class. Therefore, users of
the evedata package usually do not interact directly with any of the
classes provided by this module.
Overview
Being version agnostic with respect to eveH5 and SCML schema versions is a
central aspect of the evedata package. This requires facilities mapping
the actual eveH5 files to the data model provided by the entities
technical layer of the evefile subpackage. The File
facade obtains
the correct VersionMapper
object via the
VersionMapperFactory
, providing an HDF5File
resource object to the
factory. It is the duty of the factory to obtain the “version” attribute
from the HDF5File
object (explicitly getting the attributes of the root group of the
HDF5File
object).
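The factory dispatch described above could be sketched roughly as follows. This is a hedged illustration only: the mapper class names follow this module, but the version-lookup logic and the get_mapper signature shown here are assumptions, not the actual evedata implementation (which reads the "version" attribute from the HDF5 root group).

```python
# Hypothetical sketch of a version-mapper factory: resolve the major
# version number to a VersionMapperVx class, raising AttributeError if
# no matching mapper class exists.

class VersionMapper:
    """Base class for all version-specific mappers."""


class VersionMapperV5(VersionMapper):
    pass


class VersionMapperV6(VersionMapper):
    pass


class VersionMapperV7(VersionMapper):
    pass


def get_mapper(version="5.0"):
    # In the real factory, "version" is read from the attributes of the
    # root group of the HDF5File object.
    major = version.split(".")[0]
    name = f"VersionMapperV{major}"
    try:
        return globals()[name]()
    except KeyError as exc:
        raise AttributeError(f"No mapper for eveH5 version {version}") from exc


print(type(get_mapper("6.1")).__name__)  # VersionMapperV6
```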
Fig. 32 Class hierarchy of the evedata.evefile.controllers.version_mapping
module, providing the functionality to map different eveH5 file
schemas to the data structure provided by the EveFile
class. The factory
will be used to get the correct mapper for a given eveH5 file.
For each eveH5 schema version, there exists an individual
VersionMapperVx
class dealing with the version-specific mapping. That
part of the mapping common to all versions of the eveH5 schema takes place
in the VersionMapper
parent class, e.g. removing the chain. The
idea behind the Mapping
class is to provide simple mappings for
attributes and alike that can be stored externally, e.g. in YAML files.
This would make it easier to account for (simple) changes.
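The idea of externally stored simple mappings could be sketched as follows. The SimpleMapping class below is a hypothetical stand-in, not the actual evedata Mapping API; a plain dict is used in place of a YAML file to keep the sketch dependency-free.

```python
# Hypothetical sketch of a Mapping class: a table of source->destination
# attribute names, stored externally (e.g. in YAML), applied generically
# instead of being hard-coded.

class SimpleMapping:
    """Apply a table mapping source attribute names to destination ones."""

    def __init__(self, table=None):
        # The table could be loaded from a YAML file (e.g. via
        # yaml.safe_load()); a plain dict keeps this sketch self-contained.
        self.table = table or {}

    def apply(self, source, destination):
        for source_attr, destination_attr in self.table.items():
            if hasattr(source, source_attr):
                setattr(destination, destination_attr,
                        getattr(source, source_attr))


class _Attrs:
    """Minimal attribute container for the demonstration."""


source = _Attrs()
source.Version = "7"
destination = _Attrs()

mapping = SimpleMapping(table={"Version": "version"})
mapping.apply(source, destination)
print(destination.version)  # prints: 7
```

Storing such tables externally would make it possible to account for simple schema changes without touching the mapper code.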
Mapping tasks for eveH5 schema up to v7
Given the quite different overall philosophy of the current eveH5 file schema (up to version v7) and the data model provided by the evedata package, there are many tasks for the mappers to carry out.
What follows is a summary of the different aspects, for the time being not divided for the different formats (up to v7):
- Map attributes of / and /c1 to the file metadata. ✓
- Convert monitor datasets from the device group to MonitorData objects. ✓
  We probably need to create subclasses for the different monitor datasets, at least distinguishing between numeric and non-numeric values.
- Map /c1/meta/PosCountTimer to a TimestampData object. ✓
- Starting with eveH5 v5: map /LiveComment to LogMessage objects. ✓
- Filter all datasets from the main section, with different goals:
  - Map array data to ArrayChannelData objects (HDF5 groups having an attribute DeviceType set to Channel). ✓
    - Distinguish between MCA and scope data (at least). ✗
    - Map additional datasets in the main section (and snapshot). ✓
  - Handle MPSKIP channel(s) if present. ✓
    - Only present at SX700 and EUVR stations.
    - Three channels and a few monitors.
    - Map to evedata.evefile.entities.data.SkipData and evedata.evefile.entities.metadata.SkipMetadata.
  - Map all axis datasets to AxisData objects. ✓
    - How to distinguish between axes with and without encoders? ✗
    - Read channels with RBV and replace axis values with RBV. ✗
      Most probably, the corresponding channel has the same name (not XML-ID, though!) as the axis, but with the suffix _RBV, and can thus be identified. In the case of axes with encoders, there may be additional datasets present, e.g. those with the suffix _Enc. In this case, instead of NonencodedAxisData, an AxisData object needs to be created. (Currently, only AxisData objects are created, which is a mistake as well…)
    - How to deal with pseudo-axes used as options in channel datasets? Do we need to deal with these axes later? ✗
  - Distinguish between single point and area data, and map area data to AreaChannelData objects. (✓)
    - Distinguish between scientific and sample cameras. ✓
    - Which dataset is the “main” dataset for scientific cameras? ✗
      Starting with eve v1.39, it is TIFF1:chan1; before that, this is less clear, and there might not exist a dataset containing filenames with full paths, but only numbers.
    - Map sample camera datasets. ✓
  - Figure out which single point data have been redefined between scan modules, and split the data accordingly. Map the data to SinglePointChannelData, AverageChannelData, and IntervalChannelData, respectively. ✗
    Hint: Getting the shape of an HDF5 dataset is a cheap operation and does not require reading the actual data, as the information is contained in the metadata of the HDF5 dataset. This should allow for additional checking whether a dataset has been redefined.
    If the number of (the sum of) positions differs, the channel has been redefined. However, the average or interval settings may have changed between scan modules as well, and this can only be figured out by actually reading the data. How to handle this situation? Split datasets only upon reading the data, if necessary?
  - Take care of normalized channel data and treat them accordingly.
  - Map the additional data for average and interval channel data provided in the respective HDF5 groups to AverageChannelData and IntervalChannelData objects, respectively. ✓
  - Map normalized channel data (and the data provided in the respective HDF5 groups) to NormalizedChannelData. ✓
  - Map all remaining HDF5 datasets that belong to one of the already mapped data objects (i.e., variable options) to their respective attributes. (Should have been done already.)
  - Map all HDF5 datasets remaining (if any) to data objects corresponding to their respective data type. (Could there be any?)
  - Add all data objects to the data attribute of the EveFile object. (Has been done during mapping already.)
- Filter all datasets from the snapshot section, with different goals:
  - Map all HDF5 datasets that belong to one of the data objects in the data attribute of the EveFile object to their respective attributes.
  - Map all HDF5 datasets remaining (if any) to data objects corresponding to their respective data type.
  - Add all data objects to the snapshots attribute of the EveFile object. ✓
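The hint above about cheap shape metadata can be sketched as follows. This is an illustrative, hedged stand-in using plain Python: the function name and the idea of comparing the dataset length (available from HDF5 shape metadata without reading the data) against the positions expected per scan module are assumptions for demonstration, not the actual evedata implementation.

```python
# Hypothetical redefinition check: an HDF5 dataset's shape is stored in
# its metadata, so its length is known without reading any data. If the
# total number of positions expected from the scan modules differs from
# the dataset length, the channel has likely been redefined.

def possibly_redefined(dataset_shape, positions_per_module):
    """Cheap consistency check based on shape metadata only.

    dataset_shape: shape tuple as reported by the HDF5 library.
    positions_per_module: expected number of positions per scan module.
    """
    return dataset_shape[0] != sum(positions_per_module)


# A channel recorded in two scan modules with 10 and 5 positions:
print(possibly_redefined((15,), [10, 5]))  # False: lengths match
print(possibly_redefined((12,), [10, 5]))  # True: inspect further
```

Note that, as stated above, matching lengths do not rule out changed average or interval settings; that can only be decided by actually reading the data.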
Most probably, not all these tasks can be inferred from the contents of an eveH5 file alone. In this case, additional mapping tables, possibly even on a per-measurement-station level, are necessary.
Other tasks not in the realm of the version mappers, but part of the evedata.evefile.controllers subpackage, are:
- Separating 0D data that have been redefined within a scan (single point, average, interval) – sure about this one? See above.
- Mapping scans using the EPICS MPSKIP feature to record individual values for actual average detectors to AverageChannelData objects.
Todo
In light of the newly added scan modules layer and the necessary mapping of datasets to scan modules: Where and how to check whether creating position (count)s during reading the SCML did work (consistency check) and where to actually distribute the datasets to the scan modules?
Probably the best way is to first map all datasets from main to dataset objects within the mapper, and only afterwards (deep)copy these dataset objects where necessary and distribute them to the scan modules, adding the preprocessing step selecting position counts to the respective importer(s).
When exactly the MPSKIP scans are dealt with needs to be decided.
Definitely, the general mapping of datasets needs to be done first,
as only this creates and maps the special
SkipData
dataset
necessary to carry out the tasks of the mpskip
module.
Questions to address
How were the log messages/live comments saved before v5?
How to deal with options that are monitored? Check whether they change for a given channel/axis and if so, expand them (“fill”) for each PosCount of the corresponding channel/axis, and otherwise set as scalar attribute?
How to deal with the situation that not all actual data read from eveH5 are numeric? Of course, non-numeric data cannot be plotted. But how to distinguish sensibly? The evedata.evefile.entities.data module provides some distinct classes for this, for now at least NonnumericChannelData.
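The “fill” idea for monitored options mentioned above can be sketched as follows. This is a hedged, pure-Python illustration; the function name and data shapes are assumptions, whereas the real implementation would operate on dataset objects keyed by position counts.

```python
# Hypothetical expansion ("fill") of monitor values: monitors record a
# value only when it changes (keyed by position count). To attach them
# to a channel or axis, carry the last known value forward so there is
# one value per position of the corresponding device.

def fill_monitor(monitor, positions):
    """Forward-fill sparse monitor values over a list of positions.

    monitor: dict mapping position count -> value (recorded on change).
    positions: position counts of the corresponding channel/axis.
    """
    filled, current = [], None
    for position in positions:
        if position in monitor:
            current = monitor[position]
        filled.append(current)
    return filled


print(fill_monitor({1: "a", 4: "b"}, [1, 2, 3, 4, 5]))
# ['a', 'a', 'a', 'b', 'b']
```

If the filled values never change, the option could instead be set as a scalar attribute, as suggested above.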
Notes on mapping MCA datasets
MCA data themselves are stored as a single dataset per spectrum in an HDF5 group, and such a group can be uniquely identified by having attributes, with an attribute DeviceType set to Channel. Furthermore, the PV of a given MCA can be inferred from the Access attribute of the HDF5 group. Why not use the name of the MCA HDF5 group to obtain the PV? The group typically has chan1 appended without separator directly to the PV name, whereas the Access attribute reveals the full PV with the added .VAL attribute.
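Deriving the PV from the Access attribute, as described above, amounts to stripping the trailing record field. The sketch below is a hedged illustration; the example PV string is invented for demonstration and not taken from a real file.

```python
# Hypothetical derivation of an MCA's PV from the Access attribute of
# its HDF5 group: the attribute carries the full PV including the
# ".VAL" record field, which only needs to be stripped.

def pv_from_access(access):
    """Strip the trailing record field (e.g. ".VAL") from a PV string."""
    return access.rsplit(".", 1)[0]


print(pv_from_access("MCA:01.VAL"))  # MCA:01
```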
As all additional options directly follow the EPICS MCA record, and the dataset names can be mapped to the PVs of the MCA record, a direct mapping of datasets in the main and snapshot sections could be carried out. In this case, it does not seem necessary to explicitly check the PV names of the individual datasets, as the datasets all have the PV attributes as their last part. Note that there are different and variable numbers of ROI channels and corresponding datasets available (up to 32 according to the EPICS MCA record, but probably <10 at PTB).
How to map the values of the snapshot section to the options of the
MCAChannelData
and
MCAChannelROIData
classes? Check whether they have changed, and if not, use the first value?
How to deal with the situation where the values in the snapshot dataset
have changed? This would most probably mean that the MCA has been used
with different settings in different scan modules of the scan and would
need to be split into different datasets. However, this is only accessible
once the data have been read. Again, two scenarios would be possible: (i)
postpone the whole procedure to the data import in the
MCAChannelData
class, or (ii) load the snapshot data during mapping, as this should
usually only be small datasets, and deal with the differing values already
here.
Notes on mapping camera datasets
Most probably, camera datasets can be identified by having (at least) two colons in their name. Furthermore, the second-to-last colon-separated part should be one of TIFF1 or cam1 for scientific cameras and uvc1 for sample cameras.
Once one dataset belonging to a camera has been identified, all related datasets can be identified by the identical part before the first colon. Note that this criterion is not valid for other datasets not belonging to cameras.
Identifying the “main” dataset for a camera is another task, as over time, this has changed as well, from storing image numbers to storing (full) filenames.
How to map the values of the snapshot section to the respective camera classes? The same ideas as for the MCA datasets apply here, too - and probably more generally for all snapshot datasets, at least those where corresponding devices exist in the main section.
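The camera heuristics described above could be sketched as follows. This is a hedged illustration: the function names and the example dataset name are invented for demonstration, only the colon-based criteria come from the text above.

```python
# Hypothetical camera-dataset heuristics: at least two colons in the
# name identify a camera dataset; the second-to-last colon-separated
# part distinguishes scientific (TIFF1, cam1) from sample (uvc1)
# cameras; the part before the first colon groups related datasets.

def camera_kind(name):
    parts = name.split(":")
    if len(parts) < 3:
        return None  # fewer than two colons: not a camera dataset
    marker = parts[-2]
    if marker in ("TIFF1", "cam1"):
        return "scientific"
    if marker == "uvc1":
        return "sample"
    return None


def camera_id(name):
    """Common prefix grouping all datasets belonging to one camera."""
    return name.split(":", 1)[0]


print(camera_kind("CAM1:TIFF1:chan1"))  # scientific
print(camera_id("CAM1:TIFF1:chan1"))   # CAM1
```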
Notes on mapping MPSKIP channels
MPSKIP channels are (currently) only present at SX700 and EUVR stations. This is a special EPICS detector used to record individual values to average over and, at the same time, a series of axis RBVs.
In a typical scan, there are (up to) three channel datasets as well as a
series of monitor datasets present. Fortunately, the PV naming scheme of the
MPSKIP device is generic, the base name is always:
MPSKIP:<station><number>
. The actual names (as seen in the GUI) are
much less consistent, though. The three channel datasets are:
MPSKIP:<station><number>chan1
  The name of this channel is SkipDetektor<station>. The values of this channel would theoretically be the counts, but unfortunately the channel seems to count wrongly. Hence, the values (and the entire dataset) should be ignored.
MPSKIP:<station><number>counterchan1
  The name of this channel is <station>-Scounter. The values of this channel are the counts, with “1” being repeated if the comparison does not succeed. This channel is not present in all scans, hence cannot be used reliably as the data for the dataset and should therefore be ignored.
MPSKIP:<station><number>skipcountchan1
  The name of this channel is <station>-Skipcount. The channel is fairly useless, as it only records the number of values to record, and as this is an option of the EPICS MPSKIP device, this will never change during a scan module. Hence, when mapping, the corresponding dataset should be ignored and removed from the list of datasets to be mapped.
There is always a counter dataset Counter-mot
present that increments
within an average loop in the skip scan module. While this is an axis,
it should be used for the data of the
evedata.evefile.entities.data.SkipData
dataset, as it is the only
reliable dataset to determine the boundaries of each individual average loop.
Crucial parameters currently need to be added manually as a monitor and hence reside in the device section of the HDF5 file. These include:
MPSKIP:<station><number>detector
This contains the PV (neither the name nor the XML-ID!) of the detector channel used to trigger the skip event.
MPSKIP:<station><number>limit
This contains the lower limit the detector channel value needs to overcome to start the comparison phase.
MPSKIP:<station><number>maxdev
This contains the maximum deviation two consecutive channel values are allowed to have in the comparison phase. Note, however, that not more than a given maximum number of values are recorded. This maximum value is set by an additional counter motor axis in the scan module, hence the information is not available from the HDF5 file, but can only be inferred from the scan description contained in the SCML file.
MPSKIP:<station><number>skipcount
This is the number of values that should be recorded once the comparison phase has started.
MPSKIP:<station><number>reset
This is an actual monitor toggling between “execute” and “reset” and used in the scan to stop the averaging process. However, for the data analysis, this is neither necessary nor useful.
This monitor should be removed from the list of monitors to be mapped.
Important
With the only exception of the reset
monitor (due to it being
present in the pre-scan phase), none of these monitors is guaranteed
to be present. This means, however, that there are scans where crucial
information cannot be inferred from the eveH5 files.
All the information needs to be mapped to the
evedata.evefile.entities.data.SkipData
and
evedata.evefile.entities.metadata.SkipMetadata
classes.
An additional dataset in the main
section that could be removed from the
list is SmCounter-det
(SM-Counter), containing a global number of the
scan module executed, with each individual execution of a scan module
incrementing this number by one.
There is an additional complication when dealing with MPSKIP scans that needs to be taken into account in the mpskip module: Due to a bug in the EPICS MPSKIP implementation, sometimes (and sometimes quite often) fewer than the minimal number of data points to average over are recorded. In the current data processing routines, a special fix is introduced, creating the missing values such that these additional values don’t change the mean.
Note
It turned out that there are scans containing not only one, but several scan modules using the MPSKIP feature. Hence, it seems that not only does the MPSKIP dataset need to be split into as many datasets as there are scan modules with MPSKIP, but the position list of the MPSKIP detector also needs to be read already during version mapping, to get the information which positions belong to which scan module. Hence, the position lists of the respective scan modules need to be updated.
Currently, the only chance of (easily) figuring out borders between scan modules using MPSKIP is to rely on a Delta PosCount of > 2. This would, however, fail if two nested scan module blocks with the inner scan module using MPSKIP directly followed each other.
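The Delta PosCount heuristic described above can be sketched as follows. This is a hedged illustration; the function name and the position counts are invented for demonstration, and, as noted, the heuristic fails for directly adjacent nested MPSKIP blocks.

```python
# Hypothetical border detection between MPSKIP scan modules: a jump in
# the position counts (Delta PosCount > 2) marks the start of a new
# scan module using MPSKIP.

def module_borders(position_counts):
    """Return indices where a new MPSKIP scan module starts."""
    return [
        index
        for index in range(1, len(position_counts))
        if position_counts[index] - position_counts[index - 1] > 2
    ]


positions = [3, 4, 5, 6, 12, 13, 14]  # jump from 6 to 12 marks a border
print(module_borders(positions))  # [4]
```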
As the positions for each of the MPSKIP modules need to be calculated anyway during mapping in the VersionMapper class, the individual MPSKIP datasets should get a SelectPositions preprocessing step added with the respective positions.
Fundamental change of eveH5 schema with v8
It is anticipated that based on the experience with the data model
implemented within the evedata
package, the schema of the eveH5 files
will change dramatically with the new version v8. Overarching design
principles of the schema overhaul include:
Much more explicit markup of the device types represented by the individual HDF5 datasets.
Parameters/options of devices are part of the HDF5 dataset of the respective device.
Parameters/options static within a scan module appear as attributes of the HDF5 datasets.
Parameters/options that potentially change with each individual recorded data point are represented as additional columns in the HDF5 dataset.
Removal of the chain c1 that was never and will never be used.
For details, see the eveH5 Schema overview page, and particularly the section on eveH5 v8.
Taken together, this restructuring of the eveH5 schema most probably means that the mapper for v8 does not have much in common with the mappers for the previous versions, as this is a major change.
Module documentation
- class evedata.evefile.controllers.version_mapping.VersionMapperFactory
Bases:
object
Factory for obtaining the correct version mapper object.
There are different versions of the schema underlying the eveH5 files. Hence, mapping the contents of an eveH5 file to the data model of the evedata package requires obtaining the correct mapper for the specific version. This is the typical use case for the factory pattern.
- eveh5
Python object representation of an eveH5 file
- Raises:
ValueError – Raised if no eveh5 object is present
Examples
Using the factory is pretty simple. There are actually two ways to set the eveh5 attribute – either explicitly or when calling the get_mapper() method of the factory:

factory = VersionMapperFactory()
factory.eveh5 = eveh5_object
mapper = factory.get_mapper()

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5_object)
In both cases, mapper will contain the correct mapper object, and eveh5_object contains the Python object representation of an eveH5 file.
- get_mapper(eveh5=None)
Return the correct mapper for a given eveH5 file.
For convenience, the returned mapper has its VersionMapper.source attribute already set to the eveh5 object used to obtain the mapper.
- Parameters:
eveh5 (
evedata.evefile.boundaries.eveh5.HDF5File
) – Python object representation of an eveH5 file- Returns:
mapper – Mapper used to map the eveH5 file contents to evedata structures.
- Return type:
- Raises:
ValueError – Raised if no eveh5 object is present
AttributeError – Raised if no matching
VersionMapper
class can be found
- class evedata.evefile.controllers.version_mapping.VersionMapper
Bases:
object
Mapper for mapping the eveH5 file contents to evedata structures.
This is the base class for all version-dependent mappers. Given that there are different versions of the eveH5 schema, each version gets handled by a distinct mapper subclass.
To get an object of the appropriate class, use the VersionMapperFactory factory.
- source
Python object representation of an eveH5 file
- destination
High(er)-level evedata structure representing an eveH5 file
- datasets2map_in_main
Names of the datasets in the main section not yet mapped.
In order to not have to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes those names from the list it handled successfully.
- Type:
- datasets2map_in_snapshot
Names of the datasets in the snapshot section not yet mapped.
In order to not have to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes those names from the list it handled successfully.
- Type:
- datasets2map_in_monitor
Names of the datasets in the monitor section not yet mapped.
Note that the monitor section is usually termed “device”.
In order to not have to check all datasets several times, this list contains only those datasets not yet mapped. Hence, every private mapping method removes those names from the list it handled successfully.
- Type:
- Raises:
ValueError – Raised if either source or destination are not provided
Examples
Although the VersionMapper class is not meant to be used directly, its use is prototypical for all the concrete mappers:

mapper = VersionMapper()
mapper.map(source=eveh5, destination=evefile)
Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)
- map(source=None, destination=None)
Map the eveH5 file contents to evedata structures.
- Parameters:
source (
evedata.evefile.boundaries.eveh5.HDF5File
) – Python object representation of an eveH5 filedestination (
evedata.evefile.boundaries.evefile.EveFile
) – High(er)-level evedata structure representing an eveH5 file
- Raises:
ValueError – Raised if either source or destination are not provided
- static get_hdf5_dataset_importer(dataset=None, mapping=None)
Get an importer object for HDF5 datasets with properties set.
Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism to provide the relevant information where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.
As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.
Important
The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)
Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.mapping (
dict
) –Table mapping HDF5 dataset columns to data class attributes.
Note: The keys in this dictionary are integers, not strings, as usual for dictionaries. This allows to directly use the keys for indexing the tuple returned by
numpy.dtype.names
.
- Returns:
importer – HDF5 dataset importer
- Return type:
- static get_dataset_name(dataset=None)
Get the name of an HDF5 dataset.
The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.- Returns:
name – Name of the HDF5 dataset
- Return type:
- static set_basic_metadata(hdf5_item=None, dataset=None)
Set the basic metadata of a dataset from an HDF5 item.
The metadata attributes
id
,name
,access_mode
, andpv
are set.- Parameters:
hdf5_item (
evedata.evefile.boundaries.eveh5.HDF5Item
) – Representation of an HDF5 item.dataset (
evedata.evefile.entities.data.Data
) – Data object the metadata should be set for
- class evedata.evefile.controllers.version_mapping.VersionMapperV5
Bases:
VersionMapper
Mapper for mapping eveH5 v5 file contents to evedata structures.
More description comes here…
Important
EveH5 files of version v5 and earlier do not contain a date and time for the end of the measurement. Hence, the corresponding attribute File.metadata.end is set to the UNIX start date (1970-01-01T00:00:00). Thus, with these files, it is not possible to automatically calculate the duration of the measurement.
- source
Python object representation of an eveH5 file
- destination
High(er)-level evedata structure representing an eveH5 file
- Type:
evedata.evefile.boundaries.evefile.File
- Raises:
ValueError – Raised if either source or destination are not provided
Examples
Mapping a given eveH5 file to the evedata structures is the same for each of the mappers:

mapper = VersionMapperV5()
mapper.map(source=eveh5, destination=evefile)
Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)
- static get_dataset_name(dataset=None)
Get the name of an HDF5 dataset.
The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.- Returns:
name – Name of the HDF5 dataset
- Return type:
- static get_hdf5_dataset_importer(dataset=None, mapping=None)
Get an importer object for HDF5 datasets with properties set.
Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism to provide the relevant information where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.
As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.
Important
The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)
Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.mapping (
dict
) –Table mapping HDF5 dataset columns to data class attributes.
Note: The keys in this dictionary are integers, not strings, as usual for dictionaries. This allows to directly use the keys for indexing the tuple returned by
numpy.dtype.names
.
- Returns:
importer – HDF5 dataset importer
- Return type:
- map(source=None, destination=None)
Map the eveH5 file contents to evedata structures.
- Parameters:
source (
evedata.evefile.boundaries.eveh5.HDF5File
) – Python object representation of an eveH5 filedestination (
evedata.evefile.boundaries.evefile.EveFile
) – High(er)-level evedata structure representing an eveH5 file
- Raises:
ValueError – Raised if either source or destination are not provided
- static set_basic_metadata(hdf5_item=None, dataset=None)
Set the basic metadata of a dataset from an HDF5 item.
The metadata attributes
id
,name
,access_mode
, andpv
are set.- Parameters:
hdf5_item (
evedata.evefile.boundaries.eveh5.HDF5Item
) – Representation of an HDF5 item.dataset (
evedata.evefile.entities.data.Data
) – Data object the metadata should be set for
- class evedata.evefile.controllers.version_mapping.VersionMapperV6
Bases:
VersionMapperV5
Mapper for mapping eveH5 v6 file contents to evedata structures.
The only difference from the previous version v5: times for the start and now even the end of a measurement are available and are mapped as datetime.datetime objects onto the File.metadata.start and File.metadata.end attributes, respectively.
Note
Previous to v6 eveH5 files, no end date/time of the measurement was available, hence no duration of the measurement can be calculated.
- source
Python object representation of an eveH5 file
- destination
High(er)-level evedata structure representing an eveH5 file
- Type:
evedata.evefile.boundaries.evefile.File
- Raises:
ValueError – Raised if either source or destination are not provided
Examples
Mapping a given eveH5 file to the evedata structures is the same for each of the mappers:

mapper = VersionMapperV6()
mapper.map(source=eveh5, destination=evefile)
Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)
- static get_dataset_name(dataset=None)
Get the name of an HDF5 dataset.
The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.- Returns:
name – Name of the HDF5 dataset
- Return type:
- static get_hdf5_dataset_importer(dataset=None, mapping=None)
Get an importer object for HDF5 datasets with properties set.
Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism to provide the relevant information where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.
As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.
Important
The keys in the dictionary provided via the mapping parameter are integers, not strings as is usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)
Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.
- Parameters:
dataset (
evedata.evefile.boundaries.eveh5.HDF5Dataset
) – Representation of an HDF5 dataset.mapping (
dict
) –Table mapping HDF5 dataset columns to data class attributes.
Note: The keys in this dictionary are integers, not strings, as usual for dictionaries. This allows to directly use the keys for indexing the tuple returned by
numpy.dtype.names
.
- Returns:
importer – HDF5 dataset importer
- Return type:
- map(source=None, destination=None)
Map the eveH5 file contents to evedata structures.
- Parameters:
  - source (evedata.evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file
  - destination (evedata.evefile.boundaries.evefile.EveFile) – High(er)-level evedata structure representing an eveH5 file
- Raises:
  ValueError – Raised if either source or destination is not provided
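The guard clause implied by this exception could be sketched as follows; this is a minimal illustration only, the actual mapping logic is omitted:

```python
def map_contents(source=None, destination=None):
    # Sketch of the documented guard clause only; the real map() method
    # performs the actual eveH5-to-evedata mapping afterwards.
    if source is None or destination is None:
        raise ValueError("Both source and destination need to be provided")
    # ... version-specific mapping would follow here ...
```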
- static set_basic_metadata(hdf5_item=None, dataset=None)
Set the basic metadata of a dataset from an HDF5 item.
The metadata attributes id, name, access_mode, and pv are set.

- Parameters:
  - hdf5_item (evedata.evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.
  - dataset (evedata.evefile.entities.data.Data) – Data object the metadata should be set for
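Copying the four documented attributes could look roughly like this; reading them as plain attributes of the HDF5 item is an assumption of this sketch, and the stand-in objects are hypothetical:

```python
from types import SimpleNamespace

def set_basic_metadata(hdf5_item=None, dataset=None):
    # Copy the four documented metadata attributes; reading them as plain
    # attributes of hdf5_item is an assumption for this sketch.
    for attribute in ("id", "name", "access_mode", "pv"):
        setattr(dataset.metadata, attribute, getattr(hdf5_item, attribute, None))

# Hypothetical stand-ins for the real HDF5Item and Data objects
hdf5_item = SimpleNamespace(
    id="A2980", name="counter", access_mode="ca", pv="A2980:gw237"
)
dataset = SimpleNamespace(metadata=SimpleNamespace())
set_basic_metadata(hdf5_item=hdf5_item, dataset=dataset)
```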
- class evedata.evefile.controllers.version_mapping.VersionMapperV7
Bases:
VersionMapperV6
Mapper for mapping eveH5 v7 file contents to evedata structures.
The only difference to the previous version v6: the attribute Simulation has been added on the file root level and is mapped as a Boolean value onto the File.metadata.simulation attribute.

- source
  Python object representation of an eveH5 file
- destination
  High(er)-level evedata structure representing an eveH5 file
  - Type: evedata.evefile.boundaries.evefile.File
- Raises:
  ValueError – Raised if either source or destination is not provided
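The conversion of the Simulation attribute to a Boolean could be sketched as follows; the exact storage format of the attribute is an assumption here, since HDF5 attributes frequently arrive as byte strings:

```python
def simulation_attribute_to_bool(attributes):
    # How the "Simulation" attribute is stored is an assumption of this
    # sketch: HDF5 attributes often arrive as byte strings, so decode first.
    value = attributes.get("Simulation", b"no")
    if isinstance(value, bytes):
        value = value.decode()
    return str(value).lower() in ("yes", "true", "1")
```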
Examples
Mapping a given eveH5 file to the evedata structures is the same for each of the mappers:

mapper = VersionMapperV7()
mapper.map(source=eveh5, destination=evefile)

Usually, you will obtain the correct mapper from the VersionMapperFactory. In this case, the returned mapper has its source attribute already set for convenience:

factory = VersionMapperFactory()
mapper = factory.get_mapper(eveh5=eveh5)
mapper.map(destination=evefile)
- static get_dataset_name(dataset=None)
Get the name of an HDF5 dataset.
The name here refers to the last part of the path within the HDF5 file, i.e. the part after the last slash.
- Parameters:
  - dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.
- Returns:
  name – Name of the HDF5 dataset
- Return type:
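Extracting the part after the last slash can be sketched in a few lines; the example path is a made-up illustration, not an actual eveH5 path:

```python
def get_dataset_name(path):
    # The name is the part of the HDF5 path after the last slash;
    # paths without a slash are returned unchanged.
    return path.rsplit("/", 1)[-1]

print(get_dataset_name("/c1/main/A2980:gw237"))  # -> A2980:gw237
```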
- static get_hdf5_dataset_importer(dataset=None, mapping=None)
Get an importer object for HDF5 datasets with properties set.
Data are loaded on demand, not already when initially loading the eveH5 file. Hence the need for a mechanism providing the relevant information on where to get the relevant data from and how. Different versions of the underlying eveH5 schema differ even in whether all data belonging to one Data object are located in one HDF5 dataset or spread over multiple HDF5 datasets. In the latter case, individual importers are necessary for the separate HDF5 datasets.

As the VersionMapper class deals with each HDF5 dataset individually, some fundamental settings for the HDF5DataImporter are readily available. Additionally, the mapping parameter provides the information necessary to create the correct information in the HDF5DataImporter.mapping attribute.

Important

The keys in the dictionary provided via the mapping parameter are integers, not strings as usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names. To be explicit, here is an example:

dataset = HDF5Dataset()
importer_mapping = {
    0: "milliseconds",
    1: "data",
}
importer = self.get_hdf5_dataset_importer(
    dataset=dataset, mapping=importer_mapping
)
Of course, in reality you will not just instantiate an empty HDF5Dataset object, but have one available within your mapper.

- Parameters:
  - dataset (evedata.evefile.boundaries.eveh5.HDF5Dataset) – Representation of an HDF5 dataset.
  - mapping (dict) – Table mapping HDF5 dataset columns to data class attributes. Note: The keys in this dictionary are integers, not strings as usual for dictionaries. This allows using the keys directly for indexing the tuple returned by numpy.dtype.names.
- Returns:
  importer – HDF5 dataset importer
- Return type:
- map(source=None, destination=None)
Map the eveH5 file contents to evedata structures.
- Parameters:
  - source (evedata.evefile.boundaries.eveh5.HDF5File) – Python object representation of an eveH5 file
  - destination (evedata.evefile.boundaries.evefile.EveFile) – High(er)-level evedata structure representing an eveH5 file
- Raises:
  ValueError – Raised if either source or destination is not provided
- static set_basic_metadata(hdf5_item=None, dataset=None)
Set the basic metadata of a dataset from an HDF5 item.
The metadata attributes id, name, access_mode, and pv are set.

- Parameters:
  - hdf5_item (evedata.evefile.boundaries.eveh5.HDF5Item) – Representation of an HDF5 item.
  - dataset (evedata.evefile.entities.data.Data) – Data object the metadata should be set for