evedata.measurement.controllers.joining module

Ensure data and axes values are commensurate and compatible.

For each motor axis and detector channel, in the original eveH5 file only those values appear—together with a “position” (PosCount) value—that have actually been set or measured. Hence, the number of values (i.e., the length of the data vector) will generally be different for different devices. To be able to plot arbitrary data against each other, the corresponding data vectors need to be commensurate. If this is not the case, they need to be brought to the same dimensions (i.e., “joined”, originally somewhat misleadingly termed “filled”).

To be exact, being commensurate is a necessary but not a sufficient criterion: not only must the shapes agree, but the indices (in this case the positions) must be identical as well.
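The difference between "same length" and "same positions" can be illustrated with a minimal sketch. All values and variable names are made up for illustration and are not taken from a real measurement:

```python
# Hypothetical values: two devices with the same number of values,
# but recorded at different positions (PosCount).
channel_positions = [1, 2, 3, 4]
channel_values = [0.1, 0.2, 0.3, 0.4]
axis_positions = [1, 2, 4, 5]
axis_values = [10.0, 11.0, 12.0, 13.0]

# Equal length: the vectors could be plotted against each other ...
same_length = len(channel_values) == len(axis_values)
# ... but the values do not belong to the same positions.
same_positions = channel_positions == axis_positions

print(same_length)     # True
print(same_positions)  # False
```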

A bit of history

In the previous interface (EveFile), there are four “fill modes” available for data: NoFill, LastFill, NaNFill, LastNaNFill. From the documentation of eveFile:

NoFill

“Use only data from positions where at least one axis and one channel have values.”

Actually, not a filling, but mathematically an intersection, or, in terms of relational databases, an SQL INNER JOIN. In any case, data are reduced.

LastFill

“Use all channel data and fill in the last known position for all axes without values.”

Similar to an SQL LEFT JOIN with data left and axes right, but additionally explicitly setting the missing axes values in the join to the last known axis value.

NaNFill

“Use all axis data and fill in NaN for all channels without values.”

Similar to an SQL LEFT JOIN with axes left and data right. To be exact, the NULL values of the join operation will be replaced by NaN.

LastNaNFill

“Use all data and fill in NaN for all channels without values and fill in the last known position for all axes without values.”

Similar to an SQL OUTER JOIN, but additionally explicitly setting the missing axes values in the join to the last known axis value and replacing the NULL values of the join operation by NaN.

Furthermore, for the Last*Fill modes, snapshots are inspected for axes values that are newer than the last recorded axis in the main/standard section.

Note that none of the fill modes guarantees that there are no NaNs (or comparable null values) in the resulting data.
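The four fill modes can be sketched in plain Python as operations on position sets. This is only an illustration of the descriptions above, not the code of the previous interface: the data, the helper names last_known() and joined(), and the restriction to one axis and one channel are assumptions, and snapshots are ignored.

```python
import math

# Hypothetical raw data, keyed by position (PosCount).
axis = {2: 10.0, 4: 11.0}
channel = {1: 0.5, 2: 0.6, 3: 0.7, 5: 0.8}


def last_known(values, position):
    """Return the value at the largest key <= position, or NaN if none."""
    earlier = [key for key in values if key <= position]
    return values[max(earlier)] if earlier else math.nan


def joined(positions, fill_axis):
    """Return (position, axis value, channel value) for each position."""
    rows = []
    for position in sorted(positions):
        if fill_axis:
            axis_value = last_known(axis, position)
        else:
            axis_value = axis.get(position, math.nan)
        rows.append((position, axis_value, channel.get(position, math.nan)))
    return rows


# NoFill: intersection of positions (INNER JOIN).
no_fill = joined(axis.keys() & channel.keys(), fill_axis=False)
# LastFill: all channel positions, axis filled with last known value.
last_fill = joined(channel.keys(), fill_axis=True)
# NaNFill: all axis positions, missing channel values become NaN.
nan_fill = joined(axis.keys(), fill_axis=False)
# LastNaNFill: union of positions (OUTER JOIN), both fills combined.
last_nan_fill = joined(axis.keys() | channel.keys(), fill_axis=True)

for row in last_nan_fill:
    print(row)
```

In this sketch, position 1 has no earlier axis value, so even the Last*Fill results contain a NaN for the axis there.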

Important

The IDL Cruncher seems to use LastNaNFill combined with applying some “dirty” fixes to account for scans using MPSKIP and those scans “monitoring” a motor position via a pseudo-detector. The EveHDF class (DS) uses LastNaNFill as a default as well but does not apply any additional post-processing.

Shall fill modes be something to change in a viewer? And which fill modes are used in practice (and do we have any chance to find this out)?

How to deal with missing values?

Depending on the concrete situation, there may be no value available to fill a gap in an axis. How should this situation be handled?

Numeric values

For numeric values, some kind of “NaN” (not a number) could be used.

In NumPy, only floating-point dtypes have a dedicated “NaN”. Hence, in case of missing values, a masked array (numpy.ma.MaskedArray) is used, and numpy.ma.masked is set explicitly for those missing values. For all practical purposes, this should work similarly to numpy.nan. In particular, when plotting a numpy.ma.MaskedArray, the masked values are simply ignored. For further details on how to work with masked arrays, see the numpy.ma documentation.
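A minimal sketch of masking a missing value in an integer array, using only documented numpy.ma functionality. The values are hypothetical:

```python
import numpy as np

# Hypothetical integer axis values; the first one is actually missing.
values = np.array([0, 3, 3, 7])

# An integer array cannot hold NaN, so mask the missing entry instead.
axis = np.ma.MaskedArray(values)
axis[0] = np.ma.masked

print(axis.dtype.kind)  # 'i': the integer dtype is preserved
print(axis.count())     # number of unmasked values
print(axis.mean())      # masked entries are ignored in the reduction
```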

Non-numeric values

First of all: Does this situation occur in reality? Yes, there are axes with non-numeric values. But are these axes ever joined? If so, some textual value such as “N/A” (not available) may be used.

Note

For string dtypes, the default fill value of a numpy.ma.MaskedArray is “N/A”; numeric dtypes have other defaults. The fill value is (only) used when calling numpy.ma.MaskedArray.filled(). Otherwise, the masked values are in most cases simply ignored. For an overview of the default fill values of masked arrays, see the numpy.ma.MaskedArray.fill_value attribute.
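The dtype dependence of the default fill value can be checked directly with numpy.ma.default_fill_value(). The array of textual states below is a hypothetical example of a non-numeric axis:

```python
import numpy as np

# Default fill values depend on the dtype.
int_default = np.ma.default_fill_value(1)
float_default = np.ma.default_fill_value(1.0)
str_default = np.ma.default_fill_value("text")
print(int_default, float_default, str_default)

# Hypothetical non-numeric axis with one missing (masked) value.
states = np.ma.MaskedArray(
    ["open", "closed", "open"], mask=[False, True, False]
)
# filled() replaces masked entries by the fill value of the array.
filled = states.filled()
print(filled)
```

Note that the fill value has to fit into the string length of the array; here, all entries are longer than the three characters of the default.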

Join modes currently implemented

Currently, there is exactly one join mode implemented:

  • AxesLastFill

    Inflate axes to data dimensions, using the last known value for each missing value.

    If no previous axes value is available, convert the data into a numpy.ma.MaskedArray object and mask the value.

    This mode is equivalent to the “LastFill” mode described above.

For developers

To implement additional join modes, create a class inheriting from the Join base class and implement the actual joining in the private method _join().

There is a factory class JoinFactory that you can ask to get a Join object:

factory = JoinFactory()
join = factory.get_join(mode="AxesLastFill")

This would return an AxesLastFill object. For further details, see the JoinFactory documentation.
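How the base class, a subclass, and the factory might interact can be sketched with stand-in classes. Everything below is an assumption made for illustration: the classes only mimic the names used above, the AxesPad mode is hypothetical, the signatures are simplified (values are passed directly instead of device names), and none of it is the evedata implementation.

```python
class Join:
    """Stand-in base class (hypothetical, not the evedata code)."""

    def __init__(self, measurement=None):
        self.measurement = measurement

    def join(self, data=None, axes=None):
        # Public entry point delegating to the strategy-specific method.
        return self._join(data=data, axes=axes)

    def _join(self, data=None, axes=None):
        # The base class implements no strategy: return input unchanged.
        return [data, *axes]


class AxesPad(Join):
    """Hypothetical join mode: pad each axis with None to the data length."""

    def _join(self, data=None, axes=None):
        padded = [axis + [None] * (len(data) - len(axis)) for axis in axes]
        return [data, *padded]


class JoinFactory:
    """Stand-in factory mapping a mode name to the class of that name."""

    _modes = {"Join": Join, "AxesPad": AxesPad}

    def __init__(self, measurement=None):
        self.measurement = measurement

    def get_join(self, mode="Join"):
        # Hand the measurement over to the join instance.
        return self._modes[mode](measurement=self.measurement)


factory = JoinFactory(measurement="my_measurement")
join = factory.get_join(mode="AxesPad")
data, *axes = join.join(data=[1, 2, 3], axes=[[10, 20]])
print(data, axes)
```

The calling code only depends on the mode string, so switching to another join mode means changing that string and nothing else.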

Module documentation

class evedata.measurement.controllers.joining.Join(measurement=None)

Bases: object

Base class for joining data.

For each motor axis and detector channel, in the original eveH5 file only those values appear—together with a “position counter” (PosCount) value—that have actually been set or measured. Hence, the number of values (i.e., the length of the data vector) will generally be different for different devices. To be able to plot arbitrary data against each other, the corresponding data vectors need to be commensurate. If this is not the case, they need to be brought to the same dimensions (i.e., “joined”, originally somewhat misleadingly termed “filled”).

The main “quantisation” axis of the values for a device and the common reference is the list of positions. Hence, to join, first of all the lists of positions are compared, and gaps handled accordingly.

As there are different strategies for dealing with gaps in the positions list, there will generally be different subclasses of the Join class, each dealing with a particular strategy.

measurement

Measurement the Join should be performed for.

Although joining is carried out for a small subset of the device data of a measurement, additional information from the measurement may be necessary to perform the task.

Type:

evedata.measurement.boundaries.measurement.Measurement

Parameters:

measurement (evedata.measurement.boundaries.measurement.Measurement) – Measurement the join should be performed for.

Examples

Usually, joining takes place in the set_data() and set_axes() methods. Furthermore, a Measurement object will have a Join instance of the appropriate type. To join data, in this case of a detector channel and a motor axis, call join() with the respective parameters:

join = Join(measurement=my_measurement)
data, *axes = join.join(
    data=("SimChan:01", None),
    axes=(("SimMot:02", None),),
)

Note the two variables receiving the return value of the method, and in particular the starred *axes: it ensures that axes is always a list collecting all remaining returned elements, regardless of their count.
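The effect of the starred assignment is plain Python behaviour and can be tried without any measurement. The lists below are stand-ins for the return value of join(), with strings in place of arrays:

```python
# Stand-ins for the list returned by join(): data first, then the axes.
one_axis = ["data values", "axis 1 values"]
two_axes = ["data values", "axis 1 values", "axis 2 values"]

# With a single axis, the starred name is still a list.
data_1, *axes_1 = one_axis
print(axes_1)

# With several axes, it collects all of them.
data_2, *axes_2 = two_axes
print(axes_2)
```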

Important

While it may be tempting to use this class on your own and work further with the returned arrays, you will lose all metadata and context. Hence, simply don’t. Just use the interface provided by Measurement instead.

join(data=None, axes=None, scan_module='')

Harmonise data.

The main “quantisation” axis of the values for a device and the common reference is the list of positions. Hence, to join, first of all the lists of positions are compared, and gaps handled accordingly.

As there are different strategies for dealing with gaps in the positions list, there will generally be different subclasses of the Join class, each dealing with a particular strategy.

Parameters:
  • data (tuple | list) –

    Name of the device and its attribute data are taken from.

    If the attribute is set to None, data will be used instead.

  • axes (tuple | list) –

    Names of the devices and their attribute axes values are taken from.

    If an attribute is set to None, data will be used instead.

    Each element of the tuple/list is itself a two-element tuple/list with name and attribute.

  • scan_module (str) – Scan module ID the device belongs to

Returns:

data – Joined data and axes values.

The first element is always the data, the following the (variable number of) axes. To separate the two and always get a list of axes, you may call it like this:

data, *axes = join.join(...)

Return type:

list

Raises:
class evedata.measurement.controllers.joining.AxesLastFill(measurement=None)

Bases: Join

Inflate axes to data dimensions, using the last known value for each missing value.

This was previously known as “LastFill” mode and was described as “Use all channel data and fill in the last known position for all axes without values.” In SQL terms (relational database), this would be similar to a left join with data left and axes right, but additionally explicitly setting the missing axes values in the join to the last known axis value.

While the terms “channel” and “axis” have different meanings there than in the context of the joining module, the behaviour is qualitatively similar:

  • The device used as “data” is taken as reference and its values are not changed.

  • The values of devices used as “axes” are inflated to the same dimension as the data.

  • For values originally missing for an axis, the last value of the previous position is used.

  • If no previous value exists for a missing value, the data are converted into a numpy.ma.MaskedArray object and the values masked with numpy.ma.masked.

  • The snapshots are checked for values corresponding to the axis, and if present, are taken into account.

Of course, as in all cases, the (integer) positions are used as common reference for the values of all devices.
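The behaviour listed above can be sketched with NumPy for a single axis. This is a simplified illustration, not the implementation of this class: the function name axes_last_fill() and its signature are hypothetical, the axis positions are assumed to be sorted and non-empty, snapshots are not considered, and a masked array is always returned, even if nothing needs masking.

```python
import numpy as np


def axes_last_fill(data_positions, axis_positions, axis_values):
    """Inflate axis values to the data positions using the last known value."""
    axis_positions = np.asarray(axis_positions)
    axis_values = np.asarray(axis_values)
    # Index of the last axis position <= each data position (-1 if none).
    indices = np.searchsorted(axis_positions, data_positions, side="right") - 1
    missing = indices < 0
    # Clip to a valid index; the affected entries get masked below.
    filled = axis_values[np.clip(indices, 0, None)]
    return np.ma.MaskedArray(filled, mask=missing)


# Hypothetical positions (PosCount) and axis values.
data_positions = [1, 2, 3, 4, 5]
axis = axes_last_fill(
    data_positions, axis_positions=[2, 4], axis_values=[10, 11]
)
print(axis)
```

The first data position precedes all axis positions, hence its axis value is masked, and the integer dtype of the axis values is preserved.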

Important

If there is more than one snapshot, the newest snapshot preceding the current axis position should always be used. Check whether this is implemented already.

measurement

Measurement the join should be performed for.

Although joining is carried out for a small subset of the device data of a measurement, additional information from the measurement may be necessary to perform the task, e.g., the snapshots.

Type:

evedata.measurement.boundaries.measurement.Measurement

Parameters:

measurement (evedata.measurement.boundaries.measurement.Measurement) – Measurement the join should be performed for.

Examples

See the Join base class for examples – and replace the class name accordingly.

join(data=None, axes=None, scan_module='')

Harmonise data.

The main “quantisation” axis of the values for a device and the common reference is the list of positions. Hence, to join, first of all the lists of positions are compared, and gaps handled accordingly.

As there are different strategies for dealing with gaps in the positions list, there will generally be different subclasses of the Join class, each dealing with a particular strategy.

Parameters:
  • data (tuple | list) –

    Name of the device and its attribute data are taken from.

    If the attribute is set to None, data will be used instead.

  • axes (tuple | list) –

    Names of the devices and their attribute axes values are taken from.

    If an attribute is set to None, data will be used instead.

    Each element of the tuple/list is itself a two-element tuple/list with name and attribute.

  • scan_module (str) – Scan module ID the device belongs to

Returns:

data – Joined data and axes values.

The first element is always the data, the following the (variable number of) axes. To separate the two and always get a list of axes, you may call it like this:

data, *axes = join.join(...)

Return type:

list

Raises:
class evedata.measurement.controllers.joining.JoinFactory(measurement=None)

Bases: object

Factory for getting the correct join object.

For background on the need for joining, see the documentation of the entire joining module, and of the Join class.

Once you have decided which type of join to apply to your data, this factory class lets you get the correct join instance without hassle. You can even change your mind later without having to change any other code, which is the whole idea behind factories.

measurement

Measurement the join should be performed for.

Type:

evedata.measurement.boundaries.measurement.Measurement

Parameters:

measurement (evedata.measurement.boundaries.measurement.Measurement) – Measurement the join should be performed for.

Examples

Getting a join object is as simple as calling a single method on the factory object:

factory = JoinFactory()
join = factory.get_join(mode="AxesLastFill")

This will provide you with the appropriate AxesLastFill instance.

As joins need a Measurement object, you can set one on the factory, and it will be added automatically to the join instance for you:

factory = JoinFactory(measurement=my_measurement)
join = factory.get_join(mode="AxesLastFill")

Thus, when used from within a Measurement object, set the measurement attribute to self.

get_join(mode='Join')

Obtain a Join instance for a particular mode.

If no mode is provided, this defaults to the base class. As the Join base class does not implement any functionality, this is rather useless.

If the measurement attribute is set, it is automatically set in the Join instance returned.

Parameters:

mode (str) –

Join mode to return a Join instance for.

Default: “Join”

Returns:

join – Join instance

Return type:

Join