Introduction

Matlab provides both a high-level and a low-level interface to the HDF5 library. The high-level functions include h5read and h5write for reading from and writing to HDF5 files. Many of our datasets are one-dimensional arrays of a compound type. A compound type is a well-defined concept in HDF5; it is like a C struct. In hdfview (a useful tool for viewing HDF5 files, provided by the HDF Group) such datasets often look like two-dimensional arrays, but the columns are really the fields of the compound type. When you read such a dataset with h5read, each field is returned as its own 1D array. For instance, if the dataset looks like

    fieldA  fieldB
0   101     23.3
1   110     99.1
2   784     13.3

in hdfview, then h5read will return a Matlab struct with two fields:

dataset =

      fieldA: [3x1 uint16]
      fieldB: [3x1 single]
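
Because each field comes back as its own column vector, a single row of the original dataset is spread across the fields. The following is a minimal sketch of working with such a struct. It builds the struct by hand from the three-row example above so that it runs without an HDF5 file; the variable names k, mask and selectedB are only illustrative.

```matlab
% Stand-in for the struct h5read returns for the three-row example
% (constructed by hand here so the sketch runs without an HDF5 file).
dataset.fieldA = uint16([101; 110; 784]);
dataset.fieldB = single([23.3; 99.1; 13.3]);

% Row k of the original dataset is spread across the fields.
k = 2;
rowA = dataset.fieldA(k);
rowB = dataset.fieldB(k);

% A logical mask built from one field selects the matching rows of another.
mask = dataset.fieldA > 105;
selectedB = dataset.fieldB(mask);

% The number of rows is the length of any one field.
nrows = numel(dataset.fieldA);
```

Note that 32-bit floating point data has the Matlab class single.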

Matlab Issue with Enums

When a Matlab user loads a dataset that is an array of enums with h5read, Matlab returns a cell array of strings, so the user can work with the enum names directly. However, Matlab (as of release R2013b) does not do this for our datasets, where the enum is a field within a compound type. We have reported this problem to MathWorks and they understand the issue. Users for whom enum-to-string translation is an important feature should feel free to contact MathWorks and reference the service request we filed: 1-O0DQBH, enum field in compound type of hdf5.

In the meantime, MathWorks has provided some workaround code, which we have reworked into a function that users can call. You can download the file here:

translate_enums.m

It will also be available from the src directory of the h5tools package of the analysis release: /reg/g/psdm/sw/releases/ana-current/h5tools/src (as of today, 10/3/2013, it is not yet part of ana-current, but will be soon). Once you have the file, add it to a directory on your Matlab search path and call the function translate_enums to have the enum fields translated into string fields. Use it as follows:

filename='somefile.h5';
datasetname='/path/to/dataset';
ds = h5read(filename,datasetname);
ds = translate_enums(ds, filename, datasetname);

If ds is a dataset whose base type is a compound type, translate_enums will replace fields that are enums with their string counterparts. For example, suppose one had

ds
  field1  [3x1 int16]
  field2  [3x1 single]

where field1 holds values of an enum with members ONE=1, TWO=2, THREE=3, and contains the array [2,1,3]. Then the output of translate_enums will be

ds
  field1  {3x1 cell}
  field2  [3x1 single]

where field1 is now

{ 'TWO'; 'ONE'; 'THREE' }
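
For intuition, the translation amounts to a lookup from enum values to enum names. The following is a minimal sketch of that lookup for the example above. It is not the code in translate_enums.m: the hand-written names and values lists are assumptions for illustration, whereas translate_enums is given the filename and dataset name rather than hard-coded lists.

```matlab
% Hypothetical enum definition, written out by hand for this sketch.
names  = {'ONE', 'TWO', 'THREE'};
values = int16([1 2 3]);

% The integer field as h5read would return it in the example.
field1 = int16([2; 1; 3]);

% Find the position of each stored value in the list of enum values.
[found, idx] = ismember(field1, values);
assert(all(found), 'field contains a value that is not in the enum');

% Look up the names and keep the 3x1 column shape of the original field.
translated = reshape(names(idx), [], 1);

% The strings can then be used to select rows, for example with strcmp.
isTwo = strcmp(translated, 'TWO');
```

Here isTwo is a logical column vector that can be used to index the other fields of ds.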

Performance

One challenge Matlab users face is how Matlab handles the large datasets in the LCLS files. If a dataset is too large to fit in memory, you will have to read a subset of the rows at a time; h5read accepts optional start, count and stride arguments for this.
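
As a rough illustration of reading a dataset in pieces, the sketch below loops over a 1D dataset in fixed-size blocks using the start and count arguments of h5read. To stay self-contained it first writes a small temporary file containing a plain array of doubles; that file, the dataset name /values and the block size are assumptions for the example, not LCLS data. The same start and count arguments apply when the dataset is a compound type.

```matlab
% Build a small stand-in file so the sketch is self-contained.
% (Hypothetical data: a plain 1D array of doubles, not a compound type.)
filename = [tempname '.h5'];
datasetname = '/values';
nrows = 1000;
h5create(filename, datasetname, nrows);
h5write(filename, datasetname, (1:nrows)');

% Ask the file how many rows the dataset has instead of hard-coding it.
info = h5info(filename, datasetname);
total = info.Dataspace.Size(1);

% Read the dataset block by block and accumulate a running sum, so that
% only one block is held in memory at a time.
blocksize = 256;
runningSum = 0;
for start = 1:blocksize:total
    count = min(blocksize, total - start + 1);   % last block may be short
    block = h5read(filename, datasetname, start, count);  % start is 1-based
    runningSum = runningSum + sum(block);
end

% Remove the temporary file.
delete(filename);
```

A sensible block size depends on the size of each row and on the memory available, so the value used here should be treated as a placeholder.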
