A data catalog has been created to provide a navigable repository for ILC files. The database structure allows for arbitrary metadata to be associated with the files, and includes a download link.

...

A Java crawler is being developed to automatically add existing STDHEP and SLCIO files to the database. It uses the filenames and event headers to collect metadata (e.g. energy, number of events, detector name), which is stored alongside each file.

The metadata collected (where available) is:

  • Energy
  • Polarization
  • SLIC and Geant4 version
  • Number of events
  • Detector
  • A list of all the collections stored within an SLCIO file
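
As an illustration of the filename half of this step, the following is a minimal sketch of pulling metadata out of a file name with a regular expression. The naming convention, the class name `FilenameMetadata` and the metadata keys are all assumptions made for the example; they are not taken from the crawler, and real ILC file names may be laid out differently. Event-header metadata (number of events, collection names) would need an LCIO or STDHEP reader and is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FilenameMetadata {
    // ASSUMED naming convention (hypothetical, for illustration only):
    //   <process>_E<energy>_SLIC-<version>_geant4-<version>_<detector>.<stdhep|slcio>
    private static final Pattern NAME = Pattern.compile(
        "(?<process>[^_]+)_E(?<energy>\\d+)"
        + "_SLIC-(?<slic>[^_]+)_geant4-(?<geant4>[^_]+)"
        + "_(?<detector>[^_.]+)\\.(?<ext>stdhep|slcio)");

    // Returns the metadata found in the name, or an empty map when the
    // name does not follow the assumed convention ("where available").
    public static Map<String, String> parse(String fileName) {
        Map<String, String> meta = new LinkedHashMap<>();
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return meta;
        }
        meta.put("process", m.group("process"));
        meta.put("energy", m.group("energy"));
        meta.put("slicVersion", m.group("slic"));
        meta.put("geant4Version", m.group("geant4"));
        meta.put("detector", m.group("detector"));
        meta.put("format", m.group("ext"));
        return meta;
    }

    public static void main(String[] args) {
        // Hypothetical file name that follows the assumed convention.
        System.out.println(parse("ttbar_E500_SLIC-v2r5p3_geant4-v9r1p2_sid01.slcio"));
    }
}
```

Returning an empty map for unrecognised names keeps the "where available" behaviour: a file can still be registered with no metadata attached.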

At present, the program simply takes each file supplied to it and generates a shell script of commands using the data catalog API's registerDataset method to add entries to the catalog.
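
The script-generation step could be sketched roughly as follows. This is not the crawler's actual code: the class name `RegisterScriptWriter`, the command prefix passed in by the caller, and the `key=value` argument layout are placeholders, and the real registerDataset invocation should be taken from the data catalog documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RegisterScriptWriter {
    // Wraps a value in single quotes so the shell treats it literally;
    // embedded single quotes are closed, escaped and reopened.
    public static String quote(String s) {
        return "'" + s.replace("'", "'\\''") + "'";
    }

    // Emits one command line per file: the prefix, the file path, then one
    // key=value argument per metadata entry.
    // ASSUMPTION: this argument layout is a placeholder, NOT the real
    // registerDataset syntax; adapt it to the actual catalog client.
    public static String buildScript(String commandPrefix,
                                     Map<String, Map<String, String>> files) {
        StringBuilder sb = new StringBuilder("#!/bin/sh\n");
        for (Map.Entry<String, Map<String, String>> file : files.entrySet()) {
            sb.append(commandPrefix).append(' ').append(quote(file.getKey()));
            for (Map.Entry<String, String> kv : file.getValue().entrySet()) {
                sb.append(' ').append(quote(kv.getKey() + "=" + kv.getValue()));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("energy", "500");
        meta.put("detector", "sid01");

        Map<String, Map<String, String>> files = new LinkedHashMap<>();
        files.put("/data/example.slcio", meta); // hypothetical path

        // "register-placeholder" stands in for the real catalog command.
        System.out.print(buildScript("register-placeholder", files));
    }
}
```

Quoting every argument matters here because file paths and metadata values come from the files themselves and may contain spaces or shell metacharacters.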

...

Each file is then registered in the catalog. The final result can be seen at http://srs.slac.stanford.edu/DataCatalog/folder.jsp?folder=573739.

Limitations

Parsing each file and then adding it to the catalog is a lengthy process; for large directories it takes several minutes.

There is also a bug that dumps several "Not creating second sensor" error reports to the command line every time the crawler checks a file's event header.