Adding random access to LCIO files.

Goals

Allow efficient access to specific events in LCIO files. Events should be selectable by

...

If the file is written with random access support enabled, the first record in the file will be an LCIORandomAccess record that describes the entire file. This record then points to one or more LCIORandomAccess records elsewhere in the file, each of which has an associated LCIOIndex record. The LCIOIndex record contains the location of each EventHeader/RunHeader record in the file.
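
The lookup path implied by this layout can be sketched in memory as follows. This is only an illustration under stated assumptions: the class and field names (RandomAccessLayoutSketch, Block, IndexEntry, run, event, offset) are hypothetical and are not the LCIO record definitions, and the sketch assumes that each LCIORandomAccess record summarises the (run, event) range covered by its LCIOIndex record and that index entries are sorted by run and then event.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalLong;

/** Hypothetical in-memory model of the layout; not the LCIO on-disk format. */
final class RandomAccessLayoutSketch {

    /** One index entry: a (run, event) key and the file offset of its record. */
    static final class IndexEntry {
        final int run;
        final int event;
        final long offset;

        IndexEntry(int run, int event, long offset) {
            this.run = run;
            this.event = event;
            this.offset = offset;
        }
    }

    /** Stands in for one LCIORandomAccess record together with its LCIOIndex record. */
    static final class Block {
        final IndexEntry first;
        final IndexEntry last;
        final List<IndexEntry> index;

        /** Entries are assumed to be sorted by run, then event, and non-empty. */
        Block(List<IndexEntry> sortedEntries) {
            this.index = new ArrayList<>(sortedEntries);
            this.first = index.get(0);
            this.last = index.get(index.size() - 1);
        }

        /** True if the key lies inside the range summarised by this block. */
        boolean covers(int run, int event) {
            return compare(first.run, first.event, run, event) <= 0
                && compare(run, event, last.run, last.event) <= 0;
        }
    }

    /** Orders keys by run first, then by event. */
    static int compare(int runA, int eventA, int runB, int eventB) {
        int byRun = Integer.compare(runA, runB);
        return byRun != 0 ? byRun : Integer.compare(eventA, eventB);
    }

    /** Returns the file offset of the requested event, or empty if it is not indexed. */
    static OptionalLong locate(List<Block> blocks, int run, int event) {
        for (Block block : blocks) {
            if (!block.covers(run, event)) {
                continue; // a real reader would not need to load this block's index
            }
            // Binary search inside the one index block whose range matches.
            int lo = 0;
            int hi = block.index.size() - 1;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                IndexEntry entry = block.index.get(mid);
                int c = compare(entry.run, entry.event, run, event);
                if (c == 0) {
                    return OptionalLong.of(entry.offset);
                }
                if (c < 0) {
                    lo = mid + 1;
                } else {
                    hi = mid - 1;
                }
            }
        }
        return OptionalLong.empty();
    }
}
```

In this sketch the range check is what keeps the cost low: only the small summary of each block is inspected until a matching range is found, and only then is that block's index searched.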

Reading files with random access records.

TBD

Implementation notes

  • The purpose of having two types of records is that the (small) LCIORandomAccess records can be written uncompressed. This allows them to be read quickly, and updated after they are written using the SIO random access mechanism described below. The larger LCIOIndex records can be written with compression turned on since they never need to be updated once they are written.
  • The reason for having multiple LCIOIndex records in a file is to avoid having to hold a single, potentially unbounded index block in memory. With something like 10-100k records per index block, access to records is still fast, without huge memory usage even for very large files or chains of files. This also makes appending to an existing file much easier.
  • Older LCIO readers should ignore/skip the LCIOIndex/LCIORandomAccess records.
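
The bounded index blocks described in the notes above could be produced on the writing side roughly as follows. This is a minimal sketch, assuming a hypothetical helper (IndexBlockBuffer) that collects record locations and hands each full block to a caller-supplied sink; in a real writer the sink would write an LCIOIndex record, but nothing here is part of LCIO or SIO.

```java
import java.util.Arrays;
import java.util.function.Consumer;

/** Hypothetical writer-side helper that groups record locations into bounded blocks. */
final class IndexBlockBuffer {
    private final long[] current;
    private final Consumer<long[]> blockSink;
    private int used;

    /**
     * @param maxEntriesPerBlock upper bound on entries held in memory at once
     * @param blockSink receives each completed block, e.g. to write it to the file
     */
    IndexBlockBuffer(int maxEntriesPerBlock, Consumer<long[]> blockSink) {
        if (maxEntriesPerBlock <= 0) {
            throw new IllegalArgumentException("maxEntriesPerBlock must be positive");
        }
        this.current = new long[maxEntriesPerBlock];
        this.blockSink = blockSink;
    }

    /** Notes the file offset of a newly written record; emits the block once it is full. */
    void add(long recordOffset) {
        current[used++] = recordOffset;
        if (used == current.length) {
            flush();
        }
    }

    /** Emits the partially filled block, for example when the file is closed. */
    void flush() {
        if (used == 0) {
            return; // nothing pending
        }
        blockSink.accept(Arrays.copyOf(current, used));
        used = 0;
    }
}
```

With a limit of 10 entries, adding 25 locations and then calling flush() yields three blocks of 10, 10 and 5 entries, and the buffer never holds more than 10 entries at a time. Appending to an existing file fits the same pattern, since new blocks are simply emitted after the existing ones.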

Changes needed in SIO

Small changes are needed in the SIO library to support random access.

  • Changes for writing files
    • When creating a record, return the location of the record in the file.
    • Allow existing records to be overwritten. The new record must be exactly the same size as the existing record (or possibly smaller records could be allowed). If smaller records are not allowed, the file format on disk is completely unchanged. If smaller records are allowed, then some care is needed in interpreting record lengths, which may break older readers. Because the size of a compressed record cannot be predicted, replacing compressed records is not recommended (or allowed?).
    • Optionally allow space to be reserved for a future record (not required for LCIO).
  • Changes for reading
    • Allow record at a given location to be read.
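
The three operations listed above (return a record's location on write, overwrite a record of identical size, read a record at a given location) can be illustrated with the standard java.io.RandomAccessFile class. This is a sketch only: RecordFileSketch and its simple length-prefixed layout are assumptions made for the example and are neither the SIO record format nor the API of the updated SIO library.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical length-prefixed record file; illustrates the operations, not the SIO format. */
final class RecordFileSketch implements AutoCloseable {
    private final RandomAccessFile file;

    RecordFileSketch(String path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
    }

    /** Appends a record and returns the file offset at which it starts. */
    long append(byte[] payload) throws IOException {
        long location = file.length();
        file.seek(location);
        file.writeInt(payload.length); // 4-byte length prefix
        file.write(payload);
        return location;
    }

    /** Replaces the record at the given location; the size must match exactly. */
    void overwrite(long location, byte[] payload) throws IOException {
        file.seek(location);
        int existingLength = file.readInt();
        if (existingLength != payload.length) {
            throw new IllegalArgumentException(
                "replacement must be " + existingLength + " bytes, got " + payload.length);
        }
        // The file pointer now sits just past the length prefix.
        file.write(payload);
    }

    /** Reads the record that starts at the given location. */
    byte[] read(long location) throws IOException {
        file.seek(location);
        byte[] payload = new byte[file.readInt()];
        file.readFully(payload);
        return payload;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Insisting on an identical size in overwrite mirrors the stricter option above: the bytes on disk before and after the replaced record are untouched, so a reader that scans the file sequentially sees the same record boundaries as before.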

A prototype updated version of the Java SIO library is available which incorporates these modifications. The documentation is here. Newly added methods are marked "since 2.1".