h2. References

From Core Minutes March 20, 2007:

{color:#009999}ROOT data indexing:{color} (Eric C.) A ROOT-based metadata scheme (used by BABAR) containing pointers to locate all known data for a given event. The basic idea is to store {File, Tree, Entry} in an external table, pointed to by a single pointer in an NTuple (index file), with some additional flexibility (it works with xrootd, including when data has been migrated to tape). Tools to read and copy the event data are also provided. See https://confluence.slac.stanford.edu/download/attachments/20011/Event_Collections.pdf. The BABAR code is all online in their CVS repository, module KanEvent. This scheme might find application in GLAST for interleave, skimming, and analysis scenarios.
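The core of the scheme can be sketched in a few lines. This is a minimal illustration of the {File, Tree, Entry} indirection only, not the BABAR KanEvent code; all names and values below are invented for the example.

```python
# Illustrative sketch of the {File, Tree, Entry} index idea: an external
# table holds one row per event, and the NTuple index file stores only a
# single integer pointer into that table.

# External table: one (file, tree, entry) triple per known event.
index_table = [
    {"file": "run1001.root", "tree": "Events", "entry": 0},
    {"file": "run1001.root", "tree": "Events", "entry": 1},
    {"file": "run1002.root", "tree": "Events", "entry": 0},
]

def locate(pointer):
    """Resolve the single integer pointer stored in the NTuple index file
    into the (file, tree, entry) triple needed to read the event data."""
    row = index_table[pointer]
    return row["file"], row["tree"], row["entry"]
```

Because the NTuple stores only the pointer, the table rows can be updated in place (e.g. when a file migrates to tape) without rewriting the index file itself.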

h2. Relational DB option 

In a nutshell: while at the collaboration meeting it became apparent that the goals for this "system" were rapidly changing and the scope of the concept was increasing dramatically. For example, the pipeline people would like to be able to categorize events (pass CNO filter, pass MIP filter, GCR event, etc.), and they think it would be natural to just write out an event/run number to do this, then use the "system" to read the events back. In thinking about it a bit more, it seems to me that the problem neatly divides into two pieces. One piece, given a run and event number (which I am told are the unique identifiers for all events), returns the information on where to find the actual data. The other piece is the code that, given the run and event number and the information on where to find the data, returns the actual data requested (which can be various ROOT trees - mc, relation, recon, digi, etc., or ntuples, or...).
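The two-piece split described above can be sketched as a pair of functions: a locator and a reader. Everything here (the catalog layout, file and tree names) is a hypothetical illustration of the division of labor, not an existing interface.

```python
# Piece one: map the unique (run, event) identifiers to location info
# for every data type associated with that event.
def locate_event(run, event, catalog):
    return catalog[(run, event)]

# Piece two: given location info, return the requested data.  A real
# implementation would open the ROOT file and read the tree entry; this
# sketch just reports what would be read.
def read_event(location, data_type):
    loc = location[data_type]
    return "read entry %d of tree %s in %s" % (
        loc["entry"], loc["tree"], loc["file"])

# A toy catalog with one event and two data types (names invented).
catalog = {
    (77, 1234): {
        "recon": {"file": "r0077_recon.root", "tree": "Recon", "entry": 41},
        "digi":  {"file": "r0077_digi.root",  "tree": "Digi",  "entry": 41},
    }
}
```

The point of the split is that the locator can be swapped out (flat file, ROOT index, relational database) without touching the reader, and vice versa.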

...


My argument is that the first piece is best done with a relational database. With a relational database you would use the run/event number as the key and enter, only once, all the information on where to find the various bits of data associated with it. In addition, you can have a few more bits of information that further categorize the event, which can be used during a query to identify events in some particular way. I think this type of system will be far more extensible and much easier to manage than a pile of ROOT files trying to do the same sort of thing. In addition, it would be straightforward for the pipeline guys to hook into this automatically to fill it. I also think it would be very easy to transport the database to other installations which might be repositories of large datasets (e.g. Lyon).
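As a concrete sketch of this relational approach, the table below keys rows on (run, event, data type) and carries extra category bits for filter-based queries. This uses SQLite purely for illustration; the schema, column names, and values are invented, not an existing GLAST design.

```python
import sqlite3

# In-memory database standing in for a shared relational server.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE event_index (
        run        INTEGER,
        event      INTEGER,
        data_type  TEXT,      -- mc, relation, recon, digi, ...
        file       TEXT,
        tree       TEXT,
        entry      INTEGER,
        pass_cno   INTEGER,   -- category bits, filled once by the pipeline,
        pass_mip   INTEGER,   -- usable later to select events in a query
        PRIMARY KEY (run, event, data_type)
    )
""")
con.execute(
    "INSERT INTO event_index VALUES "
    "(77, 1234, 'recon', 'r0077_recon.root', 'Recon', 41, 1, 0)")

# Lookup: where is the recon data for run 77, event 1234?
row = con.execute(
    "SELECT file, tree, entry FROM event_index "
    "WHERE run=? AND event=? AND data_type=?",
    (77, 1234, "recon")).fetchone()

# Categorized query: every event that passed the CNO filter.
cno_events = con.execute(
    "SELECT DISTINCT run, event FROM event_index WHERE pass_cno=1"
).fetchall()
```

The category columns are what makes this more extensible than a pile of index files: adding a new event class is one ALTER TABLE plus an UPDATE, with no files rewritten.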

...


In any case, I think this approach also neatly divides the problem into two pieces, which may well make it much easier to implement.

...


And, finally, I also discovered that Joanne has already provided all the tools necessary for implementing such an approach, including some very nice GUI tools for looking at what is in there. There would be a cost involved in understanding her work and then wrapping what we want to do around the outside of it, but this would be far simpler than trying to invent something ourselves.

...


I have the BaBar code downloaded on my laptop. I had started to think about building it but realized I needed some include files. When I asked Tom where I might find them, he cautioned me that I was beginning to pull on a very long string and it might be best not to try to do that. So... whatever we decide to do, it sounds like our best approach may be to take the concept but do our own implementation.

...


Tracy

h2. Use Cases


h3. Level 0 Interleave


h3. Pruner Skimming


h3. User Analysis