With Data Management as part of the strategic plan for Lab-wide computing, a workshop would be a good stepping stone toward bringing the data management community together and identifying common tools and directions. Data management is loosely defined here as moving, storing, cataloguing and accessing files.
In broad brush:
- get more detail on existing solutions at SLAC: Fermi, LCLS, BABAR...
- interview projects in need of DM solutions, e.g. cosmology computing; maybe someone like Todd Martinez; SSRL, LCLS users etc.
- bring together the people from the first bullet to discuss common solutions that could be recommended.
- think about the evolution of current DM solutions toward a more common base over some timeframe?
Time frame: Feb 2012
Duration: 2 days
Tony Johnson has listed these items (with the odd edit) as potentially of interest:
- Disk and tape storage technologies and file systems (Lustre, xrootd, etc.)
- The aspects of DAQ concerned with pushing the data out
- Transfer from DAQ to long term storage
- The data access patterns, analysis and reconstruction bandwidth
- Data storage formats (root, fits, hdf5, databases, ...). Optimization of data storage formats for SSD
- Need for scaling storage to support massive parallelization (multiple simultaneous write streams, massively parallel read streams)
- Data Catalogs, e.g. iRODS etc.
- Offsite access to data, e.g. the Fermi download manager or Globus Online
- Skimming tools, e.g. the Fermi skimmer, astro server
- Long-term data archive, "Cloud" storage