...

This document attempts to summarize the work and resources that would be needed to set up an effective EXO-200 analysis environment on the SRCF computing facility, under the assumption that the user doing the analysis does not have a SLAC account.

There are five major requirements, each of which is addressed below, namely access to:

  • Raw data -- obtained at WIPP but immediately copied to and archived at SLAC
  • Processed data -- obtained by running several stages of reconstruction and analysis algorithms on the raw data and also stored at SLAC. The algorithms are often refined and so the processed data is frequently regenerated from the raw data by re-running the algorithms on the SLAC compute farm. 
  • The source code used for reconstruction and analysis, both to understand what it currently does and to work on improvements (which is often an integral part of a student's research).
  • Compute cycles to run analysis on the processed data, typically obtained from the SLAC compute cluster
  • Collaboration documents and the wiki where analysis techniques are discussed, as well as EXO websites where data quality is summarized

In the following discussion we have assumed that the user is able to get an SRCF account and compute resources through EXO's affiliation with Stanford (Giorgio Gratta, PI).

...

The EXO-200 source code is stored in a subversion repository at SLAC. Up to now we have used "svn+ssh" access to the repository, since this is the simplest mechanism when all users already have SLAC unix accounts. For other experiments we have used direct subversion access, which only requires that users have an entry in a list of "virtual" accounts, very similar to the "collaboration enclave account" described above. Assuming that this access mode is acceptable, setting up similar access for EXO should be straightforward.
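For illustration, the two access modes differ only in the checkout URL. A minimal sketch follows; the repository path shown is a hypothetical placeholder, not the actual EXO repository location:

    # Current mode -- requires a SLAC unix account:
    svn checkout svn+ssh://svn.slac.stanford.edu/exo/offline/trunk offline

    # Direct access mode -- requires only an entry in the "virtual" account list:
    svn checkout svn://svn.slac.stanford.edu/exo/offline/trunk offline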

Work required: <1 day (not including any required cyber security review)

...

  1. Copy the entire data set. This has the disadvantage that considerable disk space would be required at SRCF (currently ~150 TB). It would also require some mechanism for synchronizing the data when it is updated at SLAC (see the sketch after this list).
  2. Give (a subset of) the SRCF computers direct access to the EXO-200 data at SLAC. This would require using the (existing but currently unused?) fiber between Building 50 and SRCF, and configuring some kind of trusted access between the two computer centers.
  3. Use the same xrootd access that we currently use to make the EXO data accessible at NERSC (see the sketch after this list). This would almost certainly work, but may require more work than option 2 and may not give as good performance. We would also need to understand the data access limitations on the SRCF batch farm and ensure that it can reach the SLAC xrootd proxy server (or perhaps run a second data access node at SRCF).
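To make options 1 and 3 concrete, minimal sketches of what the data access might look like from an SRCF node are shown below. All host names and paths are hypothetical placeholders; the real values would come from the existing SLAC and NERSC configurations.

    # Option 3: copy a processed file on demand via the SLAC xrootd proxy
    # (hypothetical redirector host and file path):
    xrdcp root://exo-xrootd.slac.stanford.edu//exo/data/processed/run0001.root /scratch/

    # Option 1: periodic synchronization of a full local copy, e.g. via rsync
    # (hypothetical source host and paths):
    rsync -av --delete exo-data.slac.stanford.edu:/exo/data/ /srcf/exo/data/

With option 3, analysis jobs could also read the root:// URLs directly (e.g. via ROOT's TFile::Open), avoiding any local copy entirely.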

...

Setting up such a facility does seem possible and would require fairly modest resources, but these resources would nonetheless need to be identified (presumably with a budget code to cover the work). In addition, to ensure the required work happens in a timely way, some effort would need to be put into spearheading and coordinating it.

...