  • xrootd
    • Requires more bookkeeping because it has no "ls".
    • It's not a drop-in replacement for a traditional filesystem, and requires rethinking assumptions.
    • Could stage input or output files in parallel.
  • Combine steps
    • Reduces ability to roll back errors.
    • Increases latency.
  • Varying crumb size
    • Makes lots of small crumbs, so digi files get read many times.
  • Varying chunk size
    • Lots of small chunks mean more jobs are reading in parallel at the start of processing, but it does not increase the amount of data that's read.
    • If there are more chunks than available cores, that automatically throttles I/O somewhat.
  • Use scratch disks
    • Need to be able to leave files on scratch for a couple of hours without having a process running.
    • Need to be able to copy files between batch machines with a process only at the receive end of the transfer.
    • Scalable
      • But maybe it doesn't have to scale; it just has to work. It's not like we're going to get mentioned on Slashdot and suddenly have 100x the data flowing in.
  • Don't stage files stored on AFS to/from scratch
    • AFS's internal caching means that staging copies the data twice.
    • This may be particularly useful for recon, where crumb jobs don't use the whole input file.
  • PROOF
    • Seems deeply tied to xrootd. Does it need xrootd running on the batch host?
  • Event collections
    • During processing
      • Avoid crumb-to-chunk merges
      • Do SVAC at crumb level
    • Long-term
      • Potentially don't need to merge anything; just store chunk- and crumb-level files
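
The "Varying chunk size" point above can be illustrated with a little arithmetic. This is only a rough sketch under stated assumptions: each chunk job reads just its own chunk, each job occupies one core, and the function names and numbers are hypothetical, not taken from the pipeline.

```python
def concurrent_readers(n_chunks: int, n_cores: int) -> int:
    """Jobs that can read at the start of processing: one per chunk,
    capped by the number of available cores (assumed one job per core)."""
    return min(n_chunks, n_cores)


def total_bytes_read(run_bytes: int, n_chunks: int) -> int:
    """Split a run into n_chunks nearly equal chunks and sum their sizes.
    The total is the same however finely the run is split."""
    base, extra = divmod(run_bytes, n_chunks)
    chunk_sizes = [base + 1] * extra + [base] * (n_chunks - extra)
    return sum(chunk_sizes)


if __name__ == "__main__":
    run_bytes = 10**9  # illustrative run size, not a measured value
    n_cores = 200      # illustrative core count
    for n_chunks in (50, 200, 500):
        print(n_chunks,
              concurrent_readers(n_chunks, n_cores),
              total_bytes_read(run_bytes, n_chunks))
```

With more chunks than cores, the number of simultaneous readers stops growing, which is the automatic throttling mentioned in the list; the total volume read stays constant throughout.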

Using AFS without staging seems very promising to me; I intend to try that first.

We might need a lot less temporary storage if we put chunk-level files from incomplete runs on long-term storage. Alternatively, we could add another level of storage: crumb (AFS or SSD), new chunk (AFS or SSD), old chunk (NFS), run (xrootd). Old chunks are only used in the merge steps, which are relatively undemanding of I/O (it's all serial).
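
The proposed storage levels could be written down as a simple lookup table. This is a minimal sketch of the idea only; the `Level` enum, `TIERS` table, and `candidate_storage` helper are hypothetical names for illustration, and the tier assignments are just the ones listed in the paragraph above.

```python
from enum import Enum


class Level(Enum):
    """File levels named in the notes (hypothetical enum for illustration)."""
    CRUMB = "crumb"
    NEW_CHUNK = "new chunk"
    OLD_CHUNK = "old chunk"
    RUN = "run"


# Candidate storage for each level, as proposed in the text.
TIERS = {
    Level.CRUMB: ("AFS", "SSD"),
    Level.NEW_CHUNK: ("AFS", "SSD"),
    Level.OLD_CHUNK: ("NFS",),
    Level.RUN: ("xrootd",),
}


def candidate_storage(level: Level) -> tuple:
    """Return the storage options proposed for a given file level."""
    return TIERS[level]


if __name__ == "__main__":
    for level in Level:
        print(f"{level.value}: {' or '.join(candidate_storage(level))}")
```

Keeping the mapping in one table would make it easy to try a different tier for old chunks without touching the rest of the pipeline.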