- xrootd
- Requires more bookkeeping because it has no "ls".
- It's not a drop-in replacement for a traditional filesystem, and requires rethinking assumptions.
- Could stage input or output files in parallel.
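Parallel staging is just a fan-out over a transfer tool; a minimal sketch of the pattern, with a local copy standing in for the real transfer command (e.g. `xrdcp` for xrootd) and all names my own:

```python
import shutil
from concurrent.futures import ThreadPoolExecutor

def stage_files(src_paths, dest_dir, workers=4):
    """Copy a list of input files into dest_dir, several at a time.

    shutil.copy stands in for whatever transfer tool is actually
    used (xrdcp, scp, ...); only the fan-out pattern is the point.
    """
    def stage_one(src):
        return shutil.copy(src, dest_dir)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(stage_one, src_paths))
```

The same pool works for staging outputs back at job end; the worker count caps how many transfers hit the server at once.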
- Combine steps
- Reduces ability to roll back errors.
- Increases latency.
- Varying crumb size
- Makes lots of small crumbs, so digi files get read many times.
- Varying chunk size
- Lots of small chunks mean more jobs are reading in parallel at the start of processing, but it does not increase the amount of data that's read.
- If there are more chunks than available cores, that automatically throttles I/O somewhat.
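The throttling argument can be made concrete: with more chunks than cores, the batch system can only run one job per core, so at most `n_cores` chunk files are being read simultaneously no matter how many chunks exist. A toy bound (function name is mine, not from any pipeline code):

```python
def peak_parallel_readers(n_chunks, n_cores):
    """Upper bound on chunk files read simultaneously at job start.

    More chunks than cores: I/O is throttled to n_cores streams.
    Fewer chunks than cores: every chunk is read at once.
    """
    return min(n_chunks, n_cores)
```

So 500 small chunks on a 64-core farm still means at most 64 concurrent input streams, the same as 64 big chunks would give.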
- Use scratch disks
- Need to be able to leave files on scratch for a couple of hours without having a process running.
- Need to be able to copy files between batch machines with a process only at the receive end of the transfer.
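A pull-style transfer satisfies the "process only at the receive end" constraint: the receiving job fetches the file from the sender's scratch area, and nothing has to keep running on the sending host after the file is written. A sketch, with a local copy standing in for the actual remote fetch (scp, xrdcp, or similar):

```python
import os
import shutil

def pull_from_scratch(remote_path, local_scratch):
    """Fetch a file another machine left on its scratch disk.

    shutil.copy stands in for a real pull (e.g. `scp host:path .`);
    the point is that only the receiving side runs a process.
    """
    os.makedirs(local_scratch, exist_ok=True)
    dest = os.path.join(local_scratch, os.path.basename(remote_path))
    shutil.copy(remote_path, dest)
    return dest
```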
- Scalable
- But maybe it doesn't have to scale; it just has to work. It's not like we're going to get mentioned on Slashdot and suddenly have 100x the data flowing in.
- Not stage files stored on AFS to/from scratch
- AFS's internal caching means that staging copies the data twice.
- This may be particularly useful for recon, where crumb jobs don't use the whole input file.
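Reading in place is what makes this pay off for recon: a crumb job can seek to just the slice of the digi file it needs instead of staging the whole thing. A minimal sketch of a byte-range read (the offset/length would really come from some event index; those details are assumed, not from the notes):

```python
def read_event_range(path, offset, length):
    """Read only the bytes for one crumb's events, in place.

    Since AFS caches data client-side anyway, staging to scratch
    first would copy the data twice; a direct seek+read avoids
    that and never touches the parts of the file this crumb skips.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```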
- PROOF
- Seems deeply tied to xroot - needs xrootd to run on the batch host?
- Event collections
- during processing
- avoid crumb-to-chunk merges
- do SVAC at crumb level
- long-term
- potentially don't need to merge anything, just store chunk- and crumb-level files
Using AFS without staging seems very promising to me; I intend to try that first.
We might need a lot less temporary storage if we put chunk-level files from incomplete runs on long-term storage. Or add another level of storage - crumb (AFS or SSD), new chunk (AFS or SSD), old chunk (NFS), run (xrootd). Old chunks are only used in the merge steps, which are relatively undemanding of I/O (it's all serial).