Introduction

There is an 'early' request from the I&T analysis and DC2 background rejection efforts to reproduce Navid's DC1 dataserver, giving access to (1) pruned ntuples and (2) 'peeled' trees (i.e. events selected by run and event number).

Pruner

The user would provide a TCut based on the merit tuple variables. The server would then locate all the merit files for a given pipeline task, apply the cut and assemble an output file of the events that pass.

Peeler

The user would provide a list of (run, event) identifiers to the server, which would again locate the files, find the requested events in them and assemble output files containing the selected tree events.

Implementation Details

Tony is assuming that a quick-to-write web interface to the Root macros will be set up to collect the input and submit the job to batch. It would then tell the user where to find the output files, so that they can be copied home by ftp.

Existing macros

My user CVS area has Root classes that can query the pipeline database to find datasets and apply a TCut to them.

Here is the interface for finding datasets:

 /// find datasets by task name, dataset name and a run range (rc=0 for success)
 int selectDatasets(char* taskName, char* datasetName, int runMin=0, int runMax=0);

 /// find datasets by task name, dataset name and a list of runs (rc=0 for success)
 int selectDatasets(char* taskName, char* datasetName, std::vector<int> &runList);

 /// create a TChain based on the selectDatasets query: supply the tree name
 int makeChain(char* treeName);
 /// access the TChain built by makeChain
 TChain* getChain() {return m_datasetsChain;}

So one can either give a run range or supply a vector of runs. Given a tree name, makeChain then creates a TChain from the datasets that were found.

The pruning class has the following interface:

 /// constructor: the input chain, the output file name and the cut to apply
 pruneTuple(TChain* c, char* newFileName = "ntuple-prune.root", char* cut = "");

 /// apply the cut and write the selected events, splitting the output into segments if it is too large
 int prune(UInt_t maxPerFile=200000);

One constructs pruneTuple with the TChain obtained from pipelineDatasets and a cut, then calls prune() to do the pruning. If the output file would be too large, pruneTuple splits it into smaller segments.
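The segmentation can be pictured with a small standalone sketch. It assumes that maxPerFile counts selected events per output file, which the interface suggests but does not state. segmentRanges is a hypothetical helper for illustration, not part of pruneTuple.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Half-open index ranges [first, last) over the selected events,
// one range per output segment.
// ASSUMPTION: maxPerFile is a cap on the number of events per file.
std::vector<std::pair<std::uint64_t, std::uint64_t>>
segmentRanges(std::uint64_t nSelected, std::uint64_t maxPerFile) {
    std::vector<std::pair<std::uint64_t, std::uint64_t>> ranges;
    if (maxPerFile == 0) return ranges;  // no valid segmentation; caller should reject this
    for (std::uint64_t first = 0; first < nSelected; first += maxPerFile) {
        // The last segment is shorter when nSelected is not a multiple of maxPerFile.
        ranges.emplace_back(first, std::min(first + maxPerFile, nSelected));
    }
    return ranges;
}
```

Under this assumption, 450000 selected events with the default maxPerFile of 200000 would give three segments, the last holding 50000 events.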

There is an example of use in the pipeline doc.

Peeling events

There are existing macros in the RootAnalysis package that do the peeling.

PruneRunEvent is a prime candidate, as is the pruneTree macro in src/utilityMacros/.

The peeler will presumably need to accept an input ASCII file of (run, event) identifiers.
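A minimal sketch of reading such a file follows. The file format has not been decided, so the layout here is an assumption: one whitespace-separated "run event" pair per line, blank lines ignored, and '#' starting a comment. RunEvent and readRunEventList are hypothetical names and not part of the existing macros.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One requested (run, event) identifier.
using RunEvent = std::pair<int, int>;

// Read (run, event) pairs from a text stream.
// ASSUMED format: "run event" per line, '#' begins a comment.
// Lines without two leading integers are skipped and counted in nBad.
std::vector<RunEvent> readRunEventList(std::istream& in, unsigned int& nBad) {
    std::vector<RunEvent> ids;
    nBad = 0;
    std::string line;
    while (std::getline(in, line)) {
        // Drop a trailing comment, if any.
        const std::string::size_type hash = line.find('#');
        if (hash != std::string::npos) line.erase(hash);
        // Skip lines that are empty or whitespace only.
        if (line.find_first_not_of(" \t\r") == std::string::npos) continue;
        std::istringstream fields(line);
        int run = 0;
        int event = 0;
        if (fields >> run >> event) {
            ids.emplace_back(run, event);
        } else {
            ++nBad;  // malformed line: report to the user rather than abort
        }
    }
    return ids;
}
```

Taking a std::istream rather than a file name would let the same routine serve a file uploaded through the web interface or a string pasted into a form.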

Implementation details
  • should optionally handle all 3 trees (maybe even the Merit tuple too?)
  • the chains should all contain the same files, in the same order!
  • the trees are indexed by (run, event), so one can use the tree->GetEntryNumberWithIndex(run, evt) function to locate events
  • should try to optimize branch usage: read only the run and event branches while searching
  • should the peeler interface to pipelineDatasets to handle the list of runs more efficiently?
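On the last point, one way the peeler could work with pipelineDatasets is to reduce the requested (run, event) list to a duplicate-free run list, which is the std::vector<int> form that selectDatasets already accepts. The sketch below is only an illustration: RunEvent, groupByRun and uniqueRuns are hypothetical names, not part of the existing classes.

```cpp
#include <map>
#include <set>
#include <utility>
#include <vector>

// One requested (run, event) identifier.
using RunEvent = std::pair<int, int>;

// Group the requested events by run.
// Key: run number. Value: sorted, duplicate-free event numbers in that run.
std::map<int, std::set<int>> groupByRun(const std::vector<RunEvent>& ids) {
    std::map<int, std::set<int>> byRun;
    for (const RunEvent& id : ids) {
        byRun[id.first].insert(id.second);
    }
    return byRun;
}

// Sorted, duplicate-free run list, suitable as the runList argument
// of selectDatasets(taskName, datasetName, runList).
std::vector<int> uniqueRuns(const std::map<int, std::set<int>>& byRun) {
    std::vector<int> runs;
    runs.reserve(byRun.size());
    for (const auto& entry : byRun) {
        runs.push_back(entry.first);
    }
    return runs;
}
```

With the chain restricted to those runs, the per-run event sets could then drive the GetEntryNumberWithIndex lookups, so that only files known to contain requested runs are opened.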

Timescale etc

We have promised an initial version of this server by the end of March. Since the bulk of the work is in the Root macros, they are what primarily needs to be ready within two weeks.

Tom Glanzman has agreed to work on completing the Root side.
