Preface

The Skimmer is also known to GLAST people as the Data Server Back End or Skimmer Back End. It has a command-line interface which can be used directly from a Linux shell. If you skim your data through a web interface, you are going through an additional layer known as the Data Server Front End, Skimmer Web Application, or Skimmer Front-End. This page documents only the back-end tool with its command-line interface, which we will simply call the skimmer, although it may also help you understand the front-end layer and its web interface.

...

  1. Perl 5, which should be found with "/usr/bin/env perl".
  2. ROOT 5.10.00 to 5.18.00b: the user can set $ROOTSYS to any ROOT release, and the skimmer will use it as is, but the only validated releases are 5.10.00, 5.14.00g, 5.16.00-gl1 and 5.18/00c-gl1. If $ROOTSYS is not defined, the skimmer searches for $GLAST_EXT/ROOT/v5.10.00/root; if $GLAST_EXT is not defined, it is set to /afs/slac/g/glast/ground/GLAST_EXT/$CMTCONFIG; if $CMTCONFIG is not defined, it is set to rh9_gcc32.
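
The fallback chain described for ROOT can be sketched in shell as follows. This is only an illustration of the documented rules, not the skimmer's actual code: the function name resolve_rootsys is hypothetical, and treating an empty variable the same as an undefined one is an assumption.

```shell
# Hypothetical helper: print the ROOT directory that would be used
# according to the fallback rules described above.
resolve_rootsys() {
    # A user-provided $ROOTSYS is used as is.
    if [ -n "${ROOTSYS:-}" ]; then
        printf '%s\n' "$ROOTSYS"
        return 0
    fi
    # $CMTCONFIG defaults to rh9_gcc32.
    cmtconfig="${CMTCONFIG:-rh9_gcc32}"
    # $GLAST_EXT defaults to the AFS area for that configuration.
    glast_ext="${GLAST_EXT:-/afs/slac/g/glast/ground/GLAST_EXT/$cmtconfig}"
    # Without $ROOTSYS, the skimmer looks for ROOT v5.10.00 under $GLAST_EXT.
    printf '%s\n' "$glast_ext/ROOT/v5.10.00/root"
}
```

Calling resolve_rootsys with none of the three variables set prints the default path built from rh9_gcc32.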

What the skimmer basically does

The basic task of the skimmer is to take GLAST ROOT files, containing ROOT trees, and produce similar output files with a subset of branches and events. The search for the ROOT data files to be skimmed is called mining. Leaving some branches out of the copy is called pruning. Copying only a subset of events is called cutting.

...

  1. MAKE_FILE_LIST : establish the list of the input ROOT data files to be skimmed.
  2. MAKE_LIBRARY_LIST : if needed, find out the release of the corresponding C++ code, and search for the associated shared libraries.
  3. MAKE_BRANCH_LIST : establish the list of branches to be duplicated.
  4. MAKE_EVENT_LIST : establish the list of events to be duplicated.
  5. SKIM : the actual skimming.
  6. CHECK : optional check of output data, which could take a long time to perform.

How to control the skimming job

As one can see in the steps given above, before the skimmer can proceed, it collects a lot of information about the files to be skimmed, what they contain and what to extract. This is all tuned by shell variables, and some of the information can come from an input ROOT CEL file (documented elsewhere) or from textual parameter files, meant to be the textual flavor of the different subparts of a ROOT CEL.

...

One will find below the description of the parameter files and shell variables which are meaningful for a skimmer job.
Note that for each of the skimming steps listed previously, there is a SK_DEBUG_* variable which can trigger the display of additional information about that specific step. Let's now see the details of each step.
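
As an illustration of the debug switches, the fragment below turns on the extra output for the four SK_DEBUG_* variables named on this page. The value "true" is an assumption, inferred from the "false" defaults listed in the sections that follow.

```shell
# Assumption: "true" is the value that enables the extra output,
# since every documented default below is "false".
export SK_DEBUG_FILE_LIST="true"      # MAKE_FILE_LIST step
export SK_DEBUG_LIBRARY_LIST="true"   # MAKE_LIBRARY_LIST step
export SK_DEBUG_BRANCH_LIST="true"    # MAKE_BRANCH_LIST step
export SK_DEBUG_EVENT_LIST="true"     # MAKE_EVENT_LIST step
```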

Data files mining parameters

The list of input data files can be obtained from different sources:

...

SK_INPUT_CEL=""
SK_INPUT_FILE_LIST=""
SK_INPUT_TASK=""
SK_RUN_MIN=0
SK_RUN_MAX=0
SK_OUTPUT_FILE_LIST=""
SK_DEBUG_FILE_LIST="false"

Shared libraries determination parameters

When managing data such as recon, mc and/or digis, the skimmer sometimes needs to load the corresponding C++ shared libraries. It needs the ones which were used when generating the data, compiled with the correct release. The list of those shared libraries can be provided by the user in a dedicated file, whose name is defined by the variable SK_INPUT_LIBRARY_LIST. In this file, each line is the full path of a shared library, optionally prefixed by the data types associated with the library. If there is no such prefix, the library is loaded for any data type. Example of such a file:

...

SK_INPUT_LIBRARY_LIST=""
SK_EXPECTED_RELEASE=""
SK_LIBRARY_DIRS="/nfs/farm/g/glast/u09/builds/rh9_gcc32:/nfs/farm/g/glast/u30/builds/rh9_gcc32:/afs/slac.stanford.edu/g/glast/ground/releases/rh9_gcc32opt"
SK_OUTPUT_LIBRARY_LIST=""
SK_DEBUG_LIBRARY_LIST="false"
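
The default value of SK_LIBRARY_DIRS is a ":"-separated list of directories. Before launching a job, it can be useful to see which of these directories are visible from the current machine. The sketch below assumes nothing about how the skimmer itself walks the list; the helper name split_dirs is hypothetical.

```shell
# Hypothetical helper: print each entry of a ":"-separated list on its own line.
# The body runs in a subshell, so the IFS and globbing changes do not leak out.
split_dirs() (
    IFS=:
    set -f                      # do not expand wildcards in directory names
    for dir in $1; do
        printf '%s\n' "$dir"
    done
)

# Usage: report the entries of SK_LIBRARY_DIRS that cannot be reached.
split_dirs "${SK_LIBRARY_DIRS:-}" | while IFS= read -r dir; do
    [ -d "$dir" ] || printf 'not reachable: %s\n' "$dir"
done
```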

Events cutting parameters

The list of selected events can be obtained from different sources:

...

SK_INPUT_CEL=""
SK_INPUT_EVENT_LIST=""
SK_TCUT_DATA_TYPE="merit"
SK_TCUT=""
SK_OUTPUT_EVENT_LIST=""
SK_DEBUG_EVENT_LIST="false"
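
As a hedged illustration of a cut-based selection, the fragment below assumes that SK_TCUT holds a ROOT TCut-style selection expression evaluated on the tree of type SK_TCUT_DATA_TYPE. The variable names suggest this, but the exact semantics are not spelled out in this section. The branch name CTBBestEnergy and the threshold are purely illustrative.

```shell
# Assumption: the cut is a ROOT selection expression applied to the merit tree.
export SK_TCUT_DATA_TYPE="merit"
# Illustrative cut only; the branch name and the value are hypothetical.
export SK_TCUT="CTBBestEnergy>1000"
```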

Branches pruning parameters

The skimmer can also take into account a list of branches to be activated or deactivated. This list is given in a file whose full path is set by the variable SK_INPUT_BRANCH_LIST. Each line should contain a data type prefix, the name of the tree, a + or a - (to activate or deactivate, respectively), and the specification of one or several branches (with the ROOT syntax). The lines are applied one after the other: you can deactivate all the branches of a given type with -*, then activate only the ones of interest. There is an implicit initial +* for all the data types used in the skimming job (see SK_DATA_TYPES in the next section), so any data type that does not appear explicitly in the branch list has all its branches activated. Here is an example of such a file:

...

SK_INPUT_BRANCH_LIST=""
SK_OUTPUT_BRANCH_LIST=""
SK_DEBUG_BRANCH_LIST="false"
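
The "applied one after the other" rule for branch activation can be sketched as below. This models only the ordering semantics for a single data type and tree, not the file format itself. The helper name branch_is_active and the branch names are hypothetical, and shell wildcard matching is used here as a stand-in for the ROOT branch syntax.

```shell
# Hypothetical helper: branch_is_active BRANCH RULE...
# Each RULE is "+pattern" or "-pattern"; rules are applied in order and
# the last matching rule wins. Returns 0 if the branch ends up activated.
branch_is_active() {
    branch=$1
    shift
    active=yes                       # implicit initial +*
    for rule in "$@"; do
        sign=${rule%"${rule#?}"}     # first character: + or -
        pattern=${rule#?}            # rest of the rule: the branch pattern
        case $branch in
            $pattern)
                if [ "$sign" = "+" ]; then active=yes; else active=no; fi
                ;;
        esac
    done
    [ "$active" = yes ]
}

# Usage: deactivate everything, then re-activate the Evt* branches.
if branch_is_active EvtRun '-*' '+Evt*'; then echo "EvtRun is kept"; fi
```

With the rules '-*' '+Evt*', a branch named EvtRun is kept while one named McId is dropped; with no rule at all, every branch stays activated.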

The actual final skimming

We have now reached the point where we say which types of data we want to skim. This is set by the shell variable SK_DATA_TYPES, which should be a ":"-separated list of data types. The currently recognized types can be found in the guide /Skimmer at SLAC/. If SK_DATA_TYPES is empty, a default value of "merit:mc:digi:recon" is used.
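
The defaulting rule for SK_DATA_TYPES can be written with a standard shell parameter expansion. This is a sketch of the documented behavior rather than the skimmer's own code; the helper name effective_data_types is hypothetical.

```shell
# Hypothetical helper: print the data types that would be skimmed,
# falling back to the documented default when SK_DATA_TYPES is empty or unset.
effective_data_types() {
    printf '%s\n' "${SK_DATA_TYPES:-merit:mc:digi:recon}"
}

# Usage: restrict a job to the merit and recon data.
export SK_DATA_TYPES="merit:recon"
effective_data_types
```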

...