...

  • Maven (3.0 or greater)
  • gcc (4.8 or greater)
  • CMake (3.0 or greater)
  • Python (2.7 or greater)
  • SLIC
  • ROOT (optional)

Bundled Dependencies

These dependencies are compiled and installed from source code within the project:

  • egs5 event generator
  • MadGraph 4 and MadGraph 5
  • StdHep library and tools (based on version 5.6.1)

Installed Dependencies

The installation procedure will automatically download and install the following dependencies:

Building the Project

Start by checking out the project from github:

...

Code Block
languagebash
make -j4 install

Running Job Scripts

Job Environment

A number of environment variables are required for hps-mc to function properly.

Before running any job scripts, you need to set up the environment by sourcing a script generated during the build:

Code Block
languagebash
. hps-mc/install/bin/hps-mc-env.sh

By default, the HPSJAVA_JAR variable points to the copy of the jar built during installation. You can override this by setting it to point to a different jar, e.g.

Code Block
languagebash
export HPSJAVA_JAR=~/.m2/repository/org/hps/hps-distribution/4.0/hps-distribution-4.0-bin.jar
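As a quick sanity check, a short Python snippet can verify that the variable is set and points to an existing file. This helper is illustrative only and is not part of hps-mc:

```python
import os

def check_hpsjava_jar(environ=os.environ):
    """Return the configured HPSJAVA_JAR path, raising if it is unset
    or does not point to an existing file (illustrative helper only)."""
    jar = environ.get("HPSJAVA_JAR")
    if not jar:
        raise RuntimeError("HPSJAVA_JAR is not set; source hps-mc-env.sh first")
    if not os.path.isfile(jar):
        raise RuntimeError("HPSJAVA_JAR points to a missing file: %s" % jar)
    return jar
```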

The SLIC application binary needs to be available in your environment (the SLIC environment is not managed directly by hps-mc). You can check for it using:

Code Block
languagebash
which slic

If the application is not found, run slic-env.sh to set it up before executing any hps-mc jobs.

ROOT is currently used by only one job script (tuple_job.py). For that script to work, you need to set up ROOT in your environment using the supplied thisroot.sh script.

Running Job Scripts

Running individual job scripts requires providing a JSON file with required parameters.

...

You will probably want to run jobs locally in a scratch directory, as they tend to write out many files!

Input and Output Files

The job parameters may specify input files, if the script uses them, and optionally output file locations.

 

No Format
{
    "input_files": {
        "events1.stdhep": "/path/to/events1.stdhep",
        "events2.stdhep": "/path/to/events2.stdhep"
    },
    "output_files": {
        "events1.slcio": "my_events1.slcio",
        "events2.slcio": "my_events2.slcio"
    },
    "output_dir": "/path/to/outdir"
}

In the above toy example, each file listed in input_files is copied from the absolute path on the right into the job's scratch directory under the file name on the left (the format is "destination": "source").

The output files are copied from the left-hand file name in the local scratch dir to the file name on the right (the format is "source": "destination").

All output files will be copied to the directory listed under "output_dir", which can be an absolute or relative path.
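The copy semantics can be pictured with a short Python sketch. This is only an illustration of the behavior described above, not the actual hps-mc implementation:

```python
import os
import shutil

def stage_files(params, scratch_dir):
    """Illustrate the input_files ("destination": "source") and
    output_files ("source": "destination") copy semantics (sketch only)."""
    # input_files: copy each source path (value) into the scratch
    # directory under the destination file name (key)
    for dest, src in params.get("input_files", {}).items():
        shutil.copy(src, os.path.join(scratch_dir, dest))
    # output_files: copy each file the job wrote in the scratch dir
    # (key) to its final name (value) under output_dir
    out_dir = params.get("output_dir", ".")
    for src, dest in params.get("output_files", {}).items():
        shutil.copy(os.path.join(scratch_dir, src),
                    os.path.join(out_dir, dest))
```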

To keep the names of the output files created by the job, simply list the same file name for the output file entries.

No Format
{                                                                            
    "output_files": {                      
        "events1.slcio": "events1.slcio",
        "events2.slcio": "events2.slcio"
    }
}     

This will make the job copy the file events1.slcio to a file with the same name in the output directory.

Creating Job Workflows

In order to run jobs on a batch system such as LSF or Auger, the job parameters need to be expanded into a workflow, which is a JSON file containing parameters for all the individual jobs.

...

This can be expanded into a workflow using the following command:

Code Block
languagebash
hps-mc-workflow -n 1000 -r 1234 -w tritrig hps-mc/python/jobs/tritrig_job.py job.json

Now you should see a local file called tritrig.json which contains information for running 1000 jobs of this type.

The input files for a workflow may be supplied in one of two ways.

A file glob will supply multiple input files to the workflow, one per job.

No Format
"input_files" : {
    "beam.stdhep": "/not/a/real/path/beam*.stdhep"
}

Multiple files per job can be supplied using the following syntax.

No Format
"input_files" : {
    "beam.stdhep": {
        "/not/a/real/path/beam*.stdhep": 10
    }
}

This will expand into JSON parameters that include 10 files per job in the workflow.
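The expansion can be pictured with a short Python sketch: the glob is resolved to a sorted file list, which is then split into per-job groups. This is illustrative only; the real logic lives in the hps-mc workflow tooling:

```python
import glob

def expand_glob_inputs(pattern, files_per_job):
    """Expand a file glob into per-job groups of at most
    files_per_job files (sketch of the behavior described above)."""
    files = sorted(glob.glob(pattern))
    return [files[i:i + files_per_job]
            for i in range(0, len(files), files_per_job)]
```

For example, with 25 matching files and files_per_job set to 10, this yields three jobs: two with 10 input files and one with the remaining 5.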

Running Jobs on the Batch System

Automatically submitting jobs to the batch system requires that you have created a workflow from your job parameters (covered in the previous section).

The following commands use the script hps-mc-bsub to submit jobs to LSF (e.g. SLAC environment).

The command hps-mc-jsub should be used instead when submitting at JLab.

To submit all jobs in a workflow, execute a command similar to the following:

Code Block
languagebash
 hps-mc-bsub -l $PWD/logs ./tritrig.json

You can also submit only specific job IDs by listing them after the workflow file:

Code Block
languagebash
hps-mc-bsub -l $PWD/logs ./tritrig.json 1000 2000 [etc.]

Finally, it is possible to submit a range of job IDs:

Code Block
languagebash
hps-mc-bsub -l $PWD/logs -r 0:99 ./tritrig.json

This will submit all jobs with IDs from 0 to 99 in the workflow.

Project Structure

The main project has the following directory structure:

Directory | Contains | Notes
data | data files | has run_params.json with beam parameters
generators | event generators |
generators/egs5 | egs5 event generator |
generators/madgraph4 | MadGraph4 generator |
generators/madgraph5 | MadGraph5 generator |
python | Python scripts |
python/hpsmc | Python framework scripts |
python/jobs | Python job scripts |
python/test | Python test scripts |
scripts | scripts (bash, csh, XML, etc.) | Miscellaneous helper scripts and other scripts processed by CMake
scripts/mc_scripts | Auger based scripts | Backup of JLab Auger MC production scripts (not used by hps-mc)
scripts/run_params | scripts for printing run params | Backup of JLab scripts (not used by hps-mc)
scripts/MadGraph | scripts for printing information from LHE files | Backup of JLab scripts (not used by hps-mc)

Additionally, the following directory structure is installed to CMAKE_INSTALL_PREFIX.

Directory | Contains | Notes
bin | executables and scripts |
lib | program libraries |
lib/python | Python framework and scripts |
share | project data |
share/detectors | detector description files (LCDD) | used when running SLIC
share/fieldmaps | full B-field maps | used when running SLIC

The bin dir contains a large number of scripts and binaries that are created during the build process.

File | Description | Notes
egs5_* | egs5 event generation executables |
stdhep_* | StdHep tools |
hps-mc-env.sh | Bash setup script |
hps-mc-env.csh | CSH setup script |
lcio_dumpevent | utility for dumping LCIO event data |
hps-mc-bsub | wrapper for submitting LSF jobs |
hps-mc-jsub | wrapper for submitting Auger jobs |
hps-mc-workflow | wrapper for creating job workflows |


Job Scripts

The project comes with a number of pre-written scripts in the python/jobs dir for running typical HPS MC jobs.

Python script | Description | Notes
ap_job.py | Generate A-primes using MadGraph4 |
beam_job.py | Generate beam backgrounds using egs5 |
dst_job.py | Create ROOT DST output from recon LCIO files |
egs5_beam_v3_job.py | Generate beam backgrounds using v3 of egs5 generator |
egs5_beam_v5_job.py | Generate beam backgrounds using v5 of egs5 generator |
lcio_concat_job.py | Concatenate a number of LCIO files together into a single output file |
lcio_count_job.py | Count the number of events in an LCIO file and throw an error if there are not enough |
tritrig_job.py | Generate trident events with trigger cuts |
tuple_job.py | Create ROOT tuple output from one or more input LCIO recon files |
wab_beam_job.py | Create wab-beam events from WAB and beam inputs and run in SLIC |
wab_beam_tri_job.py | Create wab-beam-tri events from wab-beam and tritrig inputs and run in SLIC |
wab_job.py | Generate WAB events in MadGraph4 |

Job Script Structure

All job scripts follow a specific structure.

First, necessary dependencies are imported.

Code Block
languagepy
from hpsmc.job import Job
from hpsmc.generators import MG4

Next, the job is created and its parameters are fetched into a local variable.

Code Block
languagepy
job = Job(name="AP job")
job.initialize()
params = job.params

One or more components should be added to the job, for instance an event generator to create some LHE files.

Code Block
languagepy
# generate A-prime events using Madgraph 4
ap = MG4(name="ap",
        run_card="run_card_"+params.run_params+".dat",
        params={"APMASS": params.apmass},
        outputs=[filename],
        nevents=params.nevents)

Finally, the components should be added to the job and the job should be run.

Code Block
languagepy
job.components = [ap]
job.run()

The specific way that input and output files are used depends on the job script.

Typically, input files are read without alteration. Some scripts can process multiple inputs and some cannot, depending on the particularities of the underlying tools.

Output files written to the local "scratch" directory may be named after the input files or, in some cases, have names particular to a given script.

You must know the names of the output files in order to include them as output in your JSON parameters.