Introduction
Currently under construction
This page is an introduction to running an org.lcsim analysis (in this case the reconstruction) on the grid. See the links in the Notes section for further documentation of grid concepts.
Preliminaries
This page assumes that you belong to the ilc Virtual Organization (VO). If this is not the case, follow the instructions in "How do I use the LCG grid". Furthermore, you should have a self-contained distribution of org.lcsim available. In summary, you need:
- grid certificates for the ilc VO
- a self-contained version of the lcsim libraries, i.e. the zip file that is produced by the maven2 install target
- a tar file containing a version of Sun's JRE >= 1.5
Setup
The grid only accepts jobs that are self-contained. That means you either have to be sure that the software you require is already installed at the site you submit your job to, or you have to provide all executables with the job.
You can request that your job be submitted to a site that has certain software installed by adding a Requirements line to your job options.
Unfortunately, there is no standardized way for the different groups to advertise that a certain suite is installed at a site. Contact the software experts of your group to find out how to make sure that your jobs are submitted to a site that meets your requirements.
We now have to create a self-contained piece of software capable of running your analysis code. It consists of the following pieces:
| File | Description |
|---|---|
| MainLoop.java | A file to set up the event loop, load the necessary drivers and input files, and write the output |
| lcsim-libs-1.5-SNAPSHOT.zip | The distribution of the lcsim libraries as obtained from the mvn install target |
| filename.jdl | An example of the job options that control the submission of your job to the grid |
| executable.sh | An example of the shell script that sets up the environment for your job |

    Executable = "lcsimReco_LOI_Higgs.sh";
    Arguments = "inputFiles.slcio";
    StdOutput = "std.out";
    StdError = "std.err";
    OutputSandbox = {"std.out", "std.err"};
    InputSandbox = {"/home/hep/mbq94921/LcSimReconstruction/MainLoop.class", "/home/hep/mbq94921/LcSimReconstruction/lcsimReco_LOI_Higgs.sh"};
    Requirements = RegExp(".gridpp.rl.ac.uk", other.GlueCEUniqueId);

| JDL option | Explanation |
|---|---|
| Executable | The program to be executed. In our case we execute a shell script that downloads the necessary prerequisites, sets up the environment for the job, and then executes the java program that defines our analysis |
| Arguments | The arguments to be passed to the executable. In our case we provide the name of the input file to the shell script |
| StdOutput | The standard output of the job is redirected to the file named here |
| StdError | The error messages that the job produces are redirected to the file named here |
| OutputSandbox | Files that should be transferred back from the machine where the job is executed. We are only interested in the textual output of our jobs. The results of the analysis are copied to grid storage by the shell script |
| InputSandbox | Files that have to be transferred from the local machine to the grid Computing Element (CE). The combined size of the files in the InputSandbox should not exceed 10 MB. Otherwise expect complications with the submission of your jobs |
| Requirements | Requirements on the site that your jobs are submitted to. See the gLite User Guide for details |
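
The Requirements line in the example job options selects sites by matching the identifier of the Computing Element against a regular expression. As a rough illustration only, and assuming that the JDL RegExp behaves like an unanchored search (as Python's re.search does), the sketch below shows which identifiers such a pattern would select. The function name and the CE identifiers are made up for this example.

```python
import re

# Pattern taken from the Requirements line of the example job options.
PATTERN = ".gridpp.rl.ac.uk"


def matches_site(ce_id, pattern=PATTERN):
    """Return True if the pattern is found anywhere in the CE identifier.

    Assumption: the JDL RegExp behaves like an unanchored search.
    """
    return re.search(pattern, ce_id) is not None


# Hypothetical CE identifiers, for illustration only.
examples = [
    "ce01.gridpp.rl.ac.uk:2119/jobmanager-pbs-grid",
    "ce.example.org:2119/jobmanager-pbs-grid",
]
for ce_id in examples:
    print(ce_id, matches_site(ce_id))
```

Note that an unescaped dot in a regular expression matches any character, so this pattern is slightly more permissive than a literal host name suffix.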

    #!/bin/sh
    ####
    # Steering file for the job
    # this is submitted to the grid. The command line argument comes from the jdl file
    # that way, I only have to generate one script steered from many jdl files
    ####

    # $1 is the first command line argument, in this case the input file
    lcg-cp srm://heplnx204.pp.rl.ac.uk/pnfs/pp.rl.ac.uk/data/ilc/sidLOI/250/LOI_higgs/$1 file:$1
    # we also have to download the jre, because it's too big for the input sandbox
    lcg-cp srm://heplnx204.pp.rl.ac.uk/pnfs/pp.rl.ac.uk/data/ilc/apps/jre.tar.gz file:jre.tar.gz
    # the dependent libraries are also too big for the input sandbox
    lcg-cp srm://heplnx204.pp.rl.ac.uk/pnfs/pp.rl.ac.uk/data/ilc/apps/lcsim-1.5-SNAPSHOT-deps.zip file:lcsim-1.5-SNAPSHOT-deps.zip
    unzip lcsim-1.5-SNAPSHOT-deps.zip

    # put all jars of the distribution and the current directory on the classpath
    for jar in lib/*.jar
    do
        CLASSPATH=${jar}:${CLASSPATH}
    done
    CLASSPATH=.:${CLASSPATH}
    export CLASSPATH

    # unpack the jre and use it to run the analysis
    tar -xzf jre.tar.gz
    export PATH=jre/bin:$PATH
    export JAVA_HOME=jre
    java MainLoop $1

    # copy the result to grid storage and clean up
    lcg-cp file:output.slcio srm://heplnx204.pp.rl.ac.uk/pnfs/pp.rl.ac.uk/data/ilc/sidLOI/250/LOI_higgs/reco/$1
    rm $1
    rm output.slcio

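
The shell script downloads the JRE and the dependent libraries itself because they are too big for the InputSandbox. Before listing a file in the InputSandbox, it can help to check the combined size locally. The following is a minimal sketch, assuming that the 10 MB limit means 10 * 1024 * 1024 bytes; the function names are invented for this example.

```python
import os

# Assumption: "10 MB" is interpreted as 10 * 1024 * 1024 bytes.
SANDBOX_LIMIT_BYTES = 10 * 1024 * 1024


def sandbox_size(paths):
    """Return the combined size in bytes of the given files."""
    return sum(os.path.getsize(p) for p in paths)


def fits_in_sandbox(paths, limit=SANDBOX_LIMIT_BYTES):
    """Return True if the files together do not exceed the limit."""
    return sandbox_size(paths) <= limit
```

For example, fits_in_sandbox(["MainLoop.class", "lcsimReco_LOI_Higgs.sh"]) checks the two files that the example job options place in the InputSandbox, provided they are in the current directory.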
Preparation
You now have to generate your jobs. Each job is defined by a .jdl file. Ideally, only the Arguments line changes for each input file.
Naming the jdl file the same as the input file and just changing the extension simplifies the bookkeeping tremendously.
Once the jdl files have been produced using your favourite tools, they can be submitted to the grid.
Use glite-wms-job-list-match -a gridTest.jdl to list the machines where the code could run (-a automatically delegates your proxy credentials for the request). This helps spot errors early.
Use the glite-wms-job* tools for the submission of jobs. They should be preferred over the older glite-job* tools. In order to submit one job, execute:

    glite-wms-job-submit -a -o job.id file.jdl

| Option | Description |
|---|---|
| -a | automatic delegation of your proxy credentials for the job |
| -o job.id | store the id of the job in the file job.id. This can later be used to query the job status and the job log |
| file.jdl | The file that contains your job options |
To submit more than one job, use the --collection option of glite-wms-job-submit. The argument to this option is a directory that contains all job option files (*.jdl) that you want to submit. This is much faster than submitting the jobs individually.
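
As a sketch of how a collection submission could be scripted, the helper below checks that a directory actually contains jdl files and builds the corresponding command line. It uses only the options described on this page (-a, -o and --collection). The function name and the default name of the id file are assumptions made for this example.

```python
import glob
import os


def collection_submit_command(jobdir, idfile="collection.id"):
    """Build the argument list for submitting every jdl file in jobdir.

    Raises ValueError if the directory contains no jdl files, so that an
    empty collection is caught before anything is sent to the grid.
    """
    jdl_files = glob.glob(os.path.join(jobdir, "*.jdl"))
    if not jdl_files:
        raise ValueError("no .jdl files found in %s" % jobdir)
    return ["glite-wms-job-submit", "-a", "-o", idfile, "--collection", jobdir]
```

The returned list can be passed to subprocess.call on a machine where the gLite tools are installed and a valid proxy exists.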
Notes
- Use the lcg tools for data management. They are high-level and have man pages.
- lcg-cp copies directly, without storing the file in a catalog. This means logical filenames are not available.
- The grid setup works both at RAL and at SLAC (log in to rhel4-32).
http://wiki.egee-see.org/index.php/SG_Running_Jobs_WMProxy_CLI
http://wiki.egee-see.org/index.php/SG_Data_Management_High_Level_Tools
https://edms.cern.ch/file/722398//gLite-3-UserGuide.html#SECTION00090000000000000000
http://ilcsoft.desy.de/portal/e279/e555/infoboxContent556/2008-09-DESY-grid-installations.pdf
Tips and Tricks
Creating the job files
The following is a simple way to generate a large number of jdl files with as little mess as possible, using python.
- List all your input files in a text file, e.g.
ls /directory/containing/input/files/ > listOfInputFiles.txt
- Create a template jdl, template.jdl, with a placeholder for the input file. This example is as simple as possible.

    Executable = "lcsimReco_LOI_Higgs.sh";
    # The placeholder in the Arguments line is replaced with the input filename by the python script.
    # When the job is submitted, the file name is then picked up as $1 by the shell script.
    Arguments = "%s";
    StdOutput = "std.out";
    StdError = "std.err";
    OutputSandbox = {"std.out", "std.err"};
    InputSandbox = {"/home/hep/mbq94921/LcSimReconstruction/MainLoop.class", "/home/hep/mbq94921/LcSimReconstruction/lcsimReco_LOI_Higgs.sh"};
    Requirements = RegExp(".gridpp.rl.ac.uk", other.GlueCEUniqueId);

- Use the following script, createInputs.py, to generate one jdl per line in listOfInputFiles.txt

    #!/usr/bin/env python
    import os

    # read in the template
    with open('template.jdl') as f:
        template = f.read()

    # create a subdir for cleanliness
    if not os.path.isdir('jobfiles'):
        os.mkdir('jobfiles')

    # write one jdl per line of the input list
    with open('listOfInputFiles.txt') as inputs:
        for line in inputs:
            # strip the trailing newline and skip empty lines
            entry = line.strip()
            if not entry:
                continue
            name, extension = os.path.splitext(os.path.basename(entry))
            # replace the placeholder with the input file name
            with open(os.path.join('jobfiles', name + '.jdl'), 'w') as output:
                output.write(template.replace('%s', entry))

- Execute the script with python createInputs.py. It will create a directory jobfiles that contains one jdl for each input file.
- Download and compile MainLoop.java
- Then submit the jobs with

    voms-proxy-init -voms ilc
    glite-wms-job-submit -a -o collection.jodID --collection jobfiles

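
Before submitting a large collection, it can be worth checking that every input file really has a job file. The following sketch assumes the naming scheme described above (the jdl file carries the name of the input file with the extension changed) and the file and directory names used in this section. The function name is invented for this example.

```python
import os


def check_jobfiles(list_file="listOfInputFiles.txt", jobdir="jobfiles"):
    """Return the entries of list_file that have no matching jdl in jobdir."""
    missing = []
    with open(list_file) as inputs:
        for line in inputs:
            entry = line.strip()
            if not entry:
                continue  # ignore empty lines in the list
            name, _ = os.path.splitext(os.path.basename(entry))
            if not os.path.isfile(os.path.join(jobdir, name + ".jdl")):
                missing.append(entry)
    return missing
```

An empty return value means that each listed input file has a corresponding jdl file in the jobfiles directory.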