  • Overview
  • Prerequisites
  • Setting up the Environment
  • Submitting the first Example Jobs
  • Submitting a Job running SLIC
  • Running Commands directly on the Head Node
  • Checking and Killing your Jobs

Overview

This page gives an example of how to run jobs, in particular SLIC, the Simulator for the Linear Collider, on FermiGrid, which is part of the Open Science Grid. SLIC is a Geant4-based simulation package that uses an XML geometry input format called LCDD to describe the geometry, sensitive detectors and readout geometry.

Note! The examples are written so that you can simply cut and paste them from your browser window into the terminal session on ILCSIM. Do not cut and paste into an editor (it will not work unless you remove a lot of backslashes: \ ). Instead, copy the files that are created by pasting into the terminal window and modify them to fit your own needs.
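To see why the backslashes matter: the examples below create files with cat > file << +EOF blocks, and inside such a block your shell would try to expand an unescaped $ at paste time. A minimal sketch (the file name demo.sub is made up for illustration):

No Format
cat > demo.sub << +EOF
log = demo.log.\$(Cluster)
+EOF
cat demo.sub

The last command should show log = demo.log.$(Cluster): the backslash is removed when you paste, leaving the macro for Condor to expand. If you instead paste the same lines into an editor and save them, the backslashes stay in the file, which is why they would have to be removed by hand.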

Prerequisites

...

  1. Get a DOE grid certificate from http://security.fnal.gov/pki/Get-Personal-DOEGrids-Cert.html
    This page also explains how to export the certificate from the browser and how to deal with directory permissions and the like.
  2. Register with the ILC VO (Virtual Organization) at http://cd-amr.fnal.gov/ilc/ilcsim/ilcvo-registration.shtml, which will guide you to:
    https://voms.fnal.gov:8443/vomrs/ilc/vomrs
    (a sketch of creating the corresponding VOMS proxy follows after this list)
  3. Everything is set up on ILCSIM, so to try things out it is recommended to get an account on ILCSIM using the following form:
    http://cd-amr.fnal.gov/ilc/ilcsim/ilcsim.shtml
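Once the certificate is installed and the VO registration has been approved, you need a valid VOMS proxy on ILCSIM before you can submit anything. A minimal sketch, assuming the VO is simply called ilc (check with the VO administrators if the command complains about the VO name):

No Format
voms-proxy-init -voms ilc

The voms-proxy-info command shown below can then be used to check that the proxy and its VO attributes are in place.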

...

No Format
voms-proxy-info -all

Submitting the first Example Jobs

Now you should be all set up to submit a first trivial test job, just to make sure that everything is working. Just cut and paste the following lines into your terminal window. This will submit a grid job which starts 5 separate processes. The processes will not do anything exciting but execute sleep for 10 seconds before they terminate. Since no output is created, the files sleep_grid.out.$(Cluster).$(Process) and sleep_grid.err.$(Cluster).$(Process) should be empty.
(Note: $(Cluster) represents the job number and $(Process) represents the (5) process numbers.)
The Condor log files are: sleep_grid.log.$(Cluster).$(Process)
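As a rough sketch, the submit file for such a job could look like the following, using the same pattern as the slic_grid.run file further down (apart from the file names mentioned above, the details are assumptions):

No Format
rm -f sleep_grid.run
cat > sleep_grid.run << +EOF
universe = grid
type = gt2
globusscheduler = fngp-osg.fnal.gov/jobmanager-condor
executable = /bin/sleep
arguments = 10
transfer_executable = false
log = sleep_grid.log.\$(Cluster).\$(Process)
output = sleep_grid.out.\$(Cluster).\$(Process)
error = sleep_grid.err.\$(Cluster).\$(Process)
notification = NEVER
queue 5
+EOF

condor_submit sleep_grid.run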

...

No Format
rm -f slic_grid.csh
cat > slic_grid.csh << +EOF
#!/bin/csh
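# LABELRUN and TARFILE are built from the Condor cluster/process id, which the
# submit file passes in through the ClusterProcess environment variable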
echo start
/bin/date
setenv LABELRUN slic_grid-\${ClusterProcess}
setenv TARFILE \${LABELRUN}-results.tar
echo \${TARFILE}
echo start
/bin/date
mkdir results
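# run SLIC with the sid01 detector geometry on the ZZ_run10 stdhep input,
# writing the output and a log file into ./results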
/grid/app/ilc/detector/SimDist/Oct-31-2007/SimDist/scripts/slic.sh -r 5 \
-g /grid/app/ilc/detector/SimDist/detectors/sid01/sid01.lcdd            \
-i /grid/data/ilc/detector/LDC/stdhep/ZZ_run10.stdhep -o ./results/ZZ_run10\${LABELRUN} >& \
./results/ZZ_run10\${LABELRUN}.lis
ls -lh results
/bin/date
echo "build output tarball: " \${TARFILE}
tar -cf \${TARFILE} results
echo done
+EOF
chmod +x slic_grid.csh

rm -f slic_grid.run
cat > slic_grid.run << +EOF
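# Condor-G submit file: run slic_grid.csh on FermiGrid through the Globus GT2 jobmanager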
universe = grid
type = gt2
globusscheduler = fngp-osg.fnal.gov/jobmanager-condor
executable = ./slic_grid.csh
transfer_output = true
transfer_error = true
transfer_executable = true
environment = "ClusterProcess=\$(Cluster)-\$(Process)"
transfer_output_files = slic_grid-\$(Cluster)-\$(Process)-results.tar
log = slic_grid.log.\$(Cluster).\$(Process)
notification = NEVER
output = slic_grid.out.\$(Cluster).\$(Process)
error = slic_grid.err.\$(Cluster).\$(Process)
stream_output = false
stream_error = false
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
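# Globus RSL passed to the gatekeeper: single job type with a maxwalltime limit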
globusrsl = (jobtype=single)(maxwalltime=999)
queue
+EOF

condor_submit slic_grid.run
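When the job has finished, the slic_grid-<Cluster>-<Process>-results.tar file listed under transfer_output_files is brought back to your working directory. To look at the results, unpack it, for example (the wildcard is just shorthand for the actual cluster and process numbers):

No Format
tar -xvf slic_grid-*-results.tar
ls -lh results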

Running Commands directly on the Head Node

To run commands directly on the grid head node, use a syntax like this:

...

No Format
globus-job-run fngp-osg.fnal.gov/jobmanager-condor /bin/ls /grid/app/ilc/detector/SimDist/
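In the same way you can, for example, list the stdhep input files used in the SLIC example above (the path is taken from that example):

No Format
globus-job-run fngp-osg.fnal.gov/jobmanager-condor /bin/ls /grid/data/ilc/detector/LDC/stdhep/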

Checking and Killing your Jobs

You can see the status of all jobs using the following command:

...
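For reference, the standard Condor commands for this are condor_q to list your jobs and condor_rm to remove them; a minimal sketch (the cluster number 12345 is made up, and the exact options used on ILCSIM may differ):

No Format
condor_q
condor_rm 12345
condor_rm -all

condor_rm with a cluster number kills that job; condor_rm -all removes all of your jobs at once.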