...

Now you should be able to connect to the ilcsim node.

No Format
ssh ilcsim

Grid Tools Setup and Authentication at Fermilab

To set up the environment and to get the necessary grid proxy certificate, issue the following commands on the ILCSIM machine (assumes a bash shell).

No Format

source /fnal/ups/grid/setup.sh
voms-proxy-init -voms ilc:/ilc/detector
# give passwd etc.

To check the status of the proxy:

No Format

voms-proxy-info -all

Grid Authentication from a Site Outside Fermilab

...

Info

In order to submit jobs to the Fermilab batch system, you will need to run a local copy of Condor, including the job scheduler. Talk to your site administrator about setting up this software, which can be configured as part of the VDT.
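
Once the local installation is in place, a quick sanity check might look like the following (a sketch only; the VDT installation path is a placeholder and will differ by site):

No Format

# source the VDT setup from wherever it is installed at your site
# (the path below is a placeholder)
source /path/to/vdt/setup.sh
# check that the local Condor client and scheduler respond
condor_version
condor_q
# obtain a grid proxy with the ILC detector role, as at Fermilab
voms-proxy-init -voms ilc:/ilc/detector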

Setting up the Environment at Fermilab

To set up the environment and to get the necessary grid proxy certificate, log onto ILCSIM and issue the following commands:

No Format

source /fnal/ups/grid/setup.sh
voms-proxy-init -voms ilc:/ilc/detector
# give passwd etc.

To check the status of the proxy:

...

Example Grid Jobs

Submitting the First Example Jobs

Now you should be all set up to submit a test job to make sure that everything is working. Cut and paste the following lines into your terminal window. This will submit a grid job that starts 5 separate processes. Each process just executes sleep for 10 seconds before terminating. Since no output is created, the sleep_grid.out.$(Cluster).$(Process) and sleep_grid.err.$(Cluster).$(Process) files should be empty.

...

No Format
rm -f env_grid.sh
cat > env_grid.sh << +EOF
#!/bin/sh -f
printenv
cd \${_CONDOR_SCRATCH_DIR}
pwd
#
# This sets up the environment for osg in case we want to
# use grid services like srmcp
#
. \$OSG_GRID/setup.sh
source \${VDT_LOCATION}/setup.sh
printenv
/bin/df
+EOF
chmod +x env_grid.sh

rm -f env_grid.run
cat > env_grid.run << +EOF
universe = grid
type = gt2
globusscheduler = fngp-osg.fnal.gov/jobmanager-condor
executable = ./env_grid.sh
transfer_output = true
transfer_error = true
transfer_executable = true
log = env_grid.log.\$(Cluster).\$(Process)
notification = NEVER
output = env_grid.out.\$(Cluster).\$(Process)
error = env_grid.err.\$(Cluster).\$(Process)
stream_output = false
stream_error = false
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
globusrsl = (jobtype=single)(maxwalltime=999)
queue
+EOF

condor_submit env_grid.run
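
Once the job has been submitted you can watch it in the local queue and, after it finishes, inspect the files that are transferred back (a short sketch; the file names follow from the submit file above):

No Format

condor_q
# after the job leaves the queue, look at the returned output
ls -l env_grid.out.* env_grid.err.* env_grid.log.*
less env_grid.out.*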

Submitting a Job running SLIC

Now, finally, let's run SLIC (smile). We will use the SLIC installation and a data set that are available on the grid worker nodes. As in the previous examples, cut and paste the contents below:

No Format
rm -f slic_grid.csh
cat > slic_grid.csh << +EOF
#!/bin/csh
echo start
/bin/date
setenv LABELRUN slic_grid-\${ClusterProcess}
setenv TARFILE \${LABELRUN}-results.tar
echo \${TARFILE}
echo start
/bin/date
mkdir results
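# run SLIC from the shared SimDist installation against a stdhep
# input file in the shared data area, writing everything to ./results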
/grid/app/ilc/detector/SimDist/Oct-31-2007/SimDist/scripts/slic.sh -r 5 \
-g /grid/app/ilc/detector/SimDist/detectors/sid01/sid01.lcdd            \
-i /grid/data/ilc/detector/LDC/stdhep/ZZ_run10.stdhep -o ./results/ZZ_run10\${LABELRUN} >& \
./results/ZZ_run10\${LABELRUN}.lis
ls -lh results
/bin/date
echo "build output tarball: " \${TARFILE}
tar -cf \${TARFILE} results
echo done
+EOF
chmod +x slic_grid.csh

rm -f slic_grid.run
cat > slic_grid.run << +EOF
universe = grid
type = gt2
globusscheduler = fngp-osg.fnal.gov/jobmanager-condor
executable = ./slic_grid.csh
transfer_output = true
transfer_error = true
transfer_executable = true
environment = "ClusterProcess=\$(Cluster)-\$(Process)"
transfer_output_files = slic_grid-\$(Cluster)-\$(Process)-results.tar
log = slic_grid.log.\$(Cluster).\$(Process)
notification = NEVER
output = slic_grid.out.\$(Cluster).\$(Process)
error = slic_grid.err.\$(Cluster).\$(Process)
stream_output = false
stream_error = false
ShouldTransferFiles = YES
WhenToTransferOutput = ON_EXIT
globusrsl = (jobtype=single)(maxwalltime=999)
queue
+EOF

condor_submit slic_grid.run
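
When the job completes, the results come back as the tarball named in transfer_output_files above. A short sketch of unpacking and inspecting it (the exact contents depend on the options passed to slic.sh):

No Format

ls -l slic_grid-*-results.tar
tar -xf slic_grid-*-results.tar
ls -lh results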

Running Commands directly on the Head Node

To run commands directly on the grid head node, use syntax like this:

...

No Format
globus-job-run fngp-osg.fnal.gov/jobmanager-condor /bin/ls /grid/app/ilc/detector/SimDist/
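
For example, to see which stdhep input files are available for the SLIC jobs (the directory is the one used in the SLIC example above; its contents will vary):

No Format

globus-job-run fngp-osg.fnal.gov/jobmanager-condor /bin/ls /grid/data/ilc/detector/LDC/stdhep/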

Checking and Killing your Jobs

You can see the status of all jobs using the following command:

...