ATLAS computing tricks

Here's a bunch of random tricks I use, some SLAC-related, but many are general to ATLAS.

If you leave a window open too long, your Kerberos ticket will expire and you won't be able to write to your /afs home directory.
Get a new ticket with:

kinit <username>

If you need a particular database release, for running over data typically,
you can set the database release you want to use by adding to your cmthome/requirements file:

 set DBRELEASE_OVERRIDE 7.1.1

I still get a "Word too long" message sometimes after setting up an ATLAS release.
It seems to come from the PATH variable growing past a length that even bash can't handle.
You can fix it with this, which shortens every /afs/slac.stanford.edu to just /afs/slac (which works just as well):

export PATH=`echo $PATH | sed 's%\.stanford\.edu%%g'`
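
If trimming the domain isn't enough, dropping repeated entries can shorten PATH further, in case your setup ends up adding the same directories more than once. A minimal sketch, assuming bash; shorten_path is a made-up helper name, not part of any ATLAS setup, and the example list is invented:

```shell
#!/bin/bash
# shorten_path is a hypothetical helper (not an ATLAS tool).
# It strips ".stanford.edu" from a colon-separated list and drops
# repeated entries, keeping the first occurrence of each.
shorten_path() {
    local input=${1//.stanford.edu/}
    local out="" entry
    local IFS=:
    for entry in $input; do
        case ":$out:" in
            *":$entry:"*) ;;                   # already in the list, skip it
            *) out="${out:+$out:}$entry" ;;    # new entry, append it
        esac
    done
    printf '%s\n' "$out"
}

# Example with a made-up list
shorten_path "/afs/slac.stanford.edu/g/atlas/bin:/usr/bin:/afs/slac.stanford.edu/g/atlas/bin:/bin"
```

To apply it to your real PATH you would do: export PATH=`shorten_path "$PATH"`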

To kill ALL your batch jobs at SLAC:

for j in `bjobs | tail -n +2 | cut -f 1 -d " "`; do bkill $j; echo $j; done #tail skips the bjobs header line

To run eclipse (see this page):

unset _JAVA_OPTIONS
/afs/slac.stanford.edu/g/atlas/work/a/ahaas/eclipse/eclipse

There's a lot more space in /nfs/slac/g/atlas/u01/users:

mkdir /nfs/slac/g/atlas/u01/users/<username>
cd; ln -s /nfs/slac/g/atlas/u01/users/<username> nfs2

Do this in a release, and then you can always just grep the packages.txt file to see where things are, or what versions are needed:

cmt show packages > packages.txt

Sometimes a digi job won't work (in 15.3.0?) because "chappy fails" on the input file.
The problem can be fixed by adding the right Python library directory to your LD_LIBRARY_PATH:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/afs/slac/g/atlas/b/sw/lcg/external/Python/2.5.4/slc4_ia32_gcc34/lib

One of my favorites, this will do a "fast" build, if you've only changed a src file:

cd cmt; make QUICK=1; cd ..

So useful for joining together lots of ROOT files from many jobs into a single ROOT file:

hadd -h #show how to use
hadd -f step.root */*step.root #for instance

Sometimes when running over data it helps to put a link in your running directory:

mkdir sqlite200; ln -s /afs/cern.ch/user/a/atlcond/coolrep/sqlite200/COMP200.db sqlite200/ALLP200.db

Can run these on a POOL file to see what StoreGate keys are in there:

checkFile.py <file>
checkSG.py <file>

Put this near the top of a bash script to check that you have a valid GRID proxy:

voms-proxy-info
if [ $? -ne 0 ] ; then echo "You need to get a GRID cert"; exit 1; fi

To get a list of filenames (to load into athena) from a given dataset:

dq2-ls -f -p -H $1 | sed "s%srm://osgserv04.slac.stanford.edu:8443/srm/v2/server?SFN=/xrootd/atlas/%filelist += [\"root://atl-xrdr//atlas/xrootd/%g" | sed "s%$%\"]%g" | grep xrootd
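
That one-liner is dense, so here is the same rewrite spelled out as a sketch. It assumes each input line is a bare SRM URL with the prefix used above; srm_to_filelist is a made-up function name, and the file path in the example is invented:

```shell
#!/bin/bash
# srm_to_filelist is a hypothetical helper name.
# Reads SRM URLs on stdin and prints one athena "filelist += [...]" line
# per URL, with the SRM prefix swapped for the xrootd one.
# Lines that do not start with the SRM prefix are dropped.
srm_to_filelist() {
    local srm_prefix='srm://osgserv04.slac.stanford.edu:8443/srm/v2/server?SFN=/xrootd/atlas/'
    local xrd_prefix='root://atl-xrdr//atlas/xrootd/'
    local line
    while IFS= read -r line; do
        case "$line" in
            "$srm_prefix"*)
                printf 'filelist += ["%s%s"]\n' "$xrd_prefix" "${line#"$srm_prefix"}"
                ;;
        esac
    done
}

# Example with an invented file path
echo 'srm://osgserv04.slac.stanford.edu:8443/srm/v2/server?SFN=/xrootd/atlas/dq2/user/example.root' | srm_to_filelist
```

With a real dataset you would pipe the listing in: dq2-ls -f -p -H <dataset> | srm_to_filelist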

This gets a ROOT file with info on a given data run (mag field configuration, #events, streams, etc.):

#!/bin/bash
#gets a ROOT file with info on a run (takes run number as argument)
wget "http://atlas-runquery.cern.ch/query.py?q=find+run+${1}+%2F+show+all+%2F+nodef"
sleep 3
wget http://atlas-runquery.cern.ch/data/atlrunquery.root
rm -v query.py\?q\=find+run*

There are a few handy athena options (I like -s and -c, etc.):

athena -h #show athena help

To specify more than one parameter on the input line:

athena -c "DECAY=True; TIMESHIFT=25;" share/jobOptions.pythiaRhad.py

Sometimes my JiveXML files get messed up and can't be read, due to a binary character in the trigger string. Fix it with:

#!/bin/bash
for f in JiveXML*; do sed -i '/Obeys/d' $f ; done
for f in JiveXML*; do sed -i 's/<trigInfoStreamTag>/<trigInfoStreamTag>fixJive/' $f ; done
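
The same two fixes can be done in one sed pass per file while keeping a backup, in case the edit goes wrong. This is only a sketch; fix_jive_file is a made-up name, and the backup suffix .bak is my choice:

```shell
#!/bin/bash
# fix_jive_file is a hypothetical helper name.
# Deletes lines containing "Obeys" and tags the trigger stream string,
# in a single sed pass. The untouched original is kept as <file>.bak.
fix_jive_file() {
    sed -i.bak \
        -e '/Obeys/d' \
        -e 's/<trigInfoStreamTag>/<trigInfoStreamTag>fixJive/' \
        "$1"
}

# usage, from the directory with the event files:
# for f in JiveXML*; do fix_jive_file "$f"; done
```

Once the fixed files load properly you can remove the .bak copies.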

If a script is expecting a particular ATLAS release version, you can check it with:

#!/bin/bash
if [ "$AtlasVersion" != "15.3.1" ]; then echo "Go to a cmthome and do . setup.sh -tag=15.3.1"; exit 1; fi
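
If several scripts need this check, it can be wrapped in a small function. A minimal sketch; require_release is a made-up name, and it only looks at the AtlasVersion variable used above:

```shell
#!/bin/bash
# require_release is a hypothetical helper name.
# Returns 0 if $AtlasVersion matches the wanted release; otherwise
# prints a hint to stderr and returns 1. An unset AtlasVersion counts
# as a mismatch instead of producing a shell syntax error.
require_release() {
    local wanted=$1
    if [ "${AtlasVersion:-}" != "$wanted" ]; then
        echo "Found release '${AtlasVersion:-none}'; go to a cmthome and do . setup.sh -tag=$wanted" >&2
        return 1
    fi
}

# near the top of a script:
# require_release 15.3.1 || exit 1
```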

Check out all the CSC transforms:

csc_<tab> #will show them all... look at csc_atlasG4_trf.py, csc_digi_trf.py, csc_reco_trf.py, etc...

This will actually put your files into the catalog, so you don't get annoying warnings:

pool_insertFileToCatalog <file>

When running on the batch farm, you really should write things out into the /scratch area on the batch node during the job,
and then cp it all back at the end of the job, to prevent hammering on NFS. Here's an example script:

#!/bin/bash

. /u/at/ahaas/cmthome/setup.sh -tag=15.3.0 #setup the ATLAS release

#make a variable name for the directory which is the number of seconds since 1970
export d=`date +%s`; echo $d

#make a scratch area on the local machine
mkdir -p /scratch/ahaas/${d}/temp; cd /scratch/ahaas/${d}; pwd;

#run your stuff here
athena.py /u/at/ahaas/reldirs/15.3.0/Generators/Pythia_i/share/jobOptions.pythiaRhad.py >  temp/pyth.log.txt
#all outputs of the athena job that are important should get put into the temp directory too...

#copy back results in the temp directory to some nfs directory
pwd; ls -lh temp
export dd=`date +%s` ; echo $dd #this will add the end time of the job to the temp directory output name
if [ -e /nfs/slac/g/atlas/u01/users/ahaas/temp/rh_production_stripped_files/temp_${d}_${dd} ]
 then echo Destination directory already exists
 else mv -v /scratch/ahaas/${d}/temp /nfs/slac/g/atlas/u01/users/ahaas/temp/rh_production_stripped_files/temp_${d}_${dd}
fi
cd; pwd; rm -rfv /scratch/ahaas/${d}
echo done
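
A variation on the script above, as a sketch: mktemp -d gives a unique scratch directory even if two jobs start in the same second on the same node, and the scratch area is cleaned up whether or not the job succeeded. SCRATCH_BASE, RESULT_BASE and run_in_scratch are made-up names; in a real job you would point the first two at /scratch/<username> and your NFS area.

```shell
#!/bin/bash
# Sketch only: SCRATCH_BASE, RESULT_BASE and run_in_scratch are hypothetical names.
# The defaults point at the temp directory so the sketch can be tried anywhere.
SCRATCH_BASE=${SCRATCH_BASE:-${TMPDIR:-/tmp}}
RESULT_BASE=${RESULT_BASE:-${TMPDIR:-/tmp}/results_demo}

# usage: run_in_scratch command [args...]
# Runs the command inside a fresh scratch directory, logs its output to
# temp/job.log.txt, moves temp/ into RESULT_BASE, prints the destination,
# and removes the scratch directory. Returns the command's exit status.
run_in_scratch() {
    local work dest status
    work=$(mktemp -d "$SCRATCH_BASE/job.XXXXXX") || return 1
    mkdir -p "$work/temp" "$RESULT_BASE"
    ( cd "$work" && "$@" > temp/job.log.txt 2>&1 )
    status=$?
    # name the output after the scratch directory and the end time of the job
    dest="$RESULT_BASE/temp_$(basename "$work")_$(date +%s)"
    mv "$work/temp" "$dest" && echo "$dest"
    rm -rf "$work"
    return $status
}
```

After setting up the release, the job line would then be something like run_in_scratch athena.py <full path to jobOptions>, with the job writing its important outputs into temp/ as before.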

You could run this batch script above (put in a file called myjob.sh) with:

chmod +x myjob.sh #don't forget to make the script executable
bsub -q xlong -R rhel40 -J myjobname time ./myjob.sh

The xlong queue will kill your job after 177.6 hours of CPU time in "SLAC units", which is about 15 hours of real CPU time.
See all queues with "bqueues". You can see the details of a queue with "bqueues -l xlong".
Note the "-R rhel40" above, which forces your job onto a machine compatible with the ATLAS releases (gcc34, RHEL4).
"bhosts -R rhel40" will show you which batch nodes are in that list.
"lsinfo -r" will show you all resource lists, like the rhel40 one.
Check your batch jobs with "bjobs".

Here's a much simpler "myjob.sh" script you could use for small things (just run it from a directory with the jobOptions.G4Atlas_Sim.py file in it):

#!/bin/bash
. ~/cmthome/setup.sh -tag=15.3.0
athena.py jobOptions.G4Atlas_Sim.py > athena_sim.out.txt

Don't forget to set random seeds when you're generating MC with custom scripts! This python code helps set some:

import random
random.seed()
R1=random.randint(0,100000000)
R2=random.randint(0,100000000)
R3=random.randint(0,100000000)
R4=random.randint(0,100000000)
PYTHR = "PYTHIA "+str(R1)+" "+str(R2)
PYTHRI = "PYTHIA_INIT "+str(R3)+" "+str(R4)
print PYTHR
print PYTHRI

This will print out the MC truth:

from AthenaCommon.AlgSequence import AlgSequence
job=AlgSequence()
from TruthExamples.TruthExamplesConf import PrintMC
PrintMC.PrintStyle = "Vertex"
job += PrintMC()

To transfer large files from CERN to SLAC (or SLAC to CERN, etc.), use bbcp. You can get up to ~1GB/min!
This page describes its usage.
