Here's a bunch of random tricks I use, some SLAC-related, but many are general to ATLAS.
If you leave a window open too long, your Kerberos ticket will expire and you won't be able to write to your /afs home directory.
Get a new ticket with:
kinit <username>
If you need a particular database release (typically for running over data),
you can select it by adding this to your cmthome/requirements file:
set DBRELEASE_OVERRIDE 7.1.1
I still get a "Word too long" message sometimes after setting up an ATLAS release.
It seems to be from the PATH variable getting over a certain length that even bash can't handle.
You can fix it with this, which turns all the /afs/slac.stanford.edu to just /afs/slac, which works just as well:
export PATH=$(echo "$PATH" | sed 's%\.stanford\.edu%%g')
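The same substitution can be wrapped in a small function, so you can try it on a sample string before touching the real PATH. This is only a sketch: the function name shorten_afs_path is made up here, and it relies on /afs/slac being equivalent to /afs/slac.stanford.edu, as described above.

```shell
#!/bin/bash
# Hypothetical helper (the name is made up for illustration).
# Prints its argument with every ".stanford.edu" removed, so that
# /afs/slac.stanford.edu/... becomes /afs/slac/...
shorten_afs_path() {
    printf '%s\n' "$1" | sed 's%\.stanford\.edu%%g'
}

# Show how much shorter the current PATH would get, without changing it.
echo "before: ${#PATH} characters"
short=$(shorten_afs_path "$PATH")
echo "after:  ${#short} characters"
```

If the shortened value looks right, it can be applied with export PATH=$(shorten_afs_path "$PATH").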
To kill ALL your batch jobs at SLAC:
for j in $(bjobs | awk 'NR>1 {print $1}'); do bkill $j; echo $j; done
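To see what the job-ID extraction does without a batch system at hand, here is a small sketch that runs on made-up text. It assumes the bjobs output starts with one header line and has the job ID in the first column; the function name job_ids and the sample text are invented for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read "bjobs"-style text on stdin and print only the job IDs.
# Assumes line 1 is the column header and column 1 holds a numeric job ID.
job_ids() {
    awk 'NR > 1 && $1 ~ /^[0-9]+$/ { print $1 }'
}

# Made-up sample text, only for trying out the parsing.
sample='JOBID   USER    STAT  QUEUE   FROM_HOST   EXEC_HOST   JOB_NAME    SUBMIT_TIME
12345   ahaas   RUN   xlong   host1       host2       myjobname   Jun  1 12:00
12346   ahaas   PEND  xlong   host1                   myjobname   Jun  1 12:01'

printf '%s\n' "$sample" | job_ids
```

On a real system the loop would then read: for j in $(bjobs | job_ids); do bkill $j; done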
To run eclipse (see this page):
unset _JAVA_OPTIONS
/afs/slac.stanford.edu/g/atlas/work/a/ahaas/eclipse/eclipse
There's a lot more space in /nfs/slac/g/atlas/u01/users:
mkdir /nfs/slac/g/atlas/u01/users/<username>
cd; ln -s /nfs/slac/g/atlas/u01/users/<username> nfs2
Do this in a release, and then you can always just grep the packages.txt file to see where things are, or what versions are needed:
cmt show packages > packages.txt
Sometimes a digi job won't work (in 15.3.0?) because "chappy fails" on the input file.
The problem can be fixed by adding the right python lib directory to your LD_LIBRARY_PATH:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/afs/slac/g/atlas/b/sw/lcg/external/Python/2.5.4/slc4_ia32_gcc34/lib
One of my favorites, this will do a "fast" build, if you've only changed a src file:
cd cmt; make QUICK=1; cd ..
So useful for joining together lots of ROOT files from many jobs into a single ROOT file:
hadd -h #show how to use hadd
hadd -f step.root */*step.root #for instance
Sometimes when running over data it helps to put a link in your running directory:
mkdir sqlite200; ln -s /afs/cern.ch/user/a/atlcond/coolrep/sqlite200/COMP200.db sqlite200/ALLP200.db
Can run these on a POOL file to see what StoreGate keys are in there:
checkFile.py <file>
checkSG.py <file>
Can put this in a bash script near the top, to check if you have a GRID cert:
voms-proxy-info
if [ $? -ne 0 ]; then echo You need to get a GRID cert; exit 1; fi
To get a list of filenames (to load into athena) from a given dataset:
dq2-ls -f -p -H $1 | sed "s%srm://osgserv04.slac.stanford.edu:8443/srm/v2/server?SFN=/xrootd/atlas/%filelist += [\"root://atl-xrdr//atlas/xrootd/%g" | sed "s%$%\"]%g" | grep xrootd
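The sed pipeline above is easier to follow when it is split out and run on a sample line. Below is a sketch under the assumption that the input has one SRM URL per line; the function name srm_to_filelist and the sample file path are made up, while the SRM prefix and the xrootd redirector are the ones from the command above.

```shell
#!/bin/bash
# Hypothetical helper: turn SRM URLs (one per line on stdin) into athena-style
# "filelist += [...]" lines that go through the xrootd redirector.
srm_to_filelist() {
    grep xrootd |
    sed -e 's%srm://osgserv04\.slac\.stanford\.edu:8443/srm/v2/server?SFN=/xrootd/atlas/%filelist += ["root://atl-xrdr//atlas/xrootd/%' \
        -e 's%$%"]%'
}

# Made-up sample input; the file path after /xrootd/atlas/ is invented.
echo 'srm://osgserv04.slac.stanford.edu:8443/srm/v2/server?SFN=/xrootd/atlas/dq2/sample/file1.root' | srm_to_filelist
```

The first expression swaps the SRM prefix for the start of a python list entry, and the second one closes the quote and bracket at the end of each line.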
This gets a ROOT file with info on a given data run (mag field configuration, #events, streams, etc.):
#!/bin/bash
#gets a ROOT file with info on a run (takes run number as argument)
wget "http://atlas-runquery.cern.ch/query.py?q=find+run+${1}+%2F+show+all+%2F+nodef"
sleep 3
wget http://atlas-runquery.cern.ch/data/atlrunquery.root
rm -v query.py\?q\=find+run*
There are a few handy athena options (I like -s and -c, etc.):
athena -h #show athena help
To specify more than one parameter on the input line:
athena -c "DECAY=True; TIMESHIFT=25;" share/jobOptions.pythiaRhad.py
Sometimes my JiveXML files get messed up and can't be read, due to a binary character in the trigger string. Fix it with:
#!/bin/bash
for f in JiveXML*; do sed -i '/Obeys/d' $f ; done
for f in JiveXML*; do sed -i 's/<trigInfoStreamTag>/<trigInfoStreamTag>fixJive/' $f ; done
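The two sed passes can also be combined into a single call per file. Here is a minimal sketch, assuming GNU sed (for the -i option used above); the function name fix_jivexml is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: apply both JiveXML fixes to one file, in place.
#  1. delete every line containing "Obeys"
#  2. insert "fixJive" right after the <trigInfoStreamTag> opening tag
fix_jivexml() {
    sed -i -e '/Obeys/d' \
           -e 's/<trigInfoStreamTag>/<trigInfoStreamTag>fixJive/' "$1"
}

# Usage (run in the directory holding the files):
#   for f in JiveXML*; do fix_jivexml "$f"; done
```

It is probably wise to try this on a copy of one file first, since the edit is done in place.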
If a script is expecting a particular ATLAS release version, you can check it with:
#!/bin/bash
if [ "$AtlasVersion" != "15.3.1" ]; then echo "Go to a cmthome and do . setup.sh -tag=15.3.1"; exit 1; fi
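If several scripts need this check, it can be turned into a function that takes the wanted release as an argument. This is just a sketch; the name require_release is made up, and the only thing it relies on is the AtlasVersion variable used in the check above.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if $AtlasVersion matches the wanted release.
# The variable is quoted (with an empty default) so that an unset AtlasVersion
# produces the normal message rather than a shell syntax error.
require_release() {
    local wanted=$1
    if [ "${AtlasVersion:-}" != "${wanted}" ]; then
        echo "Go to a cmthome and do . setup.sh -tag=${wanted}" >&2
        return 1
    fi
}

# Usage near the top of a script:
#   require_release 15.3.1 || exit 1
```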
Check out all the CSC transforms:
csc_<tab> #will show them all... look at csc_atlasG4_trf.py, csc_digi_trf.py, csc_reco_trf.py, etc...
This will actually put your files into the catalog, so you don't get annoying warnings:
pool_insertFileToCatalog <file>
When running on the batch farm, you really should write things out into the /scratch area on the batch node during the job,
and then cp it all back at the end of the job, to prevent hammering on NFS. Here's an example script:
#!/bin/bash
. /u/at/ahaas/cmthome/setup.sh -tag=15.3.0 #set up the ATLAS release
#make a variable name for the directory, which is the number of seconds since 1970
export d=`date +%s`; echo $d
#make a scratch area on the local machine
mkdir -p /scratch/ahaas/${d}/temp
cd /scratch/ahaas/${d}; pwd
#run your stuff here
athena.py /u/at/ahaas/reldirs/15.3.0/Generators/Pythia_i/share/jobOptions.pythiaRhad.py > temp/pyth.log.txt
#all outputs of the athena job that are important should get put into the temp directory too...
#copy back results in the temp directory to some nfs directory
pwd; ls -lh temp
export dd=`date +%s`; echo $dd #this will add the end time of the job to the temp directory output name
if [ -e /nfs/slac/g/atlas/u01/users/ahaas/temp/rh_production_stripped_files/temp_${d}_${dd} ]
then
    echo Destination directory already exists
else
    mv -v /scratch/ahaas/${d}/temp /nfs/slac/g/atlas/u01/users/ahaas/temp/rh_production_stripped_files/temp_${d}_${dd}
fi
cd; pwd; rm -rfv /scratch/ahaas/${d}
echo done
You could run this batch script above (put in a file called myjob.sh) with:
chmod +x myjob.sh #don't forget to make the script executable
bsub -q xlong -R rhel40 -J myjobname time myjob.sh
The xlong queue will kill your job after 177.6 hours of CPU time in "SLAC units", which is roughly 15 hours of real CPU time.
See all queues with "bqueues". You can see the details of a queue with "bqueues -l xlong".
Note the "-R rhel40" above, which forces your job onto a machine compatible with the ATLAS releases (gcc34, RHEL4).
"bhosts -R rhel40" will show you which batch nodes are in that list.
"lsinfo -r" will show you all resource lists, like the rhel40 one.
Check your batch jobs with "bjobs".
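The scratch-area pattern from the long batch script above (work locally, move the results back once at the end) can be condensed into a reusable function. The following is a sketch under a few assumptions: the name run_in_scratch and its arguments are made up, it uses mktemp instead of a timestamp for the directory name, and unlike the script above it leaves the results in scratch when the destination already exists.

```shell
#!/bin/bash
# Hypothetical helper: run a command inside a private scratch directory and
# move its output directory to a destination when the command has finished.
# Usage: run_in_scratch <scratch_base> <dest_dir> <command> [args...]
run_in_scratch() {
    local base=$1 dest=$2
    shift 2
    local work status
    # Unique working directory under the scratch base.
    work=$(mktemp -d "${base}/job.XXXXXX") || return 1
    mkdir "${work}/temp"
    # Run the payload from inside the scratch directory; the log goes into
    # temp/ next to whatever outputs the payload writes there.
    ( cd "${work}" && "$@" > temp/job.log.txt 2>&1 )
    status=$?
    if [ -e "${dest}" ]; then
        echo "Destination ${dest} already exists; results left in ${work}/temp" >&2
        return 1
    fi
    # One move at the end instead of many small writes to NFS during the job.
    mv "${work}/temp" "${dest}" || return 1
    rm -rf "${work}"
    return "${status}"
}

# Usage inside a batch script (paths are placeholders):
#   run_in_scratch /scratch/<username> /nfs/slac/g/atlas/u01/users/<username>/out_$(date +%s) athena.py myJobOptions.py
```

The function returns the exit status of the payload, so the batch system can still see whether the job itself failed.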
Here's a much simpler "myjob.sh" script you could use for small things (just run it from a directory with the jobOptions.G4Atlas_Sim.py file in it):
#!/bin/bash
. ~/cmthome/setup.sh -tag=15.3.0
athena.py jobOptions.G4Atlas_Sim.py > athena_sim.out.txt
Don't forget to set random seeds when you're generating MC with custom scripts! This python code helps set some:
import random
random.seed()
R1=random.randint(0,100000000)
R2=random.randint(0,100000000)
R3=random.randint(0,100000000)
R4=random.randint(0,100000000)
PYTHR = "PYTHIA "+str(R1)+" "+str(R2)
PYTHRI = "PYTHIA_INIT "+str(R3)+" "+str(R4)
print(PYTHR)
print(PYTHRI)
This will print out the MC truth:
from AthenaCommon.AlgSequence import AlgSequence
job=AlgSequence()
from TruthExamples.TruthExamplesConf import PrintMC
PrintMC.PrintStyle = "Vertex"
job += PrintMC()
To transfer large files from CERN to SLAC (or SLAC to CERN, etc.), use bbcp. You can get up to ~1GB/min!
This page describes its usage.