SLAC uses the IBM LSF (Load Sharing Facility) batch system. Please refer to the LSF documentation to become familiar with basic LSF usage.

LSF resources available to SLAC ATLAS users:

SLAC ATLAS users have their own dedicated LSF queue and resources. They can also use the "general fairshare" queues available to everyone at SLAC.

Dedicated LSF resources for ATLAS users

SLAC ATLAS users can run jobs in a dedicated LSF queue, "atlas-t3". The following command shows who can use the dedicated LSF resources, and who can add users to or remove users from them.

$ ypgroup exam -group atlas-t3
Group 'atlas-t3':
	GID:     3104
	Comment: 
	Last modified at Oct 14 00:22:52 2015 by yangw
	Owners:  sch, sudong, young, zengq 
	Members: acukierm, bpn7, laurenat, makagan, osgatlas01, rubbo, zengq, zihaoj

	This is a secondary group.

The above shows the UNIX group "atlas-t3". People on the "Owners" line can add or remove members of this group. People on the "Members" line can run jobs in the dedicated queue. (Owners are not automatically members.)
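As a rough illustration of reading that output, the sketch below checks whether a user name appears on the "Members" line. The function name is_member is made up for this example, and the sketch assumes the output keeps the layout shown above: a single line that starts with "Members:" followed by a comma-separated list.

```shell
#!/bin/sh
# is_member USER
# Reads the output of `ypgroup exam -group <group>` on stdin and succeeds
# (exit status 0) only if USER is listed on the "Members:" line.
# Hypothetical helper; assumes the output layout shown on this page.
is_member() {
    sed -n 's/^[[:space:]]*Members:[[:space:]]*//p' |  # keep only the member list
        tr ',' '\n' |                                  # one name per line
        tr -d ' \t' |                                  # drop padding around names
        grep -qxF -- "$1"                              # exact, whole-line match
}

# Possible use on a SLAC login node:
#   ypgroup exam -group atlas-t3 | is_member "$(id -un)" && echo "can use atlas-t3"
```

The match is exact, so a user name that is only a prefix of a listed name does not count as a member.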

The following is an example job script for users to submit jobs to the atlas-t3 queue:

$ cat job-script.sh 
#!/bin/sh
# run in LSF queue atlas-t3 and run up to 120 minutes (wall time)
#BSUB -q atlas-t3
#BSUB -W 120


cd /scratch
# create a work dir on batch node's /scratch space,
# named after the user name plus the shell's process ID
myworkdir=/scratch/`id -un`$$
mkdir $myworkdir
cd $myworkdir
# run payload
...run my task here...  &
wait  # wait for the task to finish 
# save the output to storage, use either "cp" to copy to NFS spaces, or "xrdcp" to copy to the xrootd spaces
cp myoutput_file /nfs/slac/g/atlas/u02/myoutput_file  
xrdcp myoutput_file root://atlprf01:11094//atlas/local/myoutput_file
# clean up
cd ..
rm -rf $myworkdir

$ bsub < job-script.sh  # submit the job
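The clean-up step at the end of the script only runs if the script gets that far. One possible variant, sketched below, removes the work directory from an EXIT trap so that it also happens when the payload fails. The function name run_in_scratch is hypothetical, and the sketch assumes mktemp is available on the batch nodes. It does not cover every case: a job killed with SIGKILL, for example, never runs the trap.

```shell
#!/bin/sh
# run_in_scratch BASE CMD [ARGS...]
# Runs CMD inside a fresh work directory under BASE and removes that
# directory when the command finishes, whether it succeeded or not.
# Hypothetical helper; the body runs in a subshell so the trap and the
# directory change do not leak into the caller.
run_in_scratch() (
    base=$1
    shift
    workdir=$(mktemp -d "$base/$(id -un).XXXXXX") || exit 1
    trap 'rm -rf "$workdir"' EXIT    # clean up on any exit from the subshell
    cd "$workdir" || exit 1
    "$@"
    status=$?
    exit "$status"                   # report the payload's exit status
)

# Possible use inside a job script (my_payload.sh is a placeholder and
# must save its output to NFS or xrootd before it returns):
#   run_in_scratch /scratch sh /path/to/my_payload.sh
```

Because the directory is gone once the function returns, copying the output with "cp" or "xrdcp" has to happen inside the command that is passed in.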

In the job script above, the #BSUB directives tell LSF that the batch queue is "atlas-t3" and the wall time limit is 120 minutes. Please always specify a wall time; otherwise, your job will be killed if it exceeds the default 30-minute wall time limit.
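To make a missing wall time harder to overlook, a small pre-submission check along the following lines could be used. This is only a sketch: has_walltime and submit_checked are hypothetical names, and the check looks only at #BSUB lines inside the script, so it does not see a -W option given on the bsub command line.

```shell
#!/bin/sh
# has_walltime FILE
# Succeeds if FILE contains a "#BSUB" line that carries a -W wall time
# limit, either on its own line or together with other options.
has_walltime() {
    grep -Eq -- '^#BSUB([[:space:]].*)?[[:space:]]-W[[:space:]]*[0-9]' "$1"
}

# submit_checked FILE
# Hypothetical wrapper: submits FILE with bsub only if it sets a wall time.
submit_checked() {
    if has_walltime "$1"; then
        bsub < "$1"
    else
        echo "not submitting $1: no '#BSUB -W' wall time directive found" >&2
        return 1
    fi
}

# Possible use:
#   submit_checked job-script.sh
```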

 
