Best Practices for Using the SLAC Batch System
Version of 6/8/2015
...
Shared Resource | Use | Notes
---|---|---
batch machines | LSF general queues | fell, hequ, kiso, dole, bullet
interactive login machines | light load, short-running, development/test | rhel6-64, rhel5-64, centos7
NFS disks | site-wide, medium performance | Fermi group and user space; in general NOT backed up
AFS disks | global, medium performance | $HOME directories for all users; these areas are backed up
xroot disks | site-wide, high performance | Fermi storage for bulk data
network facilities | switches & fabric |
It is your responsibility to assess the impact of any significant computing project you wish to run, to ensure it will not unduly stress the system or make it unusable for other users. Such an assessment may start with running successively larger numbers of jobs while carefully monitoring the impact on key servers. In addition, there are some known problems that one must take care to avoid. This document attempts to provide some hints on preparing your batch jobs and assessing their impact.
...
Tip: Please also see this page to learn how to get your batch jobs to start running sooner.
...
Known problems to avoid
PFILE (and other) Simultaneous File Writing Conflicts
Parameter files, "PFILES", are used by the Fermi ScienceTools and FTools to store default and last-used options for the commands within these packages. Normally, these small files are stored in $HOME/pfiles and are rewritten each time a command is invoked. If multiple jobs attempt to access these files simultaneously, an unfortunate and painful conflict will result. Not only will your jobs fail to give reliable results, but this sort of activity is very demanding on file servers and can cause severely degraded performance for all users.
Warning: This problem may also occur with other files, typically a "dot file" or a "dot directory" in your $HOME directory. Therefore, it is good practice to redefine $HOME to a non-shared scratch directory for all projects requiring multiple, simultaneous batch jobs.
...
Therefore, PFILES should be written to directories which are unique for each job, e.g.,
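A minimal sketch, assuming a bash batch script; the scratch path is illustrative, LSB_JOBID is set by LSF inside batch jobs, and the read-only system pfiles location ($HEADAS/syspfiles here) depends on your ScienceTools/FTools installation:

```bash
# Job-unique scratch area (illustrative path).
JOBDIR=/scratch/$USER/job_${LSB_JOBID:-$$}
mkdir -p "$JOBDIR/pfiles"

# Writable pfiles go before the semicolon; the installation's read-only
# system pfiles go after it.
export PFILES="$JOBDIR/pfiles;$HEADAS/syspfiles"

# Alternative (see the warning above): redefine HOME itself to a
# non-shared scratch directory.
# export HOME="$JOBDIR"
```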
...
One can sometimes get a traceback from a truncated core dump file, but not much more, so the favored approach is to disable core files completely.
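In a bash/sh batch script this is a one-liner (for csh/tcsh the equivalent is "limit coredumpsize 0"):

```bash
# Set the maximum core file size to zero so the job never dumps core to disk.
ulimit -c 0
```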
...
Minimizing stress on file servers
...
The number of simultaneous jobs that may be run without causing severe stress will vary depending upon exactly what the jobs are doing. For example, some jobs perform heavy I/O at the beginning and end, while others perform I/O continuously. Every job is a bit different and so requires its own assessment.
Info: Any large task of more than a few tens of batch jobs must be ramped up slowly in order to allow for monitoring the relevant servers for adverse impact. Additionally, some tasks may require you to trickle in jobs rather than submitting them as a large batch, to prevent overloading of, for example, the Gleam and/or ScienceTools code server.
General Guidelines for Using Remote File Servers
The most basic rule is to avoid prolonged I/O to a remote file server. (The one exception is xroot, which seems able to handle very large loads of this type.) This includes file reading and writing as well as directory operations, such as creating, opening, closing, and deleting files. A good way to design your job is to copy needed input files to local scratch space for reading, write output data products to local scratch, and then copy them to a remote file system at job completion.
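A minimal sketch of this stage-in/compute/stage-out pattern, assuming a bash batch script; the scratch path, the data paths, and the my_analysis command are illustrative, and LSB_JOBID is set by LSF inside batch jobs:

```bash
#!/bin/bash
# Per-job local scratch area (illustrative path).
SCRATCH=/scratch/$USER/job_${LSB_JOBID:-$$}
mkdir -p "$SCRATCH"

# Stage in: copy the input from the remote file server once, at job start.
cp /nfs/farm/g/glast/u55/mydata/input.fits "$SCRATCH/"

# Compute: all prolonged reading and writing happens on the local disk.
cd "$SCRATCH"
my_analysis input.fits output.fits

# Stage out: copy the products back to remote storage once, at job end.
cp output.fits /nfs/farm/g/glast/u55/mydata/results/

# Clean up the shared local scratch space.
rm -rf "$SCRATCH"
```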
...
operation | local | xroot | NFS | AFS | Notes
---|---|---|---|---|---
writing large files (>100MB) | | | | |
reading large files (>100MB) | | | | |
writing small files (<100MB) | | | | | okay only in small numbers (NFS/AFS)
reading small files (<100MB) | | | | |
copying whole files (local<->remote) | | | | | typically at job start and/or end
frequently creating/opening/closing/deleting files | | | | | best to avoid this completely
frequently stat'ing files | | | | |
multiple jobs writing to the same file | | | | | don't do this!
...
- xroot is the repository for Fermi on-orbit data and Monte Carlo data. It is readable by anyone, but typically not writable except by pipeline accounts.
- NFS refers to the collection of servers dedicated for Fermi use. Typically one server (machine) has multiple disks attached, so that stressing a server can cause a problem for multiple groups of users.
- AFS is the filesystem used for user $HOME directories and a relatively small collection of Fermi group areas. This is a world-accessible file system (if one has proper access credentials) and caches files on local machines.
...
Finally, keep track of how much space is available on the directories/partitions you write to. Writing to a 100% full partition is known to cause a lot of stress on the file server. It is easy to check the available space: cd to the directory of interest and then issue the "df -h ." command, which will tell you the size and remaining space on that partition. (Less frequently, one may encounter a different limit: inodes. Check your inode quota with the "df -hi ." command. This quota runs against the sum: number of files + directories + symlinks)
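For example, assuming a hypothetical directory of interest:

```bash
cd /nfs/farm/g/glast/u55/mydata   # illustrative path
df -h .     # size and remaining space on this partition
df -hi .    # inode usage (files + directories + symlinks)
```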
Local Scratch Space
Local scratch directories are available on all SLAC linux machines. They vary in size from several GB to several 100 GB. This space is shared by all users logged into a given host. On batch machines, it is vitally important to clean up this space at the end of your job or it will, over time, fill up (and this has happened). Common practice for using scratch space is to create a directory with your username and put all files in there. Note that if using the Fermi pipeline to manage your job flow, you will need to devise a 'stream-dependent' method of naming your scratch sub-directories to prevent jobs running on the same host from overwriting each other's files.
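One way to do this, as a minimal sketch assuming bash; the scratch path is illustrative, LSB_JOBID is set by LSF inside batch jobs, and STREAM stands for whatever stream identifier your pipeline task passes in:

```bash
# One directory per user, one subdirectory per pipeline stream (or LSF job).
SCRATCH=/scratch/$USER/${STREAM:-job_$LSB_JOBID}
mkdir -p "$SCRATCH"

# Remove the directory when the job exits, even on failure, so the
# shared local disk does not fill up over time.
trap 'rm -rf "$SCRATCH"' EXIT
```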
...
Finally, note that all linux machines have a /tmp disk partition. It is strongly recommended that /tmp NOT be used, because of the danger of it becoming full, which can cause the machine to crash.
Monitoring remote file servers
First, one must identify the server holding all of the job's needed input and future output files.
- 'cd' to the location holding the existing or future file
- 'df .' will give you one of the following answers:
AFS
```
Filesystem  1K-blocks  Used Available Use% Mounted on
AFS           9000000     0   9000000   0% /afs
```
This tells you it is an AFS server. Then, follow up with the command 'fs whereis .', e.g.,
```
File . is on host afs03.slac.stanford.edu
```
The server is afs03
old NFS (single wain-class server)

```
Filesystem                   1K-blocks    Used Available Use% Mounted on
wain025:/g.glast.u55/dragon   10485760 3731456   6754304  36% /nfs/farm/g/glast/u55/dragon
```

This is an NFS location and the server is wain025

new NFS (GPFS as the underlying file system with clustered NFS 'heads', as of May 2015)

```
Filesystem                            Size  Used Avail Use% Mounted on
fermi-cnfslb1:/gpfs/slac/fermi/fs2/u   10G  3.6G  6.5G  36% ...
```

This is also an NFS location, but one must monitor four different servers: fermi-gpfs03, fermi-gpfs04, fermi-cnfs01, fermi-cnfs02
local
```
Filesystem  1K-blocks    Used Available Use% Mounted on
/dev/sda1    18145092 8530464   8692904  50% /
```
This is a local disk and will not be visible to any batch jobs.
...
- Store analysis code and scripts in your AFS home directories (which are backed up)
- Assessment. For every new task, assess its impact on key servers to ensure they will not be overloaded
- File staging. Files that remain open for the duration of the job (either reading or writing) should be located in local scratch space. Copy needed input files to local scratch at the beginning of your job; write output data products to their final destinations at the end of the job.
- Submitting jobs.
- Never submit a large number (more than ~50) of jobs without first assessing their impact on key shared resources.
- If your jobs are known to produce a large I/O load only during the start-up phase, then submit jobs in small batches, wait for those to run and pass the start-up phase and only then submit another small batch, etc.
- If you are planning a large batch operation of, say, more than 50 simultaneous jobs, please inform and coordinate with SAS management (Richard Dubois).
- PFILES. Arrange that the parameter files for ScienceTools, FTools, etc. be stored in a directory unique to the batch job.
- Avoid disk thrashing.
- Completely disable core dumps.
- Avoid unnecessary file open() and close() operations, as well as file creates/deletes.
- Avoid writing to a full disk partition.
- Cleanup. Be sure to perform a cleanup on the local scratch space after your jobs have completed!