Best Practices for Using the SLAC Batch System
Version of 78/2927/2014
Contents:
...
- Store analysis code and scripts in your AFS home directories (which are backed up)
- Assessment. For every new task, assess its impact on key servers to ensure they will not be overloaded.
- File staging. Files that remain open for the duration of the job (either reading or writing) should be located in local scratch space. Copy needed input files to local scratch at the beginning of your job; write output data products to their final destinations at the end of the job.
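The staging pattern above can be sketched as follows. To keep the sketch self-contained, `mktemp` stands in for the real locations; an actual job would stage from AFS/NFS input areas into node-local scratch (e.g. under `/scratch`, keyed by `$LSB_JOBID` on LSF) and ship outputs to their real destination.

```shell
#!/bin/bash
# File-staging sketch. mktemp directories stand in for the real input
# area, local scratch, and final output destination.
INPUT_DIR=$(mktemp -d)   # stand-in for an AFS/NFS input area
FINAL_DIR=$(mktemp -d)   # stand-in for the final output destination
echo "event data" > "$INPUT_DIR/events.dat"

SCRATCH=$(mktemp -d)     # stand-in for node-local scratch space

# 1. Stage the input to local scratch at job start.
cp "$INPUT_DIR/events.dat" "$SCRATCH/"

# 2. Run the analysis entirely against local scratch
#    (a trivial transformation stands in for the real analysis).
tr a-z A-Z < "$SCRATCH/events.dat" > "$SCRATCH/results.dat"

# 3. Ship the output product to its final destination at job end.
cp "$SCRATCH/results.dat" "$FINAL_DIR/"
rm -rf "$SCRATCH"
```

All file handles held open during the run point at local scratch, so the shared filesystems see only one read at the start and one write at the end.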
- Submitting jobs.
- Never submit a large number (more than ~50) of jobs without first assessing their impact on key shared resources.
- If your jobs are known to produce a large I/O load only during the start-up phase, submit them in small batches: wait for each batch to run and pass the start-up phase before submitting the next.
- If you are planning a large batch operation of, say, more than 50 simultaneous jobs, please inform and coordinate with SAS management (Richard Dubois).
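The batched-submission advice above can be sketched as a small shell helper. The submit command, job count, and batch size below are illustrative; `bsub -K` is LSF's submit-and-wait-for-completion mode, which is a conservative stand-in for "wait until the batch has passed its start-up phase".

```shell
#!/bin/bash
# Batched-submission sketch. The submit command is passed in so the
# batching logic itself is generic; "bsub -K ./run_analysis.sh" below
# is an illustrative LSF invocation, and run_analysis.sh a placeholder.
submit_in_batches() {
    local submit_cmd=$1 njobs=$2 batch_size=$3
    local i j
    for ((i = 0; i < njobs; i += batch_size)); do
        for ((j = i; j < i + batch_size && j < njobs; j++)); do
            $submit_cmd "$j" &      # submit one job of this batch
        done
        wait   # let this batch run to completion before submitting more
    done
}

# Real use (assumes LSF): submit_in_batches "bsub -K ./run_analysis.sh" 50 10
```

Waiting on `bsub -K` is stricter than strictly necessary (it waits for full completion, not just start-up), but it guarantees the start-up I/O spikes of successive batches never overlap.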
- PFILES. Arrange for the parameter files for the ScienceTools, FTools, etc. to be stored in a directory unique to each batch job.
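A minimal sketch of a per-job PFILES setup: here `mktemp` stands in for a job-unique scratch directory, and `$HEADAS/syspfiles` is assumed to be the read-only system parameter-file area set up by the FTools environment script.

```shell
#!/bin/bash
# Per-job PFILES sketch. A job-unique parameter-file directory keeps
# concurrent jobs from clobbering each other's .par files.
PFDIR="$(mktemp -d)/pfiles"    # stand-in for a job-unique scratch path
mkdir -p "$PFDIR"

# FTools/ScienceTools use the writable user area before the ";" and
# the read-only system area after it.
export PFILES="$PFDIR;$HEADAS/syspfiles"
```

With this in place, each job reads and writes its own copies of the `.par` files instead of contending for a shared `~/pfiles` directory.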
- Avoid disk thrashing:
- Completely disable core dumps.
- Avoid unnecessary file open() and close() operations.
- Avoid writing to a full disk partition.
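Two of the checks above can be done at the top of a job script. This is a sketch: the 10 MB free-space threshold and the `OUTDIR` default are arbitrary illustrative values.

```shell
#!/bin/bash
# Disable core dumps for this job: even a single multi-GB core file
# written to a shared partition can thrash the disk.
ulimit -c 0

# Refuse to run if the output partition is nearly full. OUTDIR and the
# 10 MB threshold are illustrative, not site-mandated values.
OUTDIR=${OUTDIR:-.}
free_kb=$(df -Pk "$OUTDIR" | awk 'NR==2 {print $4}')
if [ "$free_kb" -lt 10240 ]; then
    echo "refusing to write: $OUTDIR is nearly full" >&2
    exit 1
fi
```

Checking free space up front fails the job early and cheaply, instead of filling the partition mid-run and corrupting output for every job sharing it.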
- Cleanup. Be sure to perform a cleanup on the local scratch space after your jobs have completed!
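One way to make the cleanup reliable is an `EXIT` trap, so scratch is removed even when the job script fails partway through. The paths are illustrative; `mktemp` stands in for node-local scratch.

```shell
#!/bin/bash
# Cleanup sketch: register the scratch removal up front with a trap.
SCRATCH=$(mktemp -d)            # stand-in for node-local scratch space
trap 'rm -rf "$SCRATCH"' EXIT   # runs on normal exit and on script errors;
                                # also trap TERM/INT if jobs may be killed

echo "intermediate data" > "$SCRATCH/work.tmp"
# ... job body works in "$SCRATCH" ...
```

Registering the trap immediately after creating the scratch directory means no code path, however it exits, can leave the directory behind (short of an uncatchable kill).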