...

Note

By default, all users have access to the shared Account on the shared Partition with the scavenger QoS when they first use Slurm.

If you belong to a group that has contributed hardware into the SDF, you will be eligible to use different Accounts and Partitions:

  • We are testing delegated administration, which lets your group/team Slurm Administrator add users to their Accounts. If you wish to represent your group/team in this role, please contact us!
  • We will need to know which Slurm Account to 'bill' you against (don't worry, there is no $ charge for usage; it's purely for accounting and reporting). This Account will most likely be the immediate group/team that you work with. Please send your unix username and your group/team name to unix-admin@slac.stanford.edu.
Note

We do NOT, and WILL NOT, support AFS tokens with Slurm. Your jobs will fail if they try to write anywhere under /afs (including your current home ~ directory). We shall be deploying new storage in the near future, with dedicated home and data directories. In the meantime, we recommend using GPFS space if your group currently has any.

 

Why should I use Batch?

Whilst your desktop or laptop computer has a fast processor and quick local access to data stored on its hard disk/SSD, you may want to run very large compute tasks that require a lot of CPUs, GPUs, memory, or data. The compute servers that are part of the Batch system allow you to do this. Our servers typically also have very fast access to centralised storage, have (some) common software preinstalled, and enable you to run these long tasks without impacting your local desktop/laptop resources.

...

How can I get an Interactive Terminal?

 

Warning

We are experiencing problems with interactive jobs on the shared partition: srun will claim that it's waiting for resources, but in fact the allocation fails immediately. We have yet to experience the same issue with other partitions. Your batch jobs should continue to function correctly, however.


Use the srun command:

Code Block
module load slurm
srun -A shared -p shared -n 1 --pty /bin/bash

...

Note that when you 'exit' the interactive session, it relinquishes the resources for someone else to use. This also means that if your terminal is disconnected (you turn your laptop off, lose network, etc.), the Job will also terminate (similar to ssh).
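As a rough sketch of how the srun request above can be varied, the helper below only prints the command for a given Account, Partition and time limit, so that it can be inspected before it is run. The function name interactive_cmd and its defaults are hypothetical; --time is a standard srun option, and a bare number is read as minutes.

```shell
# Hypothetical helper: print (not run) an srun command for an interactive shell.
# Arguments: account, partition, time limit in minutes (all optional).
interactive_cmd() {
  account="${1:-shared}"
  partition="${2:-shared}"
  minutes="${3:-30}"
  echo "srun -A ${account} -p ${partition} -n 1 --time=${minutes} --pty /bin/bash"
}

# Print the default request (shared Account and Partition, 30 minutes).
interactive_cmd
```

To start the session, run the printed command after module load slurm. Setting a time limit means a forgotten session gives its resources back once the limit is reached.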

 

Warning

If your interactive request doesn't immediately find resources, it will currently not return you a pty, even though the job does run. This results in what looks like a hanging process. We are investigating... salloc first?

 

How do I submit a Batch Job?

Warning

We do NOT support AFS as part of the Slurm deployment. We shall be migrating home directories and group directories over to our new storage appliances as part of the SDF deployment. If you wish to access your AFS files, please copy them over to the new storage. *elaborate.

 

Use the sbatch command (this primer needs to be elaborated):

...

Code Block
#!/bin/bash

#SBATCH --account=shared
#SBATCH --partition=shared
#SBATCH --qos=scavenger
#
#SBATCH --job-name=test
#SBATCH --output=output-%j.txt
#SBATCH --error=output-%j.txt
#
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --mem-per-cpu=1g
#
#SBATCH --time=10:00
#
#SBATCH --gpus=geforce_gtx_1080_ti:1

<commands here>

In the above example, we submit a job named 'test' and write both stdout and stderr to the same file (%j will be replaced with the Job ID). We request a single Task (think of it as an MPI rank), and that single task requests 12 CPUs, each of which is allocated 1 GB of RAM, for a total of 12 GB. By default, --ntasks is equal to the number of nodes (servers) asked for. To aid scheduling (and potentially prioritising the Job), we limit the length of the Job to 10 minutes.
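The memory arithmetic above can be checked with a one-line helper. This is only an illustration: the function name total_mem_gb is hypothetical, and it assumes --mem-per-cpu is given in whole gigabytes.

```shell
# Hypothetical helper: total memory (GB) = ntasks * cpus-per-task * mem-per-cpu.
total_mem_gb() {
  echo $(( $1 * $2 * $3 ))
}

# Values from the example script: 1 task, 12 CPUs per task, 1 GB per CPU.
total_mem_gb 1 12 1   # prints 12
```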

We also request a single GeForce GTX 1080 Ti GPU with the Job. This will be exposed via CUDA_VISIBLE_DEVICES. To request specific GPUs, see below.
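A minimal sketch of the full round trip, assuming the script is saved as test-job.sh (a hypothetical file name): it writes a cut-down version of the script, asks for one GPU of any type with --gpus=1, and submits it only if sbatch is on the PATH. The job body just reports the node name and the value of CUDA_VISIBLE_DEVICES.

```shell
# Save a minimal batch script under a hypothetical name, test-job.sh.
cat > test-job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=shared
#SBATCH --partition=shared
#SBATCH --qos=scavenger
#SBATCH --job-name=test
#SBATCH --output=output-%j.txt
#SBATCH --ntasks=1
#SBATCH --time=10:00
#SBATCH --gpus=1

# Report where the job ran and which GPUs it was given.
echo "node: $(uname -n)"
echo "visible GPUs: ${CUDA_VISIBLE_DEVICES:-none}"
EOF

# Submit and list your jobs only where Slurm is available.
if command -v sbatch >/dev/null 2>&1; then
  sbatch test-job.sh
  squeue -u "${USER:-$(id -un)}"
else
  echo "sbatch not found; run 'module load slurm' first"
fi
```

Once the job has finished, output-<jobid>.txt should contain the two lines printed by the script.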

You will need an account (see below). All SLAC users have access to the "shared" partition with a quality of service of "scavenger". This is so that stakeholders of machines in the SDF get priority access to their resources, whilst any user can use all resources as long as the 'owners' of the hardware aren't using them. As such, owners (or stakeholders) have qos "normal" access to their partitions (whose hosts are also within the shared partition).

...

Account Name | Description             | Contact
shared       | Everyone                | Yee
cryoem       | CryoEM Group            | Yee
neutrino     | Neutrino Group          | Kazu
cryoem-daq   | CryoEM data acquisition | Yee
ml           | Machine Learning Initiative | Daniel
suncat       | SUNCAT Group            | Johannes
hps          | HPS Group               | Omar
atlas        | ATLAS Group             | Yee/Wei
LCLS         | LCLS Group              | Wilko

 

What Partitions are there?

...

Partition Name | Purpose | Contact
shared         | General resources; this contains all shareable resources, including GPUs | Yee
ml             | Machine Learning Initiative GPU servers | Daniel / Yee
cryoem         | CryoEM GPU servers | Yee
neutrino       | Neutrino GPU servers | Kazu
suncat         | SUNCAT AMD Rome Servers | Johannes
hps            | HPS AMD Rome Servers | Omar
fermi          | Fermi (LAT) AMD Rome Servers | Richard
atlas          | ATLAS GPU Servers | Yee / Wei
lcls           | LCLS AMD Rome Servers | Wilko

...

This is often due to limited resources. The simplest fix is to request fewer CPUs (--ntasks or --cpus-per-task), fewer nodes (-N), or less memory for your Job. However, this will likely increase the time that the Job needs to complete. Note that perfect scaling is often very difficult to achieve (i.e. 16 CPUs will rarely run twice as fast as 8 CPUs), so it may be beneficial to submit many smaller Jobs where possible. You can also set the --time option to declare that your job will run for at most that amount of time, so that the scheduler can better fit your job in.
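One way to submit many smaller Jobs is a Slurm job array. The sketch below is an assumption about how the work could be split, not a site recommendation: the file name array-job.sh, the eight-way split and the "chunk" naming are all hypothetical. Each array task requests a single CPU with a short time limit, and uses SLURM_ARRAY_TASK_ID to pick its share of the work.

```shell
# Write a hypothetical job-array script: 8 small single-CPU jobs instead of one big one.
cat > array-job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=shared
#SBATCH --partition=shared
#SBATCH --qos=scavenger
#SBATCH --array=0-7
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1g
#SBATCH --time=10:00
#SBATCH --output=output-%A_%a.txt

# Slurm sets SLURM_ARRAY_TASK_ID to this task's index (0..7 here).
chunk="${SLURM_ARRAY_TASK_ID:-0}"
echo "processing chunk ${chunk}"   # replace with the real work for this chunk
EOF

# Submit only where Slurm is available.
if command -v sbatch >/dev/null 2>&1; then
  sbatch array-job.sh
else
  echo "sbatch not found; run 'module load slurm' first"
fi
```

In the output file name, %A is replaced with the array's Job ID and %a with the task index, so each task writes to its own file.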

The more expensive option is to buy more hardware for the SDF and have it added to your group/team's Partition.

You can also make use of the scavenger QoS so that your job may run on any available resources at SLAC. This, however, has the disadvantage that should the owners of the hardware that your job runs on require its resources, your job may be terminated (preempted), possibly before it has completed.

 

What is QoS?

A Quality of Service (QoS) for a job defines restrictions on how the job is run. In relation to an Allocation, a job may preempt jobs with a 'lower' QoS, or be preempted by jobs with a 'higher' QoS. We define 2 levels of QoS:

scavenger: Everyone has access to all resources; however, the job is run with the lowest priority and will be terminated if another job with a higher priority needs the resources.

normal: Standard QoS for owners of hardware; jobs will (attempt to) run until completion and will not be preempted. normal jobs will therefore preempt scavenger jobs.

Scavenger QoS is useful if you have jobs that can be resumed (checkpointed) and if there are resources available (i.e. owners are not using all of their resources).
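A minimal sketch of a resumable scavenger job, under stated assumptions: the file names checkpoint-job.sh and checkpoint.txt and the five-step loop are hypothetical stand-ins for real work. The script records the last completed step in a file and continues from there when it is run again. The --requeue option only marks the job as eligible for requeueing; whether a preempted job is actually requeued or simply terminated depends on how the site has configured preemption, so you may need to resubmit it yourself.

```shell
# Write a hypothetical scavenger job that resumes from its last completed step.
cat > checkpoint-job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=shared
#SBATCH --partition=shared
#SBATCH --qos=scavenger
#SBATCH --requeue

CHECKPOINT="${CHECKPOINT:-checkpoint.txt}"   # hypothetical state file
TOTAL_STEPS="${TOTAL_STEPS:-5}"

# Resume from the last completed step, or start from 0.
step=0
if [ -f "$CHECKPOINT" ]; then
  step=$(cat "$CHECKPOINT")
fi

while [ "$step" -lt "$TOTAL_STEPS" ]; do
  step=$((step + 1))
  echo "running step $step"      # replace with the real work for this step
  echo "$step" > "$CHECKPOINT"   # record progress after each step
done
echo "done"
EOF
```

Submit it with sbatch checkpoint-job.sh. If the job is preempted after step 3, the next run starts at step 4 rather than repeating the earlier steps.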

You may submit to multiple Partitions with the same QoS level:

Code Block
#!/bin/bash
#SBATCH --account=cryoem
#SBATCH --partition=cryoem,shared
#SBATCH --qos=scavenger

In the above example, a cryoem user is charging against their Account cryoem and is willing to run the job wherever resources are available (the use of the cryoem Partition is somewhat moot, as the cryoem nodes are a subset of the shared Partition anyway).

Is it possible to define multiple, i.e. cryoem with normal + shared with scavenger?

How can I restrict/constrain which servers to run my Job on?

...