This page collects Slurm tips in no particular order. If much more information is added, it should be organized into sections.

SLURM auto-completion tool

For anyone who uses Slurm often, the following utility is really helpful: https://github.com/SchedMD/slurm/tree/master/contribs/slurm_completion_help

See what jobs are in the queue

squeue
squeue -u <username>
squeue --reservation <reservation_name>
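
When the queue is long, a per-state count can be easier to read than the full listing. The following is a minimal sketch, not an official Slurm tool: the helper name count_job_states is hypothetical, and it assumes its input is one job state per line, which is what squeue prints with -h (no header) and -o "%T" (state only).

```shell
#!/usr/bin/env bash
# Hypothetical helper: count how many jobs are in each state.
# Reads one state name per line on stdin and prints "STATE COUNT" lines.
count_job_states() {
    sort | uniq -c | awk '{ print $2, $1 }'
}

# Intended use on a cluster (not executed here):
#   squeue -h -u <username> -o "%T" | count_job_states
```

The function only reads stdin, so it can be tried on a saved listing before pointing it at a live squeue call.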

Detailed information about a specific running or recent job

scontrol show jobid -dd <jobID>

Get information about current reservations

scontrol show res

User and experiment accounts' associations

sacctmgr show associations users=espov format=cluster,account%25,partition # list the accounts that the user belongs to; %25 widens the column so that the full account name is displayed
sacctmgr list associations -p account=lcls:xpp1234 format=user,account%25,partition # list the users associated with the account lcls:xpp1234

The "format" argument can be modified to show more or fewer fields. Remove it to see all fields (the output can be messy).
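
The -p option makes sacctmgr print pipe-delimited output, which is convenient for scripting. As a rough sketch, the hypothetical helper below (parsable_column is not part of Slurm) picks one column by its header name. It assumes the first input line is the header row and that the header names match the requested fields (for example "User"), so check the real header on your system first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one named column from pipe-delimited text
# whose first line is a header row (as produced with the -p option).
parsable_column() {
    awk -F'|' -v col="$1" '
        # Header row: remember the index of the requested column.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
        # Data rows: print that column, if it was found in the header.
        idx     { print $idx }
    '
}

# Intended use on a cluster (not executed here):
#   sacctmgr list associations -p account=lcls:xpp1234 format=user,account,partition | parsable_column User
```

If the requested name is not in the header, the function prints nothing rather than guessing a column.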

Partition and node information

sinfo is used to view partition and node information for a system running Slurm. 

Examples

sinfo -o "%C" -n sdfmilan[021-022,040,202-204,210-213,226,232]  
CPUS(A/I/O/T)
991/545/0/1536


%C shows CPU counts as "allocated/idle/other/total", so in this example 991 of the 1536 cores are in use. With -o "%n %C" one gets the usage per node:

sinfo -o "%n %C"  -n sdfmilan[021-022,040,202-204,210-213,226,232]
HOSTNAMES CPUS(A/I/O/T)
sdfmilan021 120/8/0/128
sdfmilan022 45/83/0/128
sdfmilan040 8/120/0/128
sdfmilan202 116/12/0/128
sdfmilan203 120/8/0/128
sdfmilan204 120/8/0/128
sdfmilan210 120/8/0/128
sdfmilan211 113/15/0/128
sdfmilan212 105/23/0/128
sdfmilan213 104/24/0/128
sdfmilan226 9/119/0/128
sdfmilan232 7/121/0/128
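
The per-node counts above can be turned into a percentage of allocated cores. This is only a sketch under stated assumptions: the helper name cpu_usage_percent is hypothetical, and it expects lines in the "hostname A/I/O/T" layout produced by -o "%n %C", with an optional HOSTNAMES header line.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "hostname A/I/O/T" lines on stdin and print
# the percentage of allocated CPUs for each node.
cpu_usage_percent() {
    awk '
        # Skip the header line if it is present.
        NR == 1 && $1 == "HOSTNAMES" { next }
        {
            # c[1] = allocated, c[2] = idle, c[3] = other, c[4] = total
            split($2, c, "/")
            if (c[4] > 0) printf "%s %.1f%%\n", $1, 100 * c[1] / c[4]
        }
    '
}

# Intended use on a cluster (not executed here):
#   sinfo -o "%n %C" -n sdfmilan[021-022] | cpu_usage_percent
```

Nodes reporting a total of zero CPUs are skipped to avoid a division by zero.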

Priorities

Show priorities for an account: sacctmgr list associations -p accounts=<accounts>

Show priority level for a job: sprio -j <jobID>

Show priority coefficients: sacctmgr show qos format=name,priority,usagefactor
