For anyone who uses Slurm often, the following utilities are really helpful: https://github.com/SchedMD/slurm/tree/master/contribs/slurm_completion_help
squeue
squeue -u <username>
squeue --reservation <reservation_name>
scontrol show jobid -dd <jobID>
(scontrol does not show information about jobs that completed more than a few minutes ago)
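Because scontrol only knows about recent jobs, a small wrapper can fall back to sacct for older ones. The following is a minimal sketch rather than a site-provided tool: the function name job_details is hypothetical, and it assumes that scontrol exits with a non-zero status once it no longer knows the job.

```shell
# Hypothetical helper: show job details via scontrol, falling back to sacct.
# Assumption: scontrol returns a non-zero exit status for jobs that have
# already been purged from the controller's records.
job_details() {
    jobid=$1
    if ! scontrol show jobid -dd "$jobid" 2>/dev/null; then
        # Accounting records persist after the job leaves the controller.
        sacct -j "$jobid" --format=jobid,jobname,state,exitcode,elapsed
    fi
}

# Usage: job_details <jobID>
```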
For detailed reporting and statistics on jobs, use sacct. For example, to get all jobs for user <USER> after a given start time (e.g. 2024-06-15):
export FMT="reservation,jobid,jobname,User,reqcpus,ntasks,reqmem,averss,maxrss,elapsed,state%20,exitcode,Submit,Start,End,Account%17,Partition,AveCpu,NodeList%30"
sacct --format="${FMT}" --units=M -u <USER> --starttime 2024-06-15
Or, for specific job(s) and/or account(s), using additional format options:
sacct -a -j 50612401 -A lcls:xcsl1018322 -o JobID,JobName,Partition,Account%18,AllocCPUS,Nodelist%24,NNodes,start,elapsed,workdir%60,submitline%160
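When many jobs are returned, it can help to summarize them instead of reading the full table. The following is a minimal sketch, assuming sacct is asked for "|"-delimited output with only the jobid and state fields; the function name count_states is hypothetical.

```shell
# Hypothetical helper: count jobs per state from "jobid|state" lines on stdin.
count_states() {
    awk -F'|' '
        { sub(/ by .*/, "", $2); n[$2]++ }   # "CANCELLED by <uid>" -> "CANCELLED"
        END { for (s in n) print s, n[s] }
    ' | sort
}

# Usage (-X: allocations only, -n: no header, -P: "|"-delimited output):
#   sacct -X -n -P -u <USER> --starttime 2024-06-15 --format=jobid,state | count_states
```

Restricting the output to allocations with -X avoids counting each job step as a separate job.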
scontrol show res
sacctmgr show associations users=espov format=cluster,account%25,partition # list the accounts that the user belongs to. %25 makes the column wider so that the full account name is displayed.
sacctmgr list associations -p account=lcls:xpp1234 format=user,account%25,partition # list accounts associated with xpp1234
The "format" argument can be modified to see more details. Remove it to see all (can be messy).
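For use in scripts, the same query can be reduced to a plain list of account names. This is a minimal sketch, assuming the -n (no header) and -P ("|"-delimited, unpadded) options of sacctmgr; the function name user_accounts is hypothetical.

```shell
# Hypothetical helper: print the distinct accounts of a user, one per line.
# Defaults to the current user when no argument is given.
user_accounts() {
    sacctmgr -n -P show associations users="${1:-$USER}" format=account | sort -u
}

# Usage: user_accounts espov
```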
sinfo is used to view partition and node information for a system running Slurm.
Examples
(%C shows "allocated/idle/other/total".) So 991 cores are still in use. With -o "%n %C" one gets the usage per node.
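The %C value can also be split into its four fields in a script. The following is a minimal sketch, assuming sinfo is called with -h (no header) and prints a single "allocated/idle/other/total" value, or one "node value" pair per line for the per-node form; the function names cpu_state and idle_per_node are hypothetical.

```shell
# Hypothetical helper: label the four fields of a sinfo "%C" value.
cpu_state() {
    old_ifs=$IFS
    IFS=/
    set -- $1            # split "A/I/O/T" on "/"
    IFS=$old_ifs
    printf 'allocated=%s idle=%s other=%s total=%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: print "node idle_cores" from "node A/I/O/T" lines on stdin.
idle_per_node() {
    awk '{ split($2, c, "/"); print $1, c[2] }'
}

# Usage (-h suppresses the header line):
#   cpu_state "$(sinfo -h -o '%C')"
#   sinfo -h -o '%n %C' | idle_per_node
```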
Show priorities for an account: sacctmgr list associations -p accounts=<accounts>
Show priority level for a job: sprio -j <jobID>
Show priority coefficients: sacctmgr show qos format=name,priority,usagefactor