In 2021, LCLS switched to the SLURM batch system.

Information on submitting jobs to the SLURM system at LCLS can be found on this page: Submitting SLURM Batch Jobs
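As a minimal illustration of what a SLURM submission script looks like (the partition, job name, script contents, and resource numbers below are placeholders, not LCLS conventions; see the linked page for the actual details):

```shell
#!/bin/bash
# Minimal SLURM batch script sketch. All names and numbers are illustrative.
#SBATCH --partition=psanaq      # queue/partition to submit to
#SBATCH --job-name=my-analysis  # hypothetical job name
#SBATCH --ntasks=12             # number of tasks (cores) requested
#SBATCH --time=00:30:00         # wall-clock limit (HH:MM:SS)
#SBATCH --output=%j.out         # write stdout to <jobid>.out

# Replace with your actual analysis command:
echo "analysis would run here"
```

Save this as, e.g., `myjob.sh`, submit with `sbatch myjob.sh`, and monitor with `squeue -u $USER`.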

Information on the Automatic Run Processing system (ARP) can be found on this page: Automatic Run Processing (ARP).  This is also usable at sites like NERSC and SDF.

A "cheat sheet" showing similar commands on LSF and SLURM can be found here: https://slurm.schedmd.com/rosetta.pdf

Refer to the table below for the batch resources available in psana.


Batch Nodes/Queues

Depending on your data access, you may need to submit jobs to a specific farm. This is accomplished by submitting to the appropriate LSF batch queue; refer to the table below. Jobs for the current experiment should be submitted to the high-priority queues psnehhiprioq and psfehhiprioq, which run against the Fast Feedback storage layer (FFB) located at /reg/d/ffb/<hutch>/<experiment>. Jobs for the off-shift experiment should be submitted to psnehprioq and psfehprioq. Only psneh(hi)prioq/psfeh(hi)prioq should access the FFB. When in doubt, use psanaq.
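For example, with the LSF-era syntax described above (the script name is a placeholder; `-q` selects the queue and `-o` names the log file, with `%J` expanding to the job ID):

```shell
# Submit the current NEH experiment's job to the high-priority FFB queue.
bsub -q psnehhiprioq -o %J.log ./myjob.sh

# Off-shift NEH jobs go to the lower-priority queue instead:
bsub -q psnehprioq -o %J.log ./myjob.sh
```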

Submit your job from an interactive node (where you land after doing ssh psana). LSF will run the submitted job on the specified queue using nodes listed in the table below. All nodes in the queues listed below run RHEL7. By submitting from an interactive node (also running RHEL7), you will ensure that your job inherits a RHEL7 environment.

Note 1: Jobs for the current experiment can be submitted to fast feedback (FFB) queues, which allocate resources for the most recent experiments. The FFB queues in the tables below are for LCLS-II experiments (TMO, RIX and UED). The FEH experiments (LCLS-I, including XPP) can submit FFB jobs to the new Fast Feedback System.

Warning

As of February 2023, the offline compute resources have been consolidated into the psanaq. The priority queues have been removed.
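Under this consolidated setup, a SLURM submission would target the psanaq partition directly. A sketch (script name and core count are placeholders):

```shell
# Submit to the consolidated psanaq partition.
sbatch -p psanaq --ntasks=12 myjob.sh

# Check the job's status on that partition:
squeue -u $USER -p psanaq
```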

Location | Queue | Nodes | Data | Comments | Throughput (Gbit/s) | Cores (total) | Cores/node | RAM (GB/node) | Default time limit
Building 50 | psanaq | psana11xx, psana12xx, psana13xx, psana14xx | ALL (no FFB) | Primary psana queue | 40 | 960 | 12 | 24 | 48 hrs
Building 50 | psdebugq | same as psanaq | same as psanaq | SHORT DEBUGGING ONLY (preempts psanaq jobs) | 40 | 24 | 12 | 24 | 10 min
Building 50 | psanaidleq | psana11xx, psana12xx, psana13xx, psana14xx | | Jobs preemptable by psanaq | 40 | 960 | 12 | 24 | 48 hrs
NEH | psnehhiprioq | psana15xx | FFB for AMO, SXR, XPP | Current NEH experiment on FFB ONLY | 40 | 288 | 16 | 128 | 24 hrs
NEH | psnehprioq | psana15xx | FFB for AMO, SXR, XPP | Off-shift NEH experiment on FFB ONLY | 40 | 288 | 16 | 128 | 24 hrs
NEH | psnehq | psana15xx | | Jobs preemptable by psneh(hi)prioq | 10 | 288 | 16 | 128 | 48 hrs
FEH | psfehhiprioq | psana16xx | FFB for XCS, CXI, MEC | Current FEH experiment on FFB ONLY | 40 | 288 | 16 | 128 | 24 hrs
FEH | psfehprioq | psana16xx | FFB for XCS, CXI, MEC | Off-shift FEH experiment on FFB ONLY | 40 | 288 | 16 | 128 | 24 hrs
FEH | psfehq | psana16xx | | Jobs preemptable by psfeh(hi)prioq | 10 | 288 | 16 | 128 | 48 hrs

Submitting Jobs


Information on submitting jobs to the (deprecated) LSF batch system can be found on this page: Submitting LSF Batch Jobs


...

Queue name | Node names on SLURM queue | Number of nodes | Comments | Throughput (Gbit/s) | Cores/node | RAM (GB/node) | Default time limit
... | psana16xx | 34 | Primary psana queue | 40 | 16 | 128 | 48 hrs
psanagpuq | psanagpu113-psanagpu118 | 6 | GPU nodes | 10 | 16 | 128 | 48 hrs
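A hedged sketch of requesting a GPU node through the psanagpuq queue (the `--gres=gpu:1` form assumes a generic SLURM GPU configuration and the script name is a placeholder; check the local documentation for the actual resource names):

```shell
# Request one GPU on the psanagpuq partition ("gpu" GRES name is an assumption).
sbatch -p psanagpuq --gres=gpu:1 --time=01:00:00 mygpu_job.sh
```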

...