Getting an Account

You will need a valid SLAC UNIX account in order to use the LCLS computing system. The instructions for getting a SLAC UNIX account are here:

http://www-ssrl.slac.stanford.edu/lcls/users/logistics.html#compaccts

Your UNIX account must be enabled in the LCLS system in order to have access to data and elog. This happens automatically if your account is created with XU as its primary group. If your primary UNIX group is not XU, request that your account be enabled in the LCLS system by sending an email to:

pcds-help@slac.stanford.edu

If you have forgotten your password or your account has been disabled, send an email to:

account-services@slac.stanford.edu

Getting Access to the System

You can get into the LCLS photon computing system by ssh'ing to:

psexport.slac.stanford.edu

From these nodes you can move data files in and out of the system and you can connect to the bastion hosts:

pslogin

Note that, from within SLAC, you can directly connect to the bastion hosts without going through psexport.

The SLAC wireless visitor network is not considered part of the SLAC network, so you'll need to go through psexport when using your laptop on-site.

From the bastion hosts you can then reach the analysis nodes (see below).
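
The hop sequence described above can be sketched as a small shell helper. This is only an illustration: the function name lcls_login_cmd, the onsite/offsite switch and the account name jdoe are made up for this example. The -t option is the standard OpenSSH flag that allocates a terminal for the nested session.

```shell
#!/bin/sh
# Print the ssh command for reaching a bastion host (pslogin).
# Hypothetical helper: the name and the onsite/offsite switch are
# illustrative and not part of the LCLS tooling.
lcls_login_cmd() {
    location="$1"   # "onsite" = SLAC network, anything else = outside
    user="$2"       # your SLAC UNIX account name
    if [ "$location" = "onsite" ]; then
        # From within SLAC the bastion hosts are directly reachable.
        echo "ssh ${user}@pslogin"
    else
        # From outside (including the visitor wireless network),
        # go through psexport first, then hop to pslogin.
        echo "ssh -t ${user}@psexport.slac.stanford.edu ssh pslogin"
    fi
}

# jdoe is a placeholder account name.
lcls_login_cmd offsite jdoe
```

The helper only prints the command, so you can inspect it before running it yourself.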

Each control room has a number of nodes for local login. These nodes have access to the Internet and are named psusr<id>.

The controls and DAQ nodes used for operating an instrument work in kiosk mode so you don't need a personal account to run an experiment from the control room. Remote access to these nodes is not allowed for normal users.

Running the Analysis

The analysis framework is documented in the Data Analysis page. This section describes the nodes which are available for running the analysis.

Interactive Pools

In order to get access to the interactive nodes, connect to the addresses psananeh or psanafeh. A load-balancing mechanism will connect you to the least loaded of the nodes in the pool:

ssh psananeh
ssh psanafeh

Each pool is currently made of six servers with the following general specifications:

  • 8 cores, Opteron 2384, 8GB memory, diskless, 10Gb/s network

Each node in the interactive pools has a single-user Matlab license.

Batch Farm

There are batch farms located in the NEH and FEH. Depending on your data access you may need to submit jobs to a specific farm. This can be accomplished by submitting to the appropriate LSF batch queue. Refer to the table below.

Please note that the batch queue lclsq has been renamed psnehq (the old name will remain active for a couple of months).

Multi-core OpenMPI jobs should be run in either the psnehmpiq or the psfehmpiq batch queue; see the section "Submitting OpenMPI Batch Jobs" below.

Experimental Hall  Queue      Nodes                Data          Comments
NEH                psnehq     psana11xx,psana12xx  ana01, ana02  Jobs <= 6 cores
NEH                psnehmpiq  psana11xx,psana12xx  ana01, ana02  OpenMPI jobs > 6 cores, preemptable
FEH                psfehq     psana13xx,psana14xx  ana11, ana12  Jobs <= 6 cores
FEH                psfehmpiq  psana13xx,psana14xx  ana11, ana12  OpenMPI jobs > 6 cores, preemptable
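
The queue choice in the table can be written down as a short shell function. This is a minimal sketch: pick_queue is a hypothetical name, and it assumes that any job asking for more than 6 cores is an OpenMPI job.

```shell
#!/bin/sh
# Pick an LSF queue name from the experimental hall and the core count.
# Hypothetical helper; assumes jobs with more than 6 cores are OpenMPI jobs.
pick_queue() {
    hall="$1"    # NEH or FEH
    cores="$2"   # number of cores the job will request
    case "$hall" in
        NEH) prefix="psneh" ;;
        FEH) prefix="psfeh" ;;
        *)   echo "unknown hall: $hall" >&2; return 1 ;;
    esac
    if [ "$cores" -le 6 ]; then
        echo "${prefix}q"       # regular queue
    else
        echo "${prefix}mpiq"    # OpenMPI queue (preemptable)
    fi
}

pick_queue NEH 4     # prints psnehq
pick_queue FEH 32    # prints psfehmpiq
```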

You can find more LCLS-specific information about LSF in this PDF file. For a more detailed description and more LSF commands, please see:

http://www.slac.stanford.edu/comp/unix/unix-hpc.html

The batch farm is made of eighty servers with the following general specifications:

  • 12 cores (24 with Hyperthreading), Xeon X5675, 24GB memory, 500GB disk, QDR IB

Submitting Batch Jobs

Log in first to pslogin (from SLAC) or to psexport (from anywhere). From there you can submit a job with the following command:

bsub -q psnehq -o <output file name> <job_script_command>

For example:

bsub -q psnehq -o ~/output/job.out my_program

This will submit a job (my_program) to the queue psnehq and write its output to a file named ~/output/job.out.

You may check on the status of your jobs using the bjobs command.
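
To check on one specific job rather than all of them, you can capture the job ID that bsub reports and pass it to bjobs. The sketch below assumes the usual LSF success message of the form "Job <1234> is submitted to queue <psnehq>."; parse_job_id is a hypothetical helper name, and my_program and the output path are placeholders.

```shell
#!/bin/sh
# Extract the numeric job ID from the line bsub prints on success,
# e.g. "Job <1234> is submitted to queue <psnehq>."
parse_job_id() {
    sed -n 's/^Job <\([0-9][0-9]*\)> is submitted.*/\1/p'
}

if command -v bsub >/dev/null 2>&1; then
    # On a node with LSF: submit, remember the ID, then query that job only.
    jobid=$(bsub -q psnehq -o "$HOME/output/%J.out" my_program | parse_job_id)
    bjobs "$jobid"
else
    # Without LSF: feed a sample line through the parser as a dry run.
    echo "Job <1234> is submitted to queue <psnehq>." | parse_job_id
fi
```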

Submitting OpenMPI Batch Jobs

The RedHat supplied OpenMPI packages are installed on pslogin, psexport and all of the psana batch servers.

The system default has been set to the current version as supplied by RedHat.

$ mpi-selector --query
default:openmpi-1.4-gcc-x86_64
level:system

Your environment should be set up to use this version (unless you have used RedHat's mpi-selector script, or your login scripts, to override the default). You can check to see if your PATH is correct by issuing the command which mpirun. Currently, this should return /usr/lib64/openmpi/1.4-gcc/bin/mpirun. Future updates to the MPI version may change the exact details of this path.

In addition, your LD_LIBRARY_PATH should include /usr/lib64/openmpi/1.4-gcc/lib (or something similar).

For notes on compiling examples, please see:
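
Both checks can be combined in a few lines of shell. This is a sketch only: path_contains is a hypothetical helper, and the library directory is the one quoted above, which may differ after an MPI update.

```shell
#!/bin/sh
# Return success if a colon-separated path list contains the given directory.
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Directory quoted in the text; adjust it if the MPI version has changed.
mpi_lib="/usr/lib64/openmpi/1.4-gcc/lib"

# Report which mpirun (if any) is first in PATH.
echo "mpirun: $(command -v mpirun || echo 'not found')"

if path_contains "${LD_LIBRARY_PATH:-}" "$mpi_lib"; then
    echo "LD_LIBRARY_PATH includes $mpi_lib"
else
    echo "LD_LIBRARY_PATH does not include $mpi_lib"
fi
```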

http://www.slac.stanford.edu/comp/unix/farm/mpi.html 

The following are examples of how to submit OpenMPI jobs to the PCDS psnehmpiq batch queue:

bsub -q psnehmpiq -a mympi -n 32 -o ~/output/%J.out ~/bin/hello

This submits an OpenMPI job (-a mympi) requesting 32 processors (-n 32) to the psnehmpiq batch queue (-q psnehmpiq).

bsub -q psfehmpiq -a mympi -n 16 -R "span[ptile=1]" -o ~/output/%J.out ~/bin/hello

This submits an OpenMPI job (-a mympi) requesting 16 processors (-n 16), spread as one processor per host (-R "span[ptile=1]"), to the psfehmpiq batch queue (-q psfehmpiq).

bsub -q psfehmpiq -a mympi -n 12 -R "span[hosts=1]" -o ~/output/%J.out ~/bin/hello

This submits an OpenMPI job (-a mympi) requesting 12 processors (-n 12), all on one host (-R "span[hosts=1]"), to the psfehmpiq batch queue (-q psfehmpiq).

Data Storage

LCLS provides space for all your experiment's data at no cost to you. This includes the measurements as well as the data derived from your analysis. Your data are available as XTC files or, on demand, as HDF5 files.

Short-term Storage

All your data are available on disk for one year after data taking, under the path /reg/d/psdm. The data files are currently stored in a Lustre file system. Each experiment is allocated three directories: xtc, scratch and hdf5. The xtc directory contains the raw data from the DAQ system; the hdf5 directory holds data files in HDF5 format. The contents of the xtc and hdf5 directories are archived to tape. The scratch directory is not backed up.

Please write the output of your analysis to the scratch area and not to your NFS space. Keep your analysis code under your NFS home or under your NFS group space (if you have one). Your NFS space is backed up. Please try not to keep more than 10GB under your NFS home, or we may ask you to clean up.
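
One way to keep an eye on the 10GB guideline for your NFS home is a quick du check. The sketch below is an illustration, not an official tool: over_limit is a hypothetical helper, and it treats 10GB as 10 x 1024 x 1024 kilobytes.

```shell
#!/bin/sh
# Compare a usage figure in kilobytes against the 10GB guideline for NFS home.
limit_kb=$((10 * 1024 * 1024))   # 10GB expressed in 1KB blocks (assumption)

over_limit() {
    [ "$1" -gt "$limit_kb" ]
}

# du -sk prints "<kilobytes><TAB><path>"; keep only the number.
used_kb=$(du -sk "$HOME" 2>/dev/null | cut -f1)

if over_limit "${used_kb:-0}"; then
    echo "Home uses ${used_kb} KB: consider moving analysis output to scratch"
else
    echo "Home uses ${used_kb:-0} KB: within the 10GB guideline"
fi
```

On a large home directory the du call can take a while, since it walks every file.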

Long-term Storage

After one year, your data files are removed from disk. The XTC and HDF5 files remain stored on tape for up to 10 years. LCLS may restore your data from tape back to disk for you to access. Send an email to pcds-help@slac.stanford.edu to have your data restored to disk. Restoring the data to disk more than once will require the approval of the LCLS management.

Data Export

There is a web interface to the experimental data accessible via

https://pswww.slac.stanford.edu/apps/explorer

The web interface also allows you to generate file lists that can be fed to the tool you use to export the data from SLAC to your home institution. You can use psexport for copying your data.

The recommended tools for exporting the data offsite are bbcp and Globus Online. The former, bbcp, is slightly simpler to set up. On the other hand, Globus Online is more convenient when transferring large amounts of data because it manages the overall process, for example by automatically restarting a failed or stalled transfer. The transfer performance of the two tools is very similar.
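
As an illustration of feeding a file list to bbcp, the sketch below prints one bbcp command per listed file, pulling from psexport to a local directory. Several things here are assumptions: the list is taken to hold one absolute path per line, bbcp_cmds is a hypothetical helper, jdoe and the EXAMPLE paths are placeholders, and the stream count (-s 8) and progress interval (-P 10) are arbitrary starting values rather than recommended settings.

```shell
#!/bin/sh
# Print one bbcp command per file named in a list (one path per line).
# Hypothetical helper: run the printed commands at your home institution.
bbcp_cmds() {
    list="$1"   # file list, e.g. generated from the web interface
    dest="$2"   # local destination directory
    user="$3"   # your SLAC UNIX account name
    while IFS= read -r path; do
        [ -n "$path" ] || continue   # skip blank lines
        echo "bbcp -P 10 -s 8 ${user}@psexport.slac.stanford.edu:${path} ${dest}/"
    done < "$list"
}

# Demonstration with a temporary list of placeholder paths.
list=$(mktemp)
printf '%s\n' /reg/d/psdm/EXAMPLE/xtc/run1.xtc /reg/d/psdm/EXAMPLE/xtc/run2.xtc > "$list"
bbcp_cmds "$list" /data/incoming jdoe
rm -f "$list"
```

Printing the commands first lets you review them, or pipe them to sh, before any data moves.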

Printing

The following printers are available in the NEH building from all the UNIX nodes:

Info                      Location                        Device URI
Dell 3130                 AMO Control Room                lpd://dellcolor-neh-amo1/lp
Dell 3130                 AMO Control Room                lpd://dellcolor-neh-amo2/lp
Dell 3130                 SXR Control Room                lpd://dellcolor-neh-sxr1/lp
Dell 3130                 SXR Control Room                lpd://dellcolor-neh-sxr2/lp
Dell 3130                 XPP Control Room                lpd://dellcolor-neh-xpp1/lp
Dell 3130                 XPP Control Room                lpd://dellcolor-neh-xpp2/lp
HP Color LaserJet CP3525  Bldg 950 corridor ground floor  ipp://hpcolor-neh-corridor/ipp/
Xerox WorkCentre 5675     Bldg 950 Rm 218, Jason Alpers   ipp://hpcolor-neh-laser/ipp/
HP Color LaserJet 4700    Bldg 950 Rm 204, Ray Rodriguez  ipp://hpcolor-neh-ray/ipp/
HP LaserJet 4350          Bldg 950 Rm 203                 ipp://hpcolor-neh-srvroom/ipp/
