Users need to do the following:

Use bash as your default shell, not tcsh or csh (conda doesn't support csh).
- To switch to bash at LCLS (a separate computing environment from the rest of SLAC), email pcds-it-l at slac.stanford.edu.
Either run both of these commands, in order:
- source /reg/g/psdm/etc/ana_env.sh as usual (explained in the psana python setup example)
- source conda_setup
Or, if you don't need to use the old RPM based release system, run the single command:
- source /reg/g/psdm/bin/conda_setup

...

Using conda_setup from a Script

When writing a script that uses the conda_setup command to set up a conda environment, keep in mind that conda_setup processes command line arguments, and your script may take command line arguments of its own. A best practice is to do

...

Here is an example

In [1]: import anarelinfo
In [2]: anarelinfo.version
Out[2]: 'psana-conda-1.0.3'
In [3]: anarelinfo.pkgtags
Out[3]: 
{'AppUtils': 'V00-07-00',
 'CSPadPixCoords': 'V00-03-30',
...

GPU Work

LCLS has some GPU resources with software set up for use. See the table below.

node	CUDA	GPU card(s)	RAM	Compute Capability	notes
psanagpu101
psanagpu102	7.5	Tesla K40	12 GB	3.5	This is the only card we have with a modern enough compute capability for deep learning frameworks that rely on the nvidia cudnn (like tensorflow)
psanagpu103

We are still developing infrastructure and configuration for these nodes, but presently, if you run

ssh psanagpu102
source conda_setup --dev --gpu

then you will activate a python 2.7 conda environment for working with the GPU. It is mostly the same as the main environment with psana, but has these differences:

- It includes the nvidia cudnn for deep learning frameworks like tensorflow.
- It adds paths to PATH, LD_LIBRARY_PATH, and CPATH so that you can work with the CUDA installation and the nvidia cudnn.
- For packages like tensorflow that are compiled differently to work with the GPU, it includes the GPU version of that package rather than the CPU version.
  - Presently, tensorflow is the only package that is compiled differently for the GPU. All other packages in this environment are the same as in the standard psana environment.
  - Packages like theano can be dynamically configured to use the GPU, so the same package is used in both the gpu and non-gpu environments.
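
One way to confirm that the gpu environment has adjusted your search paths is to list the entries that mention CUDA or the cudnn. The following is only a minimal sketch and is not part of conda_setup. The helper name cuda_path_entries is hypothetical, and it assumes the relevant directories contain "cuda" or "cudnn" somewhere in their path, which may not hold for the actual installation.

```python
import os

def cuda_path_entries(env, variables=("PATH", "LD_LIBRARY_PATH", "CPATH")):
    """Return {variable: [entries mentioning cuda or cudnn]} for an environment mapping."""
    found = {}
    for name in variables:
        # Split the variable into its individual directories, dropping empty entries.
        entries = [e for e in env.get(name, "").split(os.pathsep) if e]
        # Assumption: CUDA/cudnn directories have "cuda" or "cudnn" in their path.
        found[name] = [e for e in entries
                       if "cuda" in e.lower() or "cudnn" in e.lower()]
    return found

if __name__ == "__main__":
    for name, entries in cuda_path_entries(os.environ).items():
        print(name, entries)
```

Running this before and after source conda_setup --dev --gpu should show which entries were added, provided the naming assumption above applies.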

Using the cuDNN

Before using the nvidia cudnn (by working with tensorflow or keras in the gpu environment, or by configuring theano to use it), register with the NVIDIA Accelerated Computing Developer Program at this link:

https://developer.nvidia.com/accelerated-computing-developer

Per the nvidia cuDNN license, we believe all users must register before using it, but don't worry, the nvidia emails (if you opt to receive them) are quite interesting!

Shared Resource

Presently, the GPUs are only available through interactive nodes. There is no batch system to assign GPU resources to users. Be mindful that other users on a node like psanagpu102 may be using the GPU.

The main issue is that GPU memory can become a scarce resource.

Make use of the command

nvidia-smi

to see what other processes are on the gpu and how much memory they are using. Use

top

to identify the names of other users and communicate with them, or us, to manage multi-use issues.
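
If you check GPU usage often, the two steps above can be scripted. The sketch below is an illustration rather than a supported tool: the function names are hypothetical, and it assumes an nvidia-smi version that supports the --query-compute-apps and --format=csv options (nvidia-smi --help-query-compute-apps lists the available fields). The parser is separate from the call to nvidia-smi so it can be tried on any text in the same format.

```python
import subprocess

# Ask nvidia-smi for one CSV line per compute process: pid, name, memory in MiB.
QUERY = ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv,noheader,nounits"]

def parse_compute_apps(text):
    """Parse 'pid, process_name, used_memory' CSV lines into a list of dicts."""
    apps = []
    for line in text.splitlines():
        if not line.strip():
            continue
        pid, rest = line.split(",", 1)
        name, mem = rest.rsplit(",", 1)
        try:
            mem_mib = int(mem.strip())
        except ValueError:
            mem_mib = None  # nvidia-smi may print a non-numeric placeholder
        apps.append({"pid": int(pid), "name": name.strip(), "used_mib": mem_mib})
    return apps

def gpu_processes():
    """Run nvidia-smi (must be on PATH) and return the parsed process list."""
    return parse_compute_apps(subprocess.check_output(QUERY).decode())

def owner_of(pid):
    """Look up the user name that owns a process, using ps."""
    out = subprocess.check_output(["ps", "-o", "user=", "-p", str(pid)])
    return out.decode().strip()
```

On a GPU node, something like [(p["pid"], owner_of(p["pid"]), p["used_mib"]) for p in gpu_processes()] would then give the owner and memory use of each process on the GPU.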

Limit GPU Card Use

If you are on a node with more than one GPU card, you can use CUDA environment variables to restrict any CUDA based program to see only some of the GPU cards. For example, if there are two cards, CUDA will number them 0 and 1. You could run

export CUDA_VISIBLE_DEVICES=1

and any command you then run from that shell will only see that one GPU. Likewise, to start just one process with a limited view, run

CUDA_VISIBLE_DEVICES=1 ipython

This starts an interactive ipython session in which tensorflow only sees device 1. Tensorflow will refer to the one device it sees as device 0.
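
The same restriction can be applied when launching a program from Python. This is a minimal sketch: the helper name run_with_visible_gpus is hypothetical, and the child command here only prints the variable it inherited in place of a real CUDA program.

```python
import os
import subprocess
import sys

def run_with_visible_gpus(cmd, devices):
    """Run cmd in a child process that only sees the given CUDA device indices."""
    env = dict(os.environ)  # copy, so the parent environment is left unchanged
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(d) for d in devices)
    return subprocess.check_output(cmd, env=env).decode().strip()

if __name__ == "__main__":
    # Stand-in for a CUDA program: the child reports the value it was given.
    child = [sys.executable, "-c",
             "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]
    print(run_with_visible_gpus(child, [1]))
```

Setting os.environ["CUDA_VISIBLE_DEVICES"] inside an already running process is only expected to take effect if it happens before CUDA is initialized, so it is safest to set it before importing a GPU framework.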

Tensorflow: Limit GPU Memory on a Card

With tensorflow, you can write your code to only grab the GPU memory that you need:

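import tensorflow as tf  # TensorFlow 1.x API (ConfigProto / Session)

with tf.device('/gpu:0'):  # this with statement may not be necessary
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # grow GPU memory use as needed
    config.allow_soft_placement = True      # fall back to the CPU when an op cannot run on the GPU
    with tf.Session(config=config) as sess:
        # now your program: all variables will default to going on the GPU, and
        # any that shouldn't go on a GPU will be put on the CPU.
        pass  # replace with your own code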

Configuration Subject to Change

At this point there are very few people using the GPU and the configuration of GPU support is subject to change. Presently the gpu conda environment is only built in the development rhel7 conda installation (thus the --dev switch for conda_setup above).
