It may be advantageous or necessary to analyze data without a connection to SLAC. The steps below install psana, along with the data to be analyzed, on a local RHEL 5, 6 or 7 machine for later offline use. In the simplest case a single conda environment is maintained, but multiple conda environments can be created and used if necessary. Both cases are covered below.
Installation of a Single Conda Environment
- Use bash as your shell
- Go to https://conda.io/miniconda/
- Download the Python 2.7 64-bit bash installer (unless your OS is 32-bit, which is unlikely)
- Find the downloaded file Miniconda2-latest-Linux-x86_64.sh
- It will probably be in Downloads, so do cd Downloads
- Run it with bash Miniconda2-latest-Linux-x86_64.sh
- Add miniconda2 to PATH
- The installation will automatically add the bin subdirectory of the installation to PATH in your .bashrc file
- Close and reopen terminal and check by typing conda list which will print out the installed packages
- If conda is not found, set up .bash_profile as described in the next step; otherwise skip ahead to updating conda
- If not already created, create a .bash_profile file in your home directory
Add the following script to it. This checks for the .bashrc and will run it on start up
#!/bin/bash
if [ -f $HOME/.bashrc ]; then
    source $HOME/.bashrc
fi
Run conda update -y conda to update miniconda
Install psana conda with conda install -y --channel lcls-rhel<num> psana-conda where <num> is the RHEL version
- The following dependencies will be picked up from the channel used above:
- hdf5
- openmpi
- mpi4py
- h5py
- tables
- To use one's own build of one of these dependencies, install them first with conda install.
- Visit the psana meta.yaml file to view the version requirements for psana. These versions must be included in the environment.
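The update and install commands above can be wrapped in a small helper. The following is only a sketch under stated assumptions: the function names rhel_major and install_psana are hypothetical, and reading the major version from /etc/redhat-release assumes that file exists and uses the usual "release N.M" wording. If it does not, pass the channel number by hand instead.

```shell
# Hypothetical helper: print the major RHEL version found in a release file.
# Assumes a line such as "Red Hat Enterprise Linux Server release 7.9 (Maipo)".
rhel_major() {
    sed 's/.*release \([0-9][0-9]*\).*/\1/' "${1:-/etc/redhat-release}"
}

# Hypothetical wrapper around the two conda commands given in the steps above.
install_psana() {
    conda update -y conda
    conda install -y --channel "lcls-rhel$(rhel_major)" psana-conda
}
```

Running install_psana in a fresh terminal would then perform both steps; nothing is executed until the function is called.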
Copy the experiment database from /reg/g/psdm/data/ExpNameDb/experiment-db.dat
Make a directory in the home directory with mkdir -p ~/psdm/data/ExpNameDb
Copy the database
With rsync: rsync -t psexport:/reg/g/psdm/data/ExpNameDb/experiment-db.dat ~/psdm/data/ExpNameDb/
With SCP: scp -p psexport:/reg/g/psdm/data/ExpNameDb/experiment-db.dat ~/psdm/data/ExpNameDb/
Copy the experiment data that will be used for analysis. This step requires patience if many runs will be copied
- For example, downloading run 54 from the experiment xpptut15, the following steps were taken:
- First create the required directories with mkdir -p psdm/xpp/xpptut15/
- Then these files were copied
From /reg/d/psdm/xpp/xpptut15/xtc to ~/psdm/xpp/xpptut15/xtc:
-bash-4.2$ ls /reg/d/psdm/xpp/xpptut15/xtc/ | grep 54
e665-r0054-s00-c00.xtc
e665-r0054-s01-c00.xtc
e665-r0054-s02-c00.xtc
e665-r0054-s03-c00.xtc
e665-r0054-s04-c00.xtc
e665-r0054-s05-c00.xtc
From /reg/d/psdm/xpp/xpptut15/xtc/index to ~/psdm/xpp/xpptut15/xtc/index:
-bash-4.2$ ls /reg/d/psdm/xpp/xpptut15/xtc/index | grep 54
e665-r0054-s00-c00.xtc.idx
e665-r0054-s01-c00.xtc.idx
e665-r0054-s02-c00.xtc.idx
e665-r0054-s03-c00.xtc.idx
e665-r0054-s04-c00.xtc.idx
e665-r0054-s05-c00.xtc.idx
From /reg/d/psdm/xpp/xpptut15/xtc/smalldata to ~/psdm/xpp/xpptut15/xtc/smalldata:
-bash-4.2$ ls /reg/d/psdm/xpp/xpptut15/xtc/smalldata | grep 54
e665-r0054-s00-c00.smd.xtc
e665-r0054-s01-c00.smd.xtc
e665-r0054-s02-c00.smd.xtc
e665-r0054-s03-c00.smd.xtc
e665-r0054-s04-c00.smd.xtc
e665-r0054-s05-c00.smd.xtc
- And the entire directory /reg/d/psdm/xpp/xpptut15/calib to ~/psdm/xpp/xpptut15/calib
- Copy using either rsync or SCP. SCP may be simpler in this case because it copies the data a symbolic link points to rather than the link itself, which is what is desired.
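The copy for one run can be scripted. The sketch below only prints the mkdir and scp commands so that they can be reviewed before anything is transferred. The name run_copy_commands is hypothetical, and the file-name pattern *-rNNNN-* is inferred from the run 54 listings above, not taken from psana documentation.

```shell
# Print (do not run) the commands that would copy one run plus the calib directory.
# Arguments: instrument, experiment, run number (without leading zeros).
run_copy_commands() (
    instr=$1; exp=$2
    tag=$(printf 'r%04d' "$3")                  # 54 -> r0054
    remote="psexport:/reg/d/psdm/$instr/$exp"
    dest="$HOME/psdm/$instr/$exp"
    # The remote patterns are single-quoted so the remote side expands them.
    printf '%s\n' "mkdir -p '$dest/xtc/index' '$dest/xtc/smalldata'"
    printf '%s\n' "scp -p '$remote/xtc/*-$tag-*.xtc' '$dest/xtc/'"
    printf '%s\n' "scp -p '$remote/xtc/index/*-$tag-*.xtc.idx' '$dest/xtc/index/'"
    printf '%s\n' "scp -p '$remote/xtc/smalldata/*-$tag-*.smd.xtc' '$dest/xtc/smalldata/'"
    printf '%s\n' "scp -rp '$remote/calib' '$dest/'"
)
```

For example, run_copy_commands xpp xpptut15 54 prints five commands; once they look right, the same output can be piped to bash to execute them.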
- Two environment variables, SIT_DATA and SIT_PSDM_DATA, must be set by adding the following commands to the .bash_profile file:
- export SIT_DATA=$HOME/psdm/data
- export SIT_PSDM_DATA=$HOME/psdm
- If a .bash_profile file was not created earlier, create one now and begin it with #!/bin/bash
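One way to confirm that the variables took effect in a new terminal is a small check like the one below. It is a sketch: check_psana_env is a hypothetical helper, and it tests only what the steps above establish, namely that SIT_PSDM_DATA is a directory and that some directory listed in SIT_DATA contains ExpNameDb/experiment-db.dat.

```shell
# Hypothetical check: exit status 0 if both variables point at usable locations.
check_psana_env() (
    status=0
    if [ -z "${SIT_PSDM_DATA:-}" ] || [ ! -d "$SIT_PSDM_DATA" ]; then
        echo "SIT_PSDM_DATA is unset or not a directory" >&2
        status=1
    fi
    found=no
    # SIT_DATA may list several directories separated by colons.
    old_ifs=$IFS; IFS=:
    for dir in ${SIT_DATA:-}; do
        if [ -f "$dir/ExpNameDb/experiment-db.dat" ]; then
            found=yes
        fi
    done
    IFS=$old_ifs
    if [ "$found" = no ]; then
        echo "experiment-db.dat not found under any SIT_DATA directory" >&2
        status=1
    fi
    exit "$status"
)
```

Calling check_psana_env && echo OK prints OK only when both conditions hold; otherwise the messages on standard error say which one failed.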
Now psana conda can be used. If packages need to be added, removed, etc., the standard conda commands can be used. A link to the specifics of these commands is given in the section below.
Installation of Multiple Conda Environments
More information on managing Conda environments can be found here. Begin by following the steps above for the single environment, but skip two of them: do not install psana conda, since it is installed when the new environment is created, and do not set the environment variables in .bash_profile. It is recommended to have these variables set when the environment is activated and unset when it is deactivated. First create the environment with conda create --name <environment name> -c lcls-rhel<num> psana-conda, where <environment name> is the name of the environment to be created and <num> is the RHEL version. Then follow the steps below, shown for an environment named examplenv, which were taken from the link given above.
- Change to the environment's directory, e.g. cd ~/miniconda2/envs/examplenv (the actual path may differ slightly, but it will be under envs in the conda installation)
- Create the following directories and files:
- mkdir -p etc/conda/activate.d
- mkdir -p etc/conda/deactivate.d
- touch etc/conda/activate.d/env_vars.sh
- touch etc/conda/deactivate.d/env_vars.sh
- Add the following to these files
The complicated piece is that SIT_DATA must include both the 'data' subdirectory of your Conda environment and the directory that contains ExpNameDb/experiment-db.dat. To etc/conda/activate.d/env_vars.sh:
#!/bin/sh
# Make sure to replace examplenv with the actual environment name
export SIT_DATA=$HOME/miniconda2/envs/examplenv/data:$HOME/psdm/data
export SIT_PSDM_DATA=$HOME/psdm
To etc/conda/deactivate.d/env_vars.sh:
#!/bin/sh
unset SIT_DATA
unset SIT_PSDM_DATA
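The directory creation and both hook files can also be produced in one go. The sketch below assumes the environment prefix is passed as an argument (for example $HOME/miniconda2/envs/examplenv); write_env_hooks is a hypothetical name, and the file contents mirror the two scripts shown above, with the environment path filled in from the argument.

```shell
# Hypothetical helper: create the activate/deactivate hooks under a conda env prefix.
write_env_hooks() (
    prefix=$1
    mkdir -p "$prefix/etc/conda/activate.d" "$prefix/etc/conda/deactivate.d"
    # Unquoted EOF: $prefix is filled in now, \$HOME is left for activation time.
    cat > "$prefix/etc/conda/activate.d/env_vars.sh" <<EOF
#!/bin/sh
export SIT_DATA=$prefix/data:\$HOME/psdm/data
export SIT_PSDM_DATA=\$HOME/psdm
EOF
    # Quoted 'EOF': the body is written verbatim.
    cat > "$prefix/etc/conda/deactivate.d/env_vars.sh" <<'EOF'
#!/bin/sh
unset SIT_DATA
unset SIT_PSDM_DATA
EOF
)
```

A call such as write_env_hooks "$HOME/miniconda2/envs/examplenv" would overwrite any existing env_vars.sh files in that environment, so check for local edits first.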
Now examplenv will have the correct environment variables when activated with source activate examplenv, and they will be unset when it is deactivated with source deactivate. Packages may be added, removed, upgraded, downgraded, etc. in this conda environment as usual. For information on this, follow the link given at the beginning of this section.
Useful Commands
To copy files that aren't huge (i.e. not hundreds of GB) from one machine to another, rsync and SCP are good choices. Their basic usage is:
- With rsync: rsync -rt <machine copying from>:<file/directory name being copied> <directory copying to>. The -r option copies recursively, so it is needed when copying a directory but not when copying a single file. The -t option preserves the modification times.
- With SCP: scp -rp <machine copying from>:<file/directory name being copied> <directory copying to>. As with rsync, the -r option copies recursively, and the -p option, like rsync's -t, preserves the modification times.
Below are several useful Conda commands, some of which were used above:
- To update a package, use conda update <package name>; this includes psana conda. So if psana conda needs to be updated, just do conda update -y psana-conda. The -y option suppresses the "Proceed ([y]/n)?" question that is asked before installing, updating, or whatever else is being done.
- Packages can similarly be installed and uninstalled by replacing update with install or uninstall above.
- If using multiple environments, one can activate an environment with source activate <env name> and deactivate the active environment with source deactivate.
- More commands can be found here.
Example
It's important to test whether or not the above steps actually worked. To do this, you can use any code that imports experiment data with psana. Below is the sample code from the Event Iteration page from the Building Blocks section, here. If following the multiple environments steps, make sure to activate the environment with the command source activate examplenv.
from psana import *
ds = DataSource('exp=xpptut15:run=54:smd')
nevent = 0
for evt in ds.events():
    nevent += 1
    if nevent == 3: break
print 'Processed', nevent, 'events.'
If successful, you should see the output:
Processed 3 events.
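If the example fails instead, one thing worth ruling out is an incomplete copy. In the run 54 listings above, every .xtc file has a matching .xtc.idx file under index and a matching .smd.xtc file under smalldata. The sketch below checks for that pairing locally. The name check_run_files is hypothetical, and falling back to $HOME/psdm when SIT_PSDM_DATA is unset is an assumption based on the directory layout used on this page.

```shell
# Hypothetical check: each local xtc file for a run should have idx and smd companions.
# Arguments: instrument, experiment, run number (without leading zeros).
check_run_files() (
    base=${SIT_PSDM_DATA:-$HOME/psdm}/$1/$2/xtc
    tag=$(printf 'r%04d' "$3")              # 54 -> r0054
    status=0
    count=0
    for xtc in "$base"/*-"$tag"-*.xtc; do
        [ -f "$xtc" ] || continue           # the pattern matched nothing
        count=$((count + 1))
        name=$(basename "$xtc" .xtc)
        for companion in "$base/index/$name.xtc.idx" "$base/smalldata/$name.smd.xtc"; do
            if [ ! -f "$companion" ]; then
                echo "missing: $companion" >&2
                status=1
            fi
        done
    done
    if [ "$count" -eq 0 ]; then
        echo "no xtc files for run $3 in $base" >&2
        status=1
    fi
    echo "$count xtc file(s) checked"
    exit "$status"
)
```

For the files listed earlier, check_run_files xpp xpptut15 54 should report six xtc files and return success; any missing companion is named on standard error.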