"Data analysis == Piece of cake"
It is always a good idea for the people doing analysis to be able to look at their detector images and probe intensity values. Given that a typical LCLS experiment has millions of snapshots to choose from, it is also critical that you can quickly select images of interest and set regions of interest using masks. By the end of this tutorial, you will be able to browse images, jump to images of interest, generate masks, find peaks in your images and index crystal diffraction patterns.
Citation for psocake (and other psana-based programs):
@article{Damiani:zw5004,
  author = "Damiani, D. and Dubrovin, M. and Gaponenko, I. and Kroeger, W. and Lane, T. J. and Mitra, A. and O'Grady, C. P. and Salnikov, A. and Sanchez-Gonzalez, A. and Schneider, D. and Yoon, C. H.",
  title = "{Linac Coherent Light Source data analysis using {\it psana}}",
  journal = "Journal of Applied Crystallography",
  year = "2016",
  volume = "49",
  number = "2",
  pages = "672--679",
  month = "Apr",
  doi = {10.1107/S1600576716004349},
  url = {http://dx.doi.org/10.1107/S1600576716004349},
}
Type "psocake" on your terminal to open up the GUI. For crystallography, we will need to open it in sfx mode (-m):
$ sit_setup nightly-20160823   # (Optional) To get the bleeding edge version, use the nightly build
$ psocake -m sfx
1) There are four parameters required to uniquely identify an image at LCLS. Type the (1) experiment name, (2) run number, (3) detector name, and (4) event number in the Experiment Parameters panel.
For this tutorial, we will look at experiment cxitut13, run 10, detector DscCsPad, event 11.
#######################################
# Available area detectors:
# ('CxiDs1.0:Cspad.0', 'DscCsPad', '')
#######################################
CxiDs1.0:Cspad.0 is the detector name. DscCsPad is the detector alias. Psocake can understand both naming conventions.
2) You can specify the experiment parameters as command line arguments in psocake using the psana-style experiment run string. This is the recommended way of starting psocake:
$ psocake exp=cxitut13:run=10 -d DscCsPad -n 11 -m sfx
Or you can also use the -e and -r arguments for the experiment and the run number:
$ psocake -e cxitut13 -r 10 -d DscCsPad -n 11 -m sfx
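If you launch psocake from a script (for example, once per run), the two invocation styles above can be assembled programmatically. The following is a minimal Python sketch; the helper name psocake_args is hypothetical and not part of psocake, and it uses only the flags shown in this tutorial.

```python
def psocake_args(experiment, run, detector, event, mode="sfx", psana_style=True):
    """Build a psocake argument list (hypothetical helper, not part of psocake)."""
    if psana_style:
        # psana-style experiment run string, e.g. exp=cxitut13:run=10
        args = ["psocake", "exp={}:run={}".format(experiment, run)]
    else:
        # Equivalent form using the -e and -r arguments
        args = ["psocake", "-e", experiment, "-r", str(run)]
    args += ["-d", detector, "-n", str(event), "-m", mode]
    return args


command = psocake_args("cxitut13", 10, "DscCsPad", 11)
print(" ".join(command))
```

On a machine where psocake is installed, the returned list could be passed to subprocess.run.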
To check psocake version:
$ psocake --version
Don’t worry if you don’t remember these arguments. You can view argument options using --help:
$ psocake --help
Psocake should have generated directories and files in the experiment directory. At LCLS, all experiments are stored here: /reg/d/psdm/<instrument>/<experiment>. Let's take a moment and check out our directory structure. Either open a new terminal (remember to 'ssh psana') or use the current terminal (press Ctrl+Z to suspend the running psocake, then type 'bg' to resume it in the background), and type the following command:
$ ls /reg/d/psdm/cxi/cxitut13
calib ftc hdf5 res scratch usr xtc
calib: This is where all psana calibration is stored: detector geometry, pedestals, gain, common mode constants, and the bad pixel map.
xtc: This is where all your raw data is stored. XTC is a simple and efficient format for storing large data. XTC files can be read using psana. Note that you have 4 months to analyse your data before the XTC files are moved off to tape.
scratch: This is where psocake saves all its files, such as .cxi and .stream. This directory is not backed up, so important files need to be moved to res.
res: This is the results directory which is backed up on tape. After completing your analysis, your results/data should be moved here.
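Because scratch is not backed up, it can help to script the transfer of important files into res. The following is a minimal Python sketch using only the standard library. The helper name copy_to_res and the default file patterns are assumptions for illustration, and it copies rather than moves, so the originals stay in scratch until you remove them yourself.

```python
import shutil
from pathlib import Path


def copy_to_res(scratch_dir, res_dir, patterns=("*.stream", "*.cxi")):
    """Copy files matching `patterns` from scratch_dir into res_dir.

    Hypothetical helper: returns the names of the files that were copied.
    """
    scratch_dir, res_dir = Path(scratch_dir), Path(res_dir)
    res_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for pattern in patterns:
        for src in sorted(scratch_dir.glob(pattern)):
            shutil.copy2(src, res_dir / src.name)  # copy2 also keeps timestamps
            copied.append(src.name)
    return copied


# Example call (the destination layout under res is an assumption):
# copy_to_res("/reg/d/psdm/cxi/cxitut13/scratch/<username>/psocake/r0010",
#             "/reg/d/psdm/cxi/cxitut13/res/<username>/r0010")
```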
In this section, let's learn how to mask out pixels that should not be used for analysis (such as dead pixels), mask out the jet streak at the centre of the detector, and mask out the water ring (just for fun!).
Note: the Image Panel must be in the default "greyscale" colormap for the mask colors to display properly.
1) In the mask panel, click on "Use psana mask". This will mask out the following pixels that should not be used for analysis: calib, status, edge, central, unbonded pixels, and the neighbors of unbonded pixels. These masked pixels are shown as green on the image panel.
2) On the mask panel, click on "Use streak mask". This will mask out strong intensities originating from the edges of the central ASICs. The streak mask varies shot-to-shot.
3) To make a donut mask over the water ring, click on "Use user-defined mask". This will bring up a cyan circle, cyan polygon and cyan square mask generators.
Select "Toggle" in Masking mode. Move the cyan circle to the centre of the detector by dragging the circle. Resize the cyan circle by dragging the diamond on the perimeter. Once you are happy with the position, click "mask circular ROI" button on the mask panel.
Enlarge the cyan circle by dragging the diamond on the perimeter again. Click the "mask circular ROI" button on the mask panel. Because we are in "Toggle" mode, the previous mask gets toggled and disappears. The area that does not overlap with the previous mask gets masked out, leaving a donut.
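The toggle behaviour described above amounts to a logical XOR between the current mask and each new circular ROI. The following NumPy sketch illustrates the idea on a toy image; it is not psocake's implementation, and the function names, image size and radii are made up for the example.

```python
import numpy as np


def circular_roi(shape, center, radius):
    """Boolean array that is True inside a circle (center given as row, col)."""
    rows, cols = np.indices(shape)
    return (rows - center[0]) ** 2 + (cols - center[1]) ** 2 <= radius ** 2


def toggle(mask, roi):
    """Flip the masked state of every pixel inside the ROI (logical XOR)."""
    return np.logical_xor(mask, roi)


shape = (101, 101)   # toy image size, not a real detector shape
center = (50, 50)
masked = np.zeros(shape, dtype=bool)   # True means "masked out" in this sketch
masked = toggle(masked, circular_roi(shape, center, 20))   # first, smaller circle
masked = toggle(masked, circular_roi(shape, center, 30))   # second, larger circle

# The center has been toggled twice and is unmasked again; the ring stays masked.
print(masked[50, 50], masked[50, 75])
```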
To save the user-defined mask, click on "Save static mask" on the mask panel which will save the mask in the scratch folder. This will combine the green and blue masks into a single mask. For this example, your mask will be saved here:
/reg/d/psdm/cxi/cxitut13/scratch/<username>/psocake/r0010/mask.npy (unassembled 3D ndarray)
/reg/d/psdm/cxi/cxitut13/scratch/<username>/psocake/r0010/mask.txt (2D text)
You can load the user-defined mask using the "Load mask" button and selecting mask.npy.
mask.txt is compatible with the calibration manager application, calibman.
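The saved mask.npy can also be used outside the GUI. The following NumPy sketch loads a mask and applies it to an array of the same shape. It assumes that a mask value of 1 keeps a pixel and 0 rejects it, which you should verify against your own file, and it writes a small stand-in mask to a temporary directory so that it runs anywhere.

```python
import os
import tempfile

import numpy as np


def apply_mask(data, mask):
    """Zero out rejected pixels. Assumes mask == 1 keeps a pixel, mask == 0 rejects it."""
    if data.shape != mask.shape:
        raise ValueError("mask shape {} does not match data shape {}".format(mask.shape, data.shape))
    return data * (mask != 0)


# Stand-in for .../psocake/r0010/mask.npy (toy shape, not a real detector shape).
toy_shape = (2, 4, 5)   # (panels, rows, cols)
mask_path = os.path.join(tempfile.mkdtemp(), "mask.npy")
toy_mask = np.ones(toy_shape, dtype=np.uint8)
toy_mask[0, 1, 2] = 0   # pretend one pixel was masked in the GUI
np.save(mask_path, toy_mask)

mask = np.load(mask_path)
data = np.full(toy_shape, 7.0)   # fake unassembled detector intensities
cleaned = apply_mask(data, mask)
print(cleaned[0, 1, 2], cleaned[0, 0, 0])
```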
To delete the mask on the screen, select "Unmask" under Masking mode. Drag a blue circle mask generator over the detector and click "Stamp circular mask".
In this section, we will find peaks on the detector image. To find the peaks on the image, set the "Algorithm" to “Droplet” in the Peak Finder panel. Details of the peak finding algorithm are given here: Hit and Peak Finding Algorithms (section: Two threshold "Droplet finder"). You should notice peaks being highlighted in the Image panel.
Hover the mouse pointer over the Bragg peaks to study the intensities. The sum of the Bragg peak pixels is above 500 ADUs. Set the following values:
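To build intuition for what a two-threshold peak finder does, here is a generic Python sketch: a pixel above a high threshold seeds a peak, the peak grows over connected pixels above a low threshold, and the peak is kept only if its summed intensity is large enough. This is an illustration of the general idea, not the psana algorithm; the function name, parameter names and threshold values are assumptions.

```python
from collections import deque


def find_droplets(image, thr_low, thr_high, min_sum):
    """Return (row_centroid, col_centroid, total) for each accepted droplet."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    peaks = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < thr_high:
                continue
            # Grow the droplet from this seed over 4-connected pixels >= thr_low.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] >= thr_low):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            total = sum(image[y][x] for y, x in pixels)
            if total >= min_sum:
                # Intensity-weighted centroid of the droplet.
                cy = sum(y * image[y][x] for y, x in pixels) / total
                cx = sum(x * image[y][x] for y, x in pixels) / total
                peaks.append((cy, cx, total))
    return peaks


image = [[0] * 6 for _ in range(6)]
image[1][1], image[1][2], image[2][1] = 300, 250, 100   # one bright, extended spot
image[4][4] = 200                                       # isolated pixel, too weak in total
peaks = find_droplets(image, thr_low=50, thr_high=200, min_sum=500)
print(peaks)
```

Only the extended spot survives, because the isolated pixel passes the seed threshold but not the summed-intensity cut.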
In the small data panel, you should see the CXIDB filename:
First things first, crystal indexing requires an accurate detector geometry. Latest CXI geometry files can be found here: Geometry history
Detector panels can be manually adjusted using: calibman
If you are on a psana machine, you can run CrystFEL programs by setting up your environment:
source /reg/g/cfel/crystfel/crystfel-dev/setup-sh
CFDEPDIR=/reg/g/cfel/dependencies
export PATH=${CFDEPDIR}/bin:$PATH
The indexing panel uses CrystFEL to index the diffraction patterns, so the input parameters in the indexing panel should be familiar to you if you've used indexamajig before.
CrystFEL geometry: This geometry file is automatically converted from our psana geometry to CrystFEL geometry for you. Feel free to look inside .temp.geom. If you have a CrystFEL geometry file that you know is good, you can simply type in its path. Psocake will never modify this file, even if you change the "detector distance" in the diffraction geometry panel. (Just don't name your geometry file .temp.geom, because it will get overwritten.) You can also deploy the CrystFEL geometry as a psana geometry by clicking "Deploy CrystFEL geometry" in the indexing panel.
Integration radii: These 3 numbers are the inner, middle and outer radii of concentric rings about each Bragg spot. The area inside the inner ring is used to integrate the Bragg spot, and the annulus between the middle and outer rings is used to estimate the background. Try adjusting these numbers and see what is being integrated on screen. The inner ring should be large enough to fit a Bragg spot inside it.
PDB: If you have a CrystFEL unit cell file, you can constrain the indexing algorithms to look for this unit cell.
Indexing method: The default is mosflm-noretry, dirax. "noretry" is used to speed up mosflm (which can otherwise take a few seconds).
Tolerance: These 4 numbers define how much wriggle room you want for indexing. 5, 5, 5 are the tolerance levels (in percent) for the unit cell axes a, b, c. 1.5 is the tolerance level (in degrees) for the angles alpha, beta, gamma.
Extra CrystFEL parameters: You can enter extra parameters for indexamajig in this field. It will be appended at the end of the command line, e.g. --profile will turn on the processing timing information.
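The role of the integration radii can be pictured with a small Python sketch. It assumes the three values are inner, middle and outer radii: pixels within the inner radius are summed as the spot, and the mean of the annulus between the middle and outer radii is subtracted as background. This is a simplified illustration, not CrystFEL's integration code, and the function name is hypothetical.

```python
def integrate_spot(image, center, r_inner, r_middle, r_outer):
    """Background-subtracted intensity of one spot (simplified illustration)."""
    cy, cx = center
    peak_sum, peak_n = 0.0, 0
    bg_sum, bg_n = 0.0, 0
    reach = int(r_outer) + 1
    for y in range(max(0, cy - reach), min(len(image), cy + reach + 1)):
        for x in range(max(0, cx - reach), min(len(image[0]), cx + reach + 1)):
            d2 = (y - cy) ** 2 + (x - cx) ** 2
            if d2 <= r_inner ** 2:
                # Inside the inner ring: counts towards the Bragg spot.
                peak_sum += image[y][x]
                peak_n += 1
            elif r_middle ** 2 <= d2 <= r_outer ** 2:
                # Between the middle and outer rings: background estimate.
                bg_sum += image[y][x]
                bg_n += 1
    background = bg_sum / bg_n if bg_n else 0.0
    return peak_sum - peak_n * background


# Flat background of 10 ADU with a single pixel carrying an extra 100 ADU.
image = [[10.0] * 11 for _ in range(11)]
image[5][5] += 100.0
intensity = integrate_spot(image, (5, 5), r_inner=2, r_middle=3, r_outer=4)
print(intensity)
```

If the inner radius is too small, part of the spot is left out of the sum; if the annulus overlaps a neighbouring spot, the background estimate is inflated.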
Let's try to index another diffraction pattern at event 44.
Hopefully, you have indexed this diffraction pattern. Notice that the unitcell parameters are a bit off compared to what is expected. Let's load a CrystFEL unitcell file to help the indexer along.
CrystFEL unit cell file version 1.0
lattice_type = tetragonal
centering = P
unique_axis = c
a = 77.05 A
b = 77.05 A
c = 37.21 A
al = 90 deg
be = 90 deg
ga = 90 deg
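To put a number on "a bit off", an indexed cell can be compared with the reference cell above using the four tolerance values from the indexing panel. The following Python sketch assumes the first three tolerances are percentages of the reference axis lengths and the fourth is in degrees. It is a simplified direct-space check for illustration, not the comparison CrystFEL performs internally, and the function name and candidate cells are made up.

```python
def cell_within_tolerance(cell, reference, tolerance=(5.0, 5.0, 5.0, 1.5)):
    """Check (a, b, c, alpha, beta, gamma) against a reference cell.

    Assumes axis tolerances are in percent and the angle tolerance in degrees.
    """
    tol_a, tol_b, tol_c, tol_angle = tolerance
    for value, ref, tol in zip(cell[:3], reference[:3], (tol_a, tol_b, tol_c)):
        if abs(value - ref) > ref * tol / 100.0:
            return False
    return all(abs(value - ref) <= tol_angle
               for value, ref in zip(cell[3:], reference[3:]))


reference = (77.05, 77.05, 37.21, 90.0, 90.0, 90.0)   # values from the unit cell file
close_cell = (79.0, 78.0, 38.0, 90.5, 89.8, 90.2)     # made-up indexing result
far_cell = (85.0, 77.0, 37.0, 90.0, 90.0, 90.0)       # a axis is about 10% off

print(cell_within_tolerance(close_cell, reference))
print(cell_within_tolerance(far_cell, reference))
```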
Psocake saves the detector images of only the hits in the .cxi file. You will likely want to reindex these files to optimize the indexing rate. Once you have finalized the indexing parameters, set 'Keep CXI images' to Off. This deletes the detector images in your .cxi file, which frees up your precious disk space for other things.
As with peak finding, you can launch indexing jobs on multiple runs by specifying runs in the Run(s) field.
Indexing will take some time to complete. If successful, you should see a stream file in: /reg/d/psdm/cxi/cxitut13/scratch/<username>/psocake/r0010/cxitut13_10.stream
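Once the stream file exists, a quick way to check how indexing went is to count how many chunks (one per processed image) contain at least one crystal. The following Python sketch relies on the "----- Begin chunk" and "--- Begin crystal" marker lines of the CrystFEL stream format; the function name is hypothetical and the sample text is a heavily shortened stand-in for a real stream.

```python
def indexing_summary(stream_text):
    """Return (number_of_chunks, number_of_chunks_with_a_crystal)."""
    chunks = 0
    indexed = 0
    has_crystal = False
    for line in stream_text.splitlines():
        if line.startswith("----- Begin chunk"):
            chunks += 1
            has_crystal = False
        elif line.startswith("--- Begin crystal") and not has_crystal:
            # Count each chunk once, even if it holds several crystals.
            indexed += 1
            has_crystal = True
    return chunks, indexed


sample = """----- Begin chunk -----
--- Begin crystal
--- End crystal
----- End chunk -----
----- Begin chunk -----
----- End chunk -----
----- Begin chunk -----
--- Begin crystal
--- End crystal
--- Begin crystal
--- End crystal
----- End chunk -----
"""
chunks, indexed = indexing_summary(sample)
print("{} of {} patterns indexed".format(indexed, chunks))
```

For a real file, read its contents with open(path).read() and pass the text to the function.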
In the small data panel, type the CXIDB filename:
# Phenix
source /reg/common/package/phenix/phenix-1.10.1-2155/phenix_env.sh
# CCP4
source /reg/common/package/ccp4/ccp4-7.0/bin/ccp4.setup-sh
Please send bug reports/comments:
yoon82@slac.stanford.edu