...

Two versions of split scan mode have been implemented. The most recent (soon to be available in ana-0.13.2 and later) is an MPI (Message Passing Interface) based version. This is the recommended way to run split scan mode. However, the previous version is still available and is documented below.

...

HDF5 presently has little support for reading a file that is being created, and in general this is not recommended. However, the master file is written in a way to support this as well as possible. When using the MPI split scan translator, links to the calib cycle files are not added to the master file until the calib cycle file is finished, so it is always safe to traverse those links. With the non-MPI split scan translator this is true except for the last N links, where N is the number of jobs running; these links may be written before the calib cycle files they link to are finished. To see updates in the master file, users may need to shut down programs like Matlab and h5py and restart them. It is not sufficient to close and reopen the master file within a Python or Matlab session.

...

hdf file creation parameters
Only NoSplit is implemented - no family or split drivers.

In general, a number of o2o-translate options are no longer supported. In particular:
-G (long names like CalibCycle:0000 instead of CalibCycle) is always on.

Speed

Comparison with o2o-translate

psana-translate runs about 10% slower than o2o-translate does.

Performance testing was done during November/December of 2013. Both o2o-translate and psana-translate worked through a 92 GB xtc file using compression=1 on the rhat6 machine psdev105. They read and wrote the data from /u1, and both used the non-parallel compression library. o2o-translate produced a 68 GB file in 65 minutes and psana-translate produced a 65 GB file in 70 minutes (speeds of about 22 MB/sec). Production runs will use the parallel compression library and are expected to run faster (about 50 MB/sec).
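The "about 10% slower" figure can be checked against the times quoted here; by this arithmetic it comes out closer to 8%:

```python
# Figures quoted above: the same 92 GB xtc file translated in
# 65 minutes (o2o-translate) and 70 minutes (psana-translate).
o2o_min, psana_min = 65, 70

# Relative slowdown of psana-translate, as a percentage.
slowdown_pct = (psana_min / o2o_min - 1) * 100
print(round(slowdown_pct, 1))  # -> 7.7
```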

Single psana-translate

I parsed through psana-translate logs on 9/25/2014. An estimate of the median speed is 51 MB/sec; the 25th and 75th percentile speeds look to be 37 MB/sec and 69 MB/sec. I've made some effort to parse only logs that cover full translations with the default compression. Among such translations, I would expect the differences in speed to depend on how much cspad is being calibrated, the load on the filesystem and cores, and how much damaged data appears and is not processed (which will make the speed seem artificially higher). I assume that the size of the xtc files on disk is the amount of data translated.

A limitation on translation speed is the time it takes psana to read through all the data. I estimate this to be around 250 MB/sec, putting translation at about 5 times slower than psana reading. I believe the file system supports 300-400 MB/sec, though this depends heavily on the load; however, that may not be a good benchmark for psana. A better benchmark is a simple program that parses through the xtc, touching all the payloads.

Timing MPI Split Scan Translation

Below is a table of some testing results for the MPI split scan translator. All of this was carried out on the offline file system. A typical command was

bsub -q psanaq -a mympi -n 12 -R "span[ptile=1]" h5-mpi-translate -m cspad_mod.CsPadCalib,Translator.H5Output -o Translator.H5Output.output_file=/reg/data/ana01/temp/davidsch/out.h5 exp=xpptut13:run=3

with the environment variables set to load the parallel compression library. All jobs ran on the psanaq and wrote to ana01. Two of the jobs had only one calib cycle (xpp74813:run=69 and xpp40312:run=48), from which we might say the baseline translation speed was 25-30 MB/sec in these tests. When investigating why this was less than the median 51 MB/sec, I did see a high load on the system; however, it may also be the overhead of having two readers. When running MPI split scan on a run with only one calib cycle, two different nodes will hit the same xtc on disk for a while, until at some point the master process gets further along in the files.

Some of the columns are explained here:

  • WJobs - how many worker jobs. If there are at least 100 events in a calib cycle, it is one calib cycle per worker job; otherwise, as many calib cycles as needed to reach 100 events.
  • CC/WJ - average number of calib cycles per worker job. Usually 1.
  • mread - time for the master to read through the data. The rest of the time it waits for worker nodes to finish.
  • calib - how many distinct cspad sources were calibrated in the h5.
  • WJtime - average time for one worker job. In principle, something close to mread + WJtime would be the minimum time we expect to achieve.
  • wn - for each of the n workers, the number of worker jobs done and the percentage of the total time the worker spent translating vs. idly waiting for a worker job. For example, w0=19/96% means worker 0 processed 19 worker jobs in 96% of the total time.
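The WJobs rule above (one calib cycle per worker job, unless cycles are small, in which case consecutive cycles are batched until they total at least 100 events) can be sketched as follows. This is an illustration of the stated rule, not the translator's actual code:

```python
def make_worker_jobs(events_per_cc, min_events=100):
    """Group consecutive calib cycles into worker jobs so that each
    job covers at least min_events events (except possibly the last).

    events_per_cc: list of event counts, one per calib cycle.
    Returns a list of worker jobs, each a list of calib cycle indices.
    """
    jobs, current, count = [], [], 0
    for cc, nevents in enumerate(events_per_cc):
        current.append(cc)
        count += nevents
        if count >= min_events:
            jobs.append(current)
            current, count = [], 0
    if current:  # leftover small cycles form a final (short) job
        jobs.append(current)
    return jobs

print(make_worker_jobs([634, 629, 603]))    # -> [[0], [1], [2]]
print(make_worker_jobs([40, 40, 40, 120]))  # -> [[0, 1, 2], [3]]
```

With big calib cycles every job gets exactly one cycle (CC/WJ = 1, as in most rows of the table below); with small cycles, like the 79.6 evts/CC row, several cycles are packed into each job.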

exp | size(GB) | WJobs | CC/WJ | evts/CC | nodes | time | MB/sec | mread | calib | WJtime | w0 | w1 | w2 | w3 | w4 | w5 | w6 | w7 | w8 | w9 | w10
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
xppi0214:run=279 | 100.0 | 28 | 1.0 | 634.1 | 10 | 9.8min | 174.1 | 66%=6.5min | 1 | 2.4min | 3/78% | 4/95% | 4/90% | 3/68% | 3/69% | 3/68% | 3/67% | 3/71% | 2/70% | | 
xpp74813:run=69 | 1.00 | 1 | 1.0 | 160.0 | 5 | 0.7min | 25.5 | 5.8%=0.0min | 0 | 0.6min | 1/93% | 0/0% | 0/0% | 0/0% | | | | | | | 
xpp72213:run=146 | 2.03 | 31 | 1.0 | 243.0 | 5 | 0.8min | 43.9 | 77%=0.6min | 0 | 0.1min | 9/72% | 5/86% | 9/70% | 8/61% | | | | | | | 
xpp72213:run=122 | 4.00 | 41 | 1.0 | 363.0 | 5 | 1.0min | 68.3 | 91%=0.9min | 0 | 0.1min | 11/85% | 11/84% | 10/81% | 9/70% | | | | | | | 
xpp65013:run=40 | 16.02 | 68 | 1.5 | 79.6 | 5 | 2.5min | 109.4 | 66%=1.7min | 0 | 0.1min | 17/88% | 18/85% | 17/85% | 16/86% | | | | | | | 
xpp61412:run=75 | 32.19 | 138 | 1.2 | 99.2 | 5 | 4.3min | 127.8 | 73%=3.1min | 0 | 0.1min | 32/89% | 35/88% | 36/89% | 35/87% | | | | | | | 
xppa1814:run=173 | 64.04 | 10 | 1.0 | 1203.0 | 6 | 18.0min | 60.7 | 66%=11.9min | 0 | 5.4min | 3/93% | 3/78% | 2/64% | 1/35% | 1/32% | | | | | | 
xppi0214:run=325 | 127.57 | 36 | 1.0 | 629.1 | 12 | 12.0min | 181.4 | 65%=7.8min | 1 | 2.7min | 4/74% | 2/81% | 2/74% | 3/74% | 4/85% | 4/88% | 4/68% | 4/70% | 3/78% | 3/66% | 3/49%
xpp40312:run=48 | 390.51 | 1 | 1.0 | 444427.0 | 12 | 220.0min | 30.3 | 26%=57.2min | 0 | 220.0min | 1/100% | 0/0% | 0/0% | 0/0% | 0/0% | 0/0% | 0/0% | 0/0% | 0/0% | 0/0% | 0/0%
xppa4513:run=173 | 478.13 | 204 | 1.0 | 483.0 | 12 | 52.0min | 156.9 | 72%=37.4min | 1 | 2.6min | 14/98% | 15/94% | 18/93% | 19/96% | 19/92% | 19/93% | 17/93% | 22/93% | 18/90% | 25/91% | 18/90%
xppc3614:run=271 | 390.25 | 125 | 1.0 | 603.0 | 12 | 100.0min | 66.6 | 87%=87.0min | 1 | 4.4min | 17/85% | 20/75% | 17/73% | 16/49% | 9/46% | 8/35% | 8/39% | 8/42% | 8/37% | 7/37% | 7/34%
Technical Differences with o2o-translate

...