...

  • -x no other batch jobs run on the nodes used for this job
  • -R "span[ptile=2]" run only two processes on each node.
  • -n 9 use 9 processes for the job: 1 master and 8 workers, so up to 8 calib cycles are translated simultaneously. The master process is always the last process. With 9 processes, ptile=2, and -x, the master process runs by itself on one node, and no workers do I/O on that node. This gives the master the most possible resources for its job of finding calib cycles. In general, for best performance, if n is the number of processes and k the number of processes per node (the ptile value), choose them so that n mod k == 1.
  • The job output file (translate_39283.out, where 39283 will be whatever job number the batch system assigns) will record timing information for the master and workers.
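The n mod k == 1 rule can be checked with a small sketch. It assumes that ranks are packed onto nodes in rank order, k per node, which is an assumption about how the batch system places processes and not something stated above. The function names node_layout and master_is_alone are hypothetical helpers for illustration only.

```python
def node_layout(n, k):
    """Group ranks 0..n-1 into nodes of at most k ranks each.

    Assumption: ranks fill nodes in rank order (rank 0 first).
    """
    return [list(range(start, min(start + k, n))) for start in range(0, n, k)]


def master_is_alone(n, k):
    """Return True if the master (the last rank, n-1) has a node to itself."""
    master = n - 1
    return node_layout(n, k)[-1] == [master]


# The example from the options above: -n 9 with ptile=2.
layout = node_layout(9, 2)
print(layout)                  # four worker nodes of two ranks, then the master
print(master_is_alone(9, 2))   # 9 mod 2 == 1
print(master_is_alone(8, 2))   # 8 mod 2 == 0, master shares a node
```

Under this placement assumption, -n 9 with ptile=2 leaves rank 8 (the master) alone on the fifth node, while -n 8 would put it on a node with a worker.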

...

A limitation on translation speed is the time it takes psana to read through all the data. I estimate this to be around 250MB/sec, putting translation at 5 times psana reading. I believe the file system supports 300-400MB/sec, but this depends heavily on the load, and it may not be a good benchmark for psana. A better benchmark is a simple program that parses through the xtc, touching all the payloads.
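As a rough point of comparison for the file system figure, raw sequential read throughput can be measured with a sketch like the one below. This is not the xtc-parsing benchmark described above: it only reads bytes and does not interpret payloads. The names read_throughput and mb_per_sec and the 8 MB chunk size are assumptions made for illustration.

```python
import os
import time


def read_throughput(path, chunk_size=8 * 1024 * 1024):
    """Read the file at `path` sequentially; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


def mb_per_sec(total_bytes, seconds):
    """Convert a byte count and a duration to MB/sec (1 MB = 10**6 bytes)."""
    return total_bytes / 1e6 / seconds if seconds > 0 else float("inf")


if __name__ == "__main__":
    import sys
    nbytes, secs = read_throughput(sys.argv[1])
    print("%d bytes (file size %d) in %.2f s: %.1f MB/sec"
          % (nbytes, os.path.getsize(sys.argv[1]), secs, mb_per_sec(nbytes, secs)))
```

Repeated runs on the same file may report inflated numbers because of operating system caching, so a file that has not been read recently gives a more representative result.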

Timing MPI Split Scan Translation (Old)

Below is a table of some testing results for the MPI split scan translator; however, it is out of date. It does not use fastindex, nor does it exclusively use nodes on the high priority queues. All of this was carried out on the offline file system. A typical command was

...