...

Running this script on a psffb node (twice, so that all the data is in cache and it behaves more like the DAQ).  Do we also need to support det.image in the DRP?

Code Block
(ana-4.0.48-py3) drp-srcf-eb004:~$ more junk.py
import time
from psana import *
ds = DataSource('exp=mfxx1005021:run=340')
det = Detector('epix10k2M')
for nevt,evt in enumerate(ds.events()):
    calib = det.calib(evt,cmpars=[0,0,0,0]) # disable common-mode with cmpars
    if nevt>=20: break
    if nevt==0: tstart=time.time() # start the timer after first event (which is slow)
print('time per evt:',(time.time()-tstart)/nevt,'nevt:',nevt)

(ana-4.0.48-py3) drp-srcf-eb004:~$ python ~/junk.py
time per evt: 0.019545722007751464 nevt: 20
(ana-4.0.48-py3) drp-srcf-eb004:~$ python ~/junk.py
time per evt: 0.019071054458618165 nevt: 20
(ana-4.0.48-py3) drp-srcf-eb004:~$ 

  

So 20ms/core, corresponding to ~50Hz.  That implies for 5kHz we need 100 cores, or ~2 nodes (at 60 cores/node).  For the record, with common mode (the default) the det.calib time increases to ~140ms, and det.image is ~170ms, so one core can do only about 5Hz and 5kHz would need 16 nodes * 60 cores/node.  Hopefully we don't need common-mode in the DRP.  The algorithms Mikhail runs are listed here:  Method det.calib algorithms#Detectordependentalgorithms

We may be able to reduce this time significantly with a "single pass" calibration (I believe Mikhail currently does two passes through the data with numpy: one for pedestals and one for gains).
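The core and node counts above are simple arithmetic; a minimal sketch of the estimate follows. The helper names `cores_needed` and `nodes_needed` are hypothetical, and the 60 cores/node figure is the one assumed in the text. For the 20ms det.calib case it reproduces the 100 cores / 2 nodes quoted above.

```python
def cores_needed(rate_hz, ms_per_event):
    """Cores required to keep up with rate_hz if one core needs ms_per_event per event."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-(rate_hz * ms_per_event) // 1000)


def nodes_needed(cores, cores_per_node=60):
    """Whole nodes required for the given core count (60 cores/node is an assumption)."""
    return -(-cores // cores_per_node)


if __name__ == "__main__":
    rate_hz = 5000  # 5kHz target rate
    # Per-event times taken from the measurements quoted in the text.
    for label, ms in [("det.calib, no common mode", 20),
                      ("det.calib, common mode", 140),
                      ("det.image", 170)]:
        cores = cores_needed(rate_hz, ms)
        print(label, "->", cores, "cores,", nodes_needed(cores), "nodes")
```

The estimate ignores any overhead beyond the calibration call itself, so it is a lower bound on the resources rather than a sizing recommendation.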

Comparing to a simple C++ calibration algorithm:

Code Block
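#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RAW_SIZE 2000000
#define NIMG 200
uint16_t raw[NIMG][RAW_SIZE];   // zero-initialized dummy data (timing test only)
float result[RAW_SIZE];
uint16_t peds[RAW_SIZE];
uint8_t mask[RAW_SIZE];         // nonzero = masked pixel
float gains[8][RAW_SIZE];       // 3 range bits can encode 8 values

int main() {
    memset(mask,0,RAW_SIZE);

    for (unsigned count=0; count<NIMG; count++) {
        for (unsigned i=0; i<RAW_SIZE; i++) {
            unsigned val = raw[count][i];
            // shift the range bits down so they can be used as an index
            unsigned range = (val&0x7000)>>12;
            // signed subtraction so values below pedestal do not wrap around
            result[i] = mask[i] ? 0 : ((int)(val&0xfff)-peds[i])*gains[range][i];
        }
    }
}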

With this we see about 7ms per image (1.432s for 200 images):

Code Block
(ana-4.0.48-py3) drp-srcf-eb004:lcls2$ g++ -o junk junk.cc
(ana-4.0.48-py3) drp-srcf-eb004:lcls2$ time ./junk

real	0m1.432s
user	0m1.423s
sys	0m0.006s
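For cross-checking a C++ result against Python, the per-pixel arithmetic of the C++ loop can be written as a vectorized numpy sketch. This is not psana's det.calib implementation. The function name `calibrate_sketch` is hypothetical, and the bit layout (12 data bits, 3 range bits shifted down to an index 0..7) is an assumption taken from the C++ test program, not from detector documentation.

```python
import numpy as np


def calibrate_sketch(raw, peds, gains, mask):
    """Pedestal-subtract and gain-correct one flattened image.

    raw   : uint16 array, shape (npix,)
    peds  : uint16 array, shape (npix,)
    gains : float32 array, shape (nranges, npix)
    mask  : uint8 array, shape (npix,), nonzero = masked (as in the C++ loop)
    """
    rng = (raw & 0x7000) >> 12                # assumed gain-range index per pixel
    adu = (raw & 0x0FFF).astype(np.float32)   # assumed 12-bit data field
    g = gains[rng, np.arange(raw.size)]       # pick each pixel's gain for its range
    out = (adu - peds) * g                    # float math, so negatives are preserved
    out[mask != 0] = 0.0
    return out


# Tiny made-up example (values are illustrative, not detector data).
raw = np.array([0x0005, 0x1005, 0x2FFF, 0x0001], dtype=np.uint16)
peds = np.array([1, 1, 0, 3], dtype=np.uint16)
gains = np.stack([np.full(4, r + 1, dtype=np.float32) for r in range(8)])
mask = np.array([0, 0, 1, 0], dtype=np.uint8)
out = calibrate_sketch(raw, peds, gains, mask)
print(out)
```

Note that this version still makes several passes over the data (one per numpy operation), so it illustrates the arithmetic rather than the single-pass speedup.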

...