...
We ran this script on a psffb node (twice, so that all the data is in cache and it behaves more like the DAQ). Do we also need to support det.image in the DRP?
```
(ana-4.0.48-py3) drp-srcf-eb004:~$ more junk.py
import time
from psana import *
ds = DataSource('exp=mfxx1005021:run=340')
det = Detector('epix10k2M')
for nevt,evt in enumerate(ds.events()):
    calib = det.calib(evt,cmpars=[0,0,0,0]) # disable common-mode with cmpars
    if nevt>=20: break
    if nevt==0: tstart=time.time() # start the timer after first event (which is slow)
print('time per evt:',(time.time()-tstart)/nevt,'nevt:',nevt)
(ana-4.0.48-py3) drp-srcf-eb004:lcls2$ python ~/junk.py
time per evt: 0.019545722007751464 nevt: 20
(ana-4.0.48-py3) drp-srcf-eb004:lcls2$ python ~/junk.py
time per evt: 0.019071054458618165 nevt: 20
(ana-4.0.48-py3) drp-srcf-eb004:~$
```
So 20ms/core, corresponding to ~50Hz per core. For the record, with common mode (the default) the det.calib time increases to ~140ms, and det.image is ~170ms, so one core can then only do about 5Hz. At 20ms/core, 5kHz implies we need 100 cores, or ~2 nodes at 60 cores/node; at ~5Hz per core it would be ~16 nodes * 60 cores/node. The algorithms Mikhail runs are listed here: Method det.calib algorithms#Detectordependentalgorithms. This code implies that we may be able to reduce this time significantly with a "single pass" calibration (I believe Mikhail currently does two passes through the data with numpy: one for pedestals and one for gains). Hopefully we don't need common mode in the DRP.
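The core and node counts above follow from simple arithmetic, sketched here so it can be rerun with other timings. The helper names `cores_needed` and `nodes_needed` are made up for illustration, and the sketch assumes perfect scaling across cores and 60 usable cores per node.

```python
import math

def cores_needed(event_rate_hz, ms_per_event):
    """Cores required to keep up with the event rate (assumes perfect scaling)."""
    return math.ceil(event_rate_hz * ms_per_event / 1000)

def nodes_needed(cores, cores_per_node=60):
    """Nodes required, assuming every core on a node is available."""
    return math.ceil(cores / cores_per_node)

RATE_HZ = 5000  # target event rate from the text

# Per-event times quoted in the text, in milliseconds
timings_ms = [
    ("det.calib, common mode disabled", 20),
    ("det.calib, common mode (default)", 140),
    ("det.image, common mode (default)", 170),
]

for label, ms in timings_ms:
    cores = cores_needed(RATE_HZ, ms)
    print(f"{label}: {cores} cores, {nodes_needed(cores)} nodes")
```

The 170ms case comes out somewhat below the ~16 nodes quoted above because the text rounds 170ms to about 5Hz per core.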
Comparing to a simple C++ calibration algorithm:
```cpp
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RAW_SIZE 2000000
#define NIMG 200

uint16_t raw[NIMG][RAW_SIZE];
float result[RAW_SIZE];
uint16_t peds[RAW_SIZE];
uint8_t mask[RAW_SIZE];
float gains[7][RAW_SIZE];

int main() {
    memset(mask, 0, RAW_SIZE);
    for (unsigned count = 0; count < NIMG; count++) {
        for (unsigned i = 0; i < RAW_SIZE; i++) {
            unsigned val = raw[count][i];
            // Shift the gain-range bits down so they index gains[] in bounds
            // (assumes the three range bits only take values 0..6).
            unsigned range = (val & 0x7000) >> 12;
            // Subtract as signed so values below the pedestal do not wrap around.
            int adc = (int)(val & 0xfff) - (int)peds[i];
            result[i] = mask[i] ? 0 : adc * gains[range][i];
        }
    }
}
```
With this we see about 7ms per image (1.43s for 200 images):
```
(ana-4.0.48-py3) drp-srcf-eb004:lcls2$ g++ -o junk junk.cc
(ana-4.0.48-py3) drp-srcf-eb004:lcls2$ time ./junk
real 0m1.432s
user 0m1.423s
sys 0m0.006s
```
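For comparison, the same per-pixel arithmetic as the C++ loop can be written in numpy. This is only a sketch and not the psana det.calib implementation: the function name `calibrate` is hypothetical, the bit masks and the 7-row gain array are copied from the C++ example, and the raw values are assumed to use gain ranges 0..6 only. numpy still makes several passes over the data here (mask, subtract, multiply), so this shows the arithmetic, not the single-pass speedup.

```python
import numpy as np

DATA_MASK = 0x0FFF    # low 12 bits: ADC value (as in the C++ example)
RANGE_MASK = 0x7000   # bits 12-14: gain range (as in the C++ example)
RANGE_SHIFT = 12

def calibrate(raw, peds, gains, mask):
    """Return (adc - pedestal) * gain[range], zeroed where mask is nonzero.

    raw:   uint16, shape (npix,)
    peds:  float32, shape (npix,)
    gains: float32, shape (nranges, npix)
    mask:  uint8, shape (npix,), nonzero marks a bad pixel
    """
    adc = (raw & DATA_MASK).astype(np.float32)
    grange = (raw & RANGE_MASK) >> RANGE_SHIFT
    pix = np.arange(raw.size)
    # Pick the gain for each pixel from the row selected by its range bits
    out = (adc - peds) * gains[grange, pix]
    out[mask != 0] = 0.0
    return out

# Tiny made-up example: 4 pixels, gain ranges 0, 1, 2 and one masked pixel
raw = np.array([0x0064, 0x1064, 0x2005, 0x0064], dtype=np.uint16)
peds = np.full(4, 10.0, dtype=np.float32)
gains = np.ones((7, 4), dtype=np.float32)
gains[1] = 2.0
gains[2] = 0.5
mask = np.array([0, 0, 0, 1], dtype=np.uint8)

out = calibrate(raw, peds, gains, mask)
print(out)
```

Converting to float before subtracting the pedestal avoids the unsigned wrap-around that a raw integer subtraction would give for values below the pedestal.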
...