Installation
Run server
Shell
The shell is an interactive command-line interface to the MongoDB server.
Connection to DB in python
Connection time is 50-150ms depending on host and time.
Tentative model of the calibration store
Experiment-centric calibration data base
All meta-data information is accessible through a single-level document.
Detector-centric calibration data base
# References or DBRefs for detectors
dbdet = client['calib-cspad']
col1 = dbdet['cspad-0-cxids1-0']
col2 = dbdet['cspad-0-cxids2-0']
col3 = dbdet['cspad-0-cxidsd-0']
col4 = dbdet['cspad-0-xcsendstation-0']
col5 = dbdet['cspad-0-xppgon-0']
col6 = dbdet['cspad-0-sxrbeamline-1']
col7 = dbdet['cspad-0-mectargetchamber-0']

# Document content for dbdet
doc = {
    "_id": ObjectId("..."),
    "ref_id": ObjectId("534009e4d852427820000002"),
    # etc...
}
Essentially, a document in the detector collection holds a reference to the data in the experiment collections.
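The referencing scheme can be sketched as follows; the ObjectId values are shown as plain strings for brevity, and the experiment collection is any pymongo Collection.

```python
# Detector-centric document: ref_id points at the experiment-centric document.
det_doc = {"_id": "...", "ref_id": "534009e4d852427820000002"}

def resolve(det_doc, exp_col):
    """Fetch the experiment document referenced by a detector document."""
    return exp_col.find_one({"_id": det_doc["ref_id"]})
```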
Data flow for documents less than 16 MB
Preparation of data
- Preparation of CSPAD data in text/unicode format for insertion takes ~1 s.
- Only limited-precision data can be saved, due to the 16 MB limit on document size.
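The preparation step can be sketched as below, assuming numpy data; the fixed format string is what limits the precision so the resulting text document stays under 16 MB.

```python
import io
import numpy as np

def nda_to_text(nda, fmt='%.6e'):
    """Serialize an array to a text block with limited precision."""
    buf = io.StringIO()
    np.savetxt(buf, nda.reshape(-1, nda.shape[-1]), fmt=fmt)
    return buf.getvalue()
```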
Inserting data
Insertion time is 110-180ms.
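A sketch of the insert step; time_stamp is the field queried in the Get section below, while the other fields are illustrative assumptions.

```python
# Prepared calibration document (small enough to fit in one document).
doc = {
    "time_stamp": "2018-01-25T09:33:10PST",
    "detector": "cspad-0",
    "data": "...",   # text/unicode representation of the array
}
# result = col.insert_one(doc)   # col is a pymongo Collection; ~110-180 ms
# print(result.inserted_id)
```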
Find data
Time to find data is 50-60 us.
Unpack data
Time to unpack is 350ms.
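The unpack step for the text flow can be sketched as follows, assuming the data were serialized with numpy.savetxt as above; this is the ~350 ms step for a CSPAD-sized array.

```python
import io
import numpy as np

def text_to_nda(s, shape):
    """Parse the stored text back into an array and restore its shape."""
    return np.loadtxt(io.StringIO(s)).reshape(shape)
```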
Data flow for large documents
Timing tests are done with mongod running on psanaphi105 and the scripts running on psanagpu106.
Initialization
Time to connect is 116-150 ms.
Put
Time to save data 330-420ms.
Preparing the metadata document with a reference to the data takes 43-53 us.
Inserting the metadata takes 0.5-0.6 ms.
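The small metadata document inserted alongside the GridFS data can be sketched as below; time_stamp and data_id are the fields used by the Get step, and the detector field is a hypothetical extra.

```python
def make_metadata(data_id, detector='cspad-0'):
    """Build the small (<16 MB) metadata document referencing GridFS data."""
    return {
        "time_stamp": "2018-01-25T09:33:10PST",  # example value from the Get step
        "data_id": data_id,                      # ObjectId returned by fs.put(...)
        "detector": detector,
    }

# col.insert_one(make_metadata(data_id))   # ~0.5-0.6 ms
```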
Get
docs = col.find({"time_stamp" : "2018-01-25T09:33:10PST"})
doc = docs[0]
Metadata find and get time: 0.7ms
s = fs.get(doc['data_id']).read()
nda = gu.np.fromstring(s)
Data extraction time: 96 ms. The returned array is "flattened" and needs to be reshaped.
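Reshaping the flat bytes can be sketched as follows; numpy.frombuffer is the non-deprecated equivalent of the fromstring call above, and the CSPAD shape (32, 185, 388) is an assumption for illustration.

```python
import numpy as np

def bytes_to_nda(s, shape=(32, 185, 388), dtype=np.float64):
    """Rebuild the detector array from the flat byte string read from GridFS."""
    return np.frombuffer(s, dtype=dtype).reshape(shape)
```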
Summary
- MongoDB structure has limitations in number of levels and document size.
- server may have many DBs
- DB is a container for collections
- collection is a group of documents
- a document is a JSON/BSON object of key:value pairs (a dictionary). Each value may itself be a dictionary, etc., but deeper structural levels are not supported by the DB structure.
- document size has a hard limit of 16 MB (in 2010 it was increased from 4 to 16 MB, and the developers do not want to change it). CSPAD: 2 Mpix * 8 bytes (double) = 16 MB, and we may expect larger detectors like Jungfrau, Epix, Andor, etc.
- Larger data should be saved using GridFS, which splits the data into chunks and saves the chunks in the same DB in separate collections.
- A JSON (text) object in MongoDB is represented in unicode (UTF-8). Data have to be converted to unicode and back when saving and retrieving.
- A schema-less DB looks interesting to a certain extent, but in order to find anything in the DB there still has to be a schema...
- GridFS works fine with data sizes >16 GB.
References
- https://docs.mongodb.com/manual/tutorial/install-mongodb-on-linux/
- http://api.mongodb.com/python/current/tutorial.html