Data extraction time: 96 ms. The returned array is "flattened" and needs to be reshaped.
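
As an illustration of the reshaping step, the sketch below restores a three-dimensional shape from a flat buffer of doubles using only the Python standard library. The shape (2, 3, 4) and the names `flat` and `shaped` are made up for the example; the real shape depends on the detector and has to be known from elsewhere. With NumPy the same step would normally be a `reshape` call.

```python
import array

# Hypothetical stand-in for a flattened array of doubles returned by the DB.
shape = (2, 3, 4)
flat = array.array('d', range(2 * 3 * 4))

# memoryview.cast goes from 1-D to N-D only via a byte view, hence two casts.
shaped = memoryview(flat).cast('B').cast('d', shape=shape)

print(shaped.shape)      # (2, 3, 4)
print(shaped[1, 2, 3])   # 23.0, the last element
```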

Interface from Murali

2018-08-03 e-mail from Murali:

I have installed Mongo 4.0 on psdb-dev. I was hoping to use their REST service but this seems to have been deprecated and eliminated since 3.6.

So, I knocked a quick web service and have proxied it from pswww. This web service (https://github.com/slaclab/psdm_mongo_ws) is a suggestion only; please let me know if you need something different.

These are examples of getting data over HTTPS from a batch node from within cori; needless to say, the URL prefix is https://pswww.slac.stanford.edu/calib_ws

Two users:

mongo --host=psdb-dev --port 9306 -u "dubrovin" -p "...." --authenticationDatabase "admin"
mongo --host=psdb-dev --port 9306 -u "calibuser" -p "...." --authenticationDatabase "admin"

Test commands:

curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db/test_coll/5b649a9df59ae00bda110168"
curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db/test_coll"
curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db/test_coll?item=planner&size.uom=cm"
curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db/test_coll?query_string=%7B%20%22item%22%3A%20%22planner%22%2C%20%22qty%22%3A%2075%20%7D%0A"
curl -s "https://pswww.slac.stanford.edu/calib_ws/"  # get the names of the databases
curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db"  # get the list of collections in the database
curl -s "https://pswww.slac.stanford.edu/calib_ws/cdb_cxic0415/cspad_detnum1234?ctype=pedestals&data_size=2296960&run=74"  # find and return the document for the query
curl -s "https://pswww.slac.stanford.edu/calib_ws/cdb_cxic0415/cspad_detnum1234/gridfs/5b6893e81ead141643fe4344"  # get the document with constants from GridFS using the document id
curl -s "https://pswww.slac.stanford.edu/calib_ws/cdb_cxic0415/cspad_detnum1234/gridfs/5b6893e81ead141643fe4344"  # DEPRECATED: access to GridFS raw data through the doc _id
curl -s "https://pswww.slac.stanford.edu/calib_ws/cdb_cxic0415/gridfs/5b6893d91ead141643fe3f6a"  # access to GridFS raw data through the data _id
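
The test URLs above follow two patterns: plain `key=value` parameters, or a single `query_string` parameter holding a percent-encoded JSON query. A minimal sketch of building both with the Python standard library is given below. The helper names `url_from_params` and `url_from_query` are hypothetical and are not part of the web service or of MDBWebUtils.py; the sketch only builds the URLs and does not contact the server.

```python
import json
from urllib.parse import quote, unquote, urlencode

PREFIX = "https://pswww.slac.stanford.edu/calib_ws"  # URL prefix from the e-mail

def url_from_params(db, coll, **params):
    """Hypothetical helper: query given as plain key=value parameters."""
    return f"{PREFIX}/{db}/{coll}?{urlencode(params)}"

def url_from_query(db, coll, query):
    """Hypothetical helper: query given as a percent-encoded JSON string."""
    return f"{PREFIX}/{db}/{coll}?query_string={quote(json.dumps(query))}"

url1 = url_from_params("cdb_cxic0415", "cspad_detnum1234",
                       ctype="pedestals", data_size=2296960, run=74)
url2 = url_from_query("test_db", "test_coll", {"item": "planner", "qty": 75})
print(url1)
print(url2)

# Decoding the query_string of the test command above recovers the JSON query.
encoded = "%7B%20%22item%22%3A%20%22planner%22%2C%20%22qty%22%3A%2075%20%7D%0A"
print(json.loads(unquote(encoded)))
```

The resulting strings can be passed to `curl -s` as in the test commands, or to an HTTP client such as `requests.get`.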

Implementation

Source code: https://github.com/slac-lcls/lcls2/tree/master/psana/psana/pscalib/calib (see all MDB*.py modules)
Command Line Interface (CLI), command cdb: https://github.com/slac-lcls/lcls2/blob/master/psana/psana/pscalib/app/cdb.py
Graphical User Interface (GUI), command calibman: https://github.com/slac-lcls/lcls2/blob/master/psana/psana/graphqt/app/calibman.py
Web-service access interface
- Python: https://github.com/slac-lcls/lcls2/blob/master/psana/psana/pscalib/calib/MDBWebUtils.py
- C++: https://github.com/slac-lcls/lcls2/blob/master/psalg/psalg/calib/MDBWebUtils.hh

Write web access

2019-07-27 message from Murali, web service to write in DB:

Here's version 1; any feedback is appreciated.
Regards,
Murali

#!/usr/bin/env python

"""
Sample for posting to the calibration service using a web service and kerberos authentication.
Make sure we have a kerberos ticket.
"""

import requests
import json
from krtc import KerberosTicket
from urllib.parse import urlparse

ws_url = "https://pswww.slac.stanford.edu/ws-kerb/calib_ws/"
krbheaders = KerberosTicket("HTTP@" + urlparse(ws_url).hostname).getAuthHeaders()

# Create a new document in the collection test_coll in the database test_db.
resp = requests.post(ws_url+"test_db/test_coll/", headers=krbheaders, json={"calib_count": 1})
print(resp.text)
new_id = resp.json()["_id"]

# Update an existing document
resp = requests.put(ws_url+"test_db/test_coll/"+new_id, headers=krbheaders, json={"calib_count": 2})
print(resp.text)

# Delete an existing document
resp = requests.delete(ws_url+"test_db/test_coll/"+new_id, headers=krbheaders)
print(resp.text)

# Create a new GridFS document; we upload an image called small_img.png.
# The with-block closes the file once the upload has finished.
with open('small_img.png', 'rb') as img:
    files = [("files", ('small_img.png', img, 'image/png'))]
    resp = requests.post(ws_url+"test_db/gridfs/", headers=krbheaders, files=files)
print(resp.text)
new_id = resp.json()["_id"]

# Delete the GridFS document
resp = requests.delete(ws_url+"test_db/gridfs/"+new_id, headers=krbheaders)
print(resp.text)

Summary

MongoDB structure has limitations on the number of levels and on document size.
- a server may hold many DBs
- a DB is a container for collections
- a collection is a group of documents
- a document is a JSON/BSON object of key:value pairs (a dictionary). Each value may itself be a dictionary, and so on, but the DB structure itself does not support further levels.
  - document size has a hard-wired limit of 16MB (increased from 4MB to 16MB in 2010, and the developers do not want to change it). CSPAD 2Mpix*8byte(double) = 16MB, but we may expect larger detectors like Jungfrau, Epix, Andor, etc.
  - Larger data should be saved using GridFS, which splits the data into chunks and saves the chunks in the same DB in different collections.
  - A JSON (text) object in MongoDB is stored as Unicode (UTF-8). Data must be explicitly converted to Unicode when saving and converted back when retrieving.
A schema-less DB looks interesting to a certain extent, but in order to find something in the DB there should be a schema...
GridFS works fine with document size>16MB.
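
The size limit can be checked with simple arithmetic. The sketch below takes the CSPAD `data_size=2296960` from the test commands with 8-byte doubles, a document limit of 16*1024*1024 bytes, and a GridFS chunk size of 255 KiB, which is assumed here to be the GridFS default. The function names are hypothetical.

```python
import math

MAX_DOC_BYTES = 16 * 1024 * 1024   # MongoDB document size limit
CHUNK_BYTES = 255 * 1024           # assumed GridFS default chunk size

def needs_gridfs(nbytes):
    """True if the raw data alone already exceeds the document limit."""
    return nbytes > MAX_DOC_BYTES

def n_chunks(nbytes, chunk_bytes=CHUNK_BYTES):
    """Number of chunks needed to hold nbytes of data."""
    return math.ceil(nbytes / chunk_bytes)

cspad_bytes = 2296960 * 8          # pixels * sizeof(double)
print(cspad_bytes, needs_gridfs(cspad_bytes), n_chunks(cspad_bytes))
# 18375680 True 71
```

With these numbers a full CSPAD array of doubles is already slightly over the document limit, so it has to go through GridFS.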
