Code Block

mongo --host psanaphi105 --port 27017

To exit the shell, type quit() or use the <Ctrl-C> shortcut.

> db
test
> show dbs
admin            0.000GB
calib-cxif5315   0.006GB
config           0.000GB
local            0.000GB
> use calib-cxif5315
switched to db calib-cxif5315
> show collections
cspad-0-cxids1-0
cspad-1
> db["cspad-0-cxids1-0"].find()
> db["cspad-0-cxids1-0"].find().pretty()

# Delete database:
> use calib-cxif5315
> db.dropDatabase()

# Delete collection:
> db.collection.drop()
# or:
> db["cspad-0-cxids1-0"].drop()

> help

# Export/backup database to a file (use --archive=<filename> or --out <dir>, not both):
mongodump -d <dbname> --archive=<filename>
mongodump -d <dbname> --out /path/to/backup/dir

# Import database from a file:
mongorestore -d <dbname> --archive=<filename>
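The same inspection can be done from Python with pymongo. A minimal sketch, assuming the psanaphi105 server from the session above is reachable and pymongo is installed; the import is deferred into the function so the pure command-building helper below works without a server:

```python
def list_calib_collections(host='psanaphi105', port=27017, dbname='calib-cxif5315'):
    """pymongo equivalent of `show collections` in the shell (needs a reachable server)."""
    from pymongo import MongoClient  # deferred: requires pymongo installed
    client = MongoClient(host, port, serverSelectionTimeoutMS=2000)
    return sorted(client[dbname].list_collection_names())

def mongodump_cmd(dbname, archive):
    """Build the archive-form backup command line shown above."""
    return 'mongodump -d %s --archive=%s' % (dbname, archive)
```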
Connection to DB in python
Code Block

# Model #1: DB per detector type, collection per detector:
# --------------------------------------------------------
dbdet = client['db-cspad']

# Collections:
col1 = dbdet['cspad-0-cxids1-0']
col2 = dbdet['cspad-0-cxids2-0']
col3 = dbdet['cspad-0-cxidsd-0']
col4 = dbdet['cspad-0-xcsendstation-0']
col5 = dbdet['cspad-0-xppgon-0']
col6 = dbdet['cspad-0-sxrbeamline-1']
col7 = dbdet['cspad-0-mectargetchamber-0']

# Document content for dbdet is the same as for dbexp plus a reference to the data:
doc = {...,
       "_id": ObjectId("..."),
       "refid_data": ObjectId("534009e4d852427820000002"),
       etc...
      }

# Model #2: DB per detector, one collection per detector:
# -------------------------------------------------------
dbdet = client['db-cspad-0-cxids1-0']
col = dbdet['cspad-0-cxids1-0']

# Additional collections in case of DB copy:
'fs.files'
'fs.chunks'
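Under Model #1 the collection name encodes detector type, number, and source. A small sketch of the naming convention and of a query document; the helper names and query keys are illustrative assumptions, not the production schema:

```python
def collection_name(dettype, detnum, source, seg):
    """Compose a Model #1 collection name like 'cspad-0-cxids1-0'."""
    return '%s-%d-%s-%d' % (dettype, detnum, source, seg)

def calib_query(ctype, run):
    """Illustrative query dict: constants of a given type valid up to a run number."""
    return {'ctype': ctype, 'run': {'$lte': run}}

# Usage with pymongo (assumes a reachable server):
# from pymongo import MongoClient
# col = MongoClient('psanaphi105', 27017)['db-cspad'][collection_name('cspad', 0, 'cxids1', 0)]
# doc = col.find(calib_query('pedestals', 74)).sort('run', -1)[0]
```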
Data extraction time: 96 ms. The returned array is "flattened" and needs to be reshaped.
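Reshaping the flattened array can look like the sketch below; the shape and dtype here are illustrative, in practice they come from the document metadata:

```python
import numpy as np

def restore_array(flat, shape, dtype=np.float64):
    """Re-shape a flat 1-D sequence returned from the DB into its detector shape."""
    return np.asarray(flat, dtype=dtype).reshape(shape)

flat = list(range(12))     # stand-in for data retrieved from the DB
arr = restore_array(flat, (3, 4))
```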
Summary
Interface from Murali
2018-08-03 e-mail from Murali:
I have installed Mongo 4.0 on psdb-dev. I was hoping to use their REST service but this seems to have been deprecated and eliminated since 3.6.
So, I knocked a quick web service and have proxied it from pswww. This web service (https://github.com/slaclab/psdm_mongo_ws) is a suggestion only; please let me know if you need something different.
These are examples of getting data over HTTPS from a batch node from within cori; needless to say, the URL prefix is https://pswww.slac.stanford.edu/calib_ws
Two users:
- mongo --host=psdb-dev --port 9306 -u "dubrovin" -p "...." --authenticationDatabase "admin"
- mongo --host=psdb-dev --port 9306 -u "calibuser" -p "...." --authenticationDatabase "admin"
Test commands:
- curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db/test_coll/5b649a9df59ae00bda110168"
- curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db/test_coll"
- curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db/test_coll?item=planner&size.uom=cm"
- curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db/test_coll?query_string=%7B%20%22item%22%3A%20%22planner%22%2C%20%22qty%22%3A%2075%20%7D%0A"
- curl -s "https://pswww.slac.stanford.edu/calib_ws/" - get the list of databases
- curl -s "https://pswww.slac.stanford.edu/calib_ws/test_db" - get the list of collections in the database
- curl -s "https://pswww.slac.stanford.edu/calib_ws/cdb_cxic0415/cspad_detnum1234?ctype=pedestals&data_size=2296960&run=74" - find and return document for query
- curl -s "https://pswww.slac.stanford.edu/calib_ws/cdb_cxic0415/cspad_detnum1234/gridfs/5b6893e81ead141643fe4344" - get document with constants from GridFS using the document _id (DEPRECATED in favor of access through the data _id below)
- curl -s "https://pswww.slac.stanford.edu/calib_ws/cdb_cxic0415/gridfs/5b6893d91ead141643fe3f6a" - access to GridFS raw data through data _id
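The curl queries above translate directly to Python. A sketch that builds the same URLs; the server and parameters are taken from the examples, and the actual fetch is left commented out since it requires the requests package and a reachable host:

```python
from urllib.parse import urlencode

WS = 'https://pswww.slac.stanford.edu/calib_ws'

def find_doc_url(db, coll, **params):
    """URL of the 'find and return document' query, as in the curl examples."""
    url = '%s/%s/%s' % (WS, db, coll)
    return url + '?' + urlencode(params) if params else url

def gridfs_url(db, data_id):
    """URL of the raw GridFS data for a given data _id."""
    return '%s/%s/gridfs/%s' % (WS, db, data_id)

# To actually fetch (assumes requests is installed):
# import requests
# doc = requests.get(find_doc_url('cdb_cxic0415', 'cspad_detnum1234',
#                                 ctype='pedestals', run=74)).json()
```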
Implementation
- Source code: https://github.com/slac-lcls/lcls2/tree/master/psana/psana/pscalib/calib - see all MDB*.py modules
- Command Line Interface (CLI), command cdb: https://github.com/slac-lcls/lcls2/blob/master/psana/psana/pscalib/app/cdb.py
- Graphical User Interface (GUI), command calibman: https://github.com/slac-lcls/lcls2/blob/master/psana/psana/graphqt/app/calibman.py
- Web-service access interface
Write web access
Code Block
2019-07-27
Here's version 1; any feedback is appreciated.
Regards,
Murali
#!/usr/bin/env python
"""
Sample for posting to the calibration service using a web service and kerberos authentication.
Make sure we have a kerberos ticket.
"""
import requests
import json
from krtc import KerberosTicket
from urllib.parse import urlparse
ws_url = "https://pswww.slac.stanford.edu/ws-kerb/calib_ws/"
krbheaders = KerberosTicket("HTTP@" + urlparse(ws_url).hostname).getAuthHeaders()
# Create a new document in the collection test_coll in the database test_db.
resp = requests.post(ws_url+"test_db/test_coll/", headers=krbheaders, json={"calib_count": 1})
print(resp.text)
new_id = resp.json()["_id"]
# Update an existing document
resp = requests.put(ws_url+"test_db/test_coll/"+new_id, headers=krbheaders, json={"calib_count": 2})
print(resp.text)
# Delete an existing document
resp = requests.delete(ws_url+"test_db/test_coll/"+new_id, headers=krbheaders)
print(resp.text)
# Create a new GridFS document by uploading an image called small_img.png
files = [("files", ('small_img.png', open('small_img.png', 'rb'), 'image/png'))]
resp = requests.post(ws_url+"test_db/gridfs/", headers=krbheaders, files=files)
print(resp.text)
new_id = resp.json()["_id"]
# Delete the GridFS document
resp = requests.delete(ws_url+"test_db/gridfs/"+new_id, headers=krbheaders)
print(resp.text)
Summary
- MongoDB has limitations in the number of structure levels and in document size.
- server may have many DBs
- DB is a container for collections
- collection is a group of documents
- document is a JSON/BSON object of key:value pairs (a dictionary). Each value may itself be a dictionary, etc., but the number of nesting levels is limited.
- document size has a hardwired limit of 16 MB (increased from 4 to 16 MB in 2010, and the developers do not want to change it). A CSPAD array is 2 Mpix * 8 bytes (double) = 16 MB, and larger detectors like Jungfrau, Epix, Andor, etc. can be expected.
- Larger data should be saved using GridFS, which splits the data into chunks and saves the chunks in the same DB in separate collections.
- a JSON (text) object in MongoDB is represented in unicode (UTF-8). Data must be converted to unicode and back when saving and retrieving.
- a schema-less DB looks attractive to a certain extent, but in order to find anything in the DB there still has to be a schema...
- GridFS works fine for data larger than the 16 MB document limit.
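The chunking that GridFS performs can be sketched in plain Python. The split/join below only illustrates the idea; 255 KiB is the GridFS default chunk size, and pymongo's gridfs module does the real work, as hinted in the commented usage:

```python
CHUNK = 255 * 1024  # GridFS default chunk size

def split_chunks(data, chunk=CHUNK):
    """Split a bytes payload the way GridFS stores it in fs.chunks."""
    return [data[i:i + chunk] for i in range(0, len(data), chunk)]

payload = bytes(600 * 1024)          # 600 KiB of zeros, spans three chunks
chunks = split_chunks(payload)
assert b''.join(chunks) == payload   # chunks reassemble to the original data

# With pymongo (assumes a reachable server):
# import gridfs, pymongo
# fs = gridfs.GridFS(pymongo.MongoClient('psanaphi105')['calib-cxif5315'])
# _id = fs.put(payload)              # stores into fs.files / fs.chunks
# data = fs.get(_id).read()
```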
References
- https://docs.mongodb.com/manual/tutorial/install-mongodb-on-linux/
- http://api.mongodb.com/python/current/tutorial.html
- tutorial
- recover-data-following-unexpected-shutdown
- authorization