Life of a Partition
- collection manager (CM) alive (long lived)
- processes defined in the .cnf are (re)started
- python proxies come up
- proxies report their existence/identity (Detector/DRP, EB, monshmserverAMI, monshmserverPsana, DTI, XPM, Control) to the CM
- control level queries CM for list of all processes in the platform
- control level notifies CM of processes it wants for partition
- CM tells all processes about all the nodes in the partition
- processes figure out how many "ports" (an abstract idea, e.g. a queue pair for IB) they need given the nodes in the partition, and report their ports back to the CM
- CM treats ports as opaque information
- CM also manages the connection information for the detector-to-dti-link-to-drp-node mapping table, which the DTI proxy uses to compute a dti-link-mask
- after gathering all port information, CM broadcasts all ports, as well as CM-assigned "id" to proxies
- proxies are specialized for particular levels (DRP, EB, DTI) and select ports they are interested in
- proxies make appropriate connections based on their CM-assigned ID number
- proxies report that they are connected, or failure (e.g. if DTI-link-mask is already allocated, or IB connection fails)
- after all proxies report in, the CM notifies the control level that transitions can be sent
- control level starts to send transitions
- graceful teardown: after the unconfigure/unmap transitions complete, the CM sends a disconnect message to the proxies
- ungraceful teardown: feels like we need to restart, since things can be broken in various ways
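The steps above can be sketched as a small simulation of the CM's port-gathering and broadcast phase. All class and method names here are illustrative assumptions, not the real implementation; the point is that each proxy decides its own ports and the CM treats them as opaque.

```python
# Hypothetical sketch of the CM gather/broadcast phase (names are illustrative).

class Proxy:
    """A collection proxy (DRP, EB, DTI, ...) that reports opaque port info."""
    def __init__(self, level, host):
        self.level = level
        self.host = host

    def ports_for(self, nodes):
        # Each proxy decides how many "ports" it needs given the nodes in
        # the partition (e.g. one IB queue pair per peer) and reports them.
        return [f"{self.host}:{5000 + i}" for i, _ in enumerate(nodes)]

class CollectionManager:
    """CM gathers ports from all proxies, then broadcasts the full table."""
    def __init__(self):
        self.proxies = {}     # CM-assigned id -> proxy
        self.port_table = {}  # CM-assigned id -> opaque port list

    def allocate(self, proxies):
        # CM tells all processes about all the nodes in the partition
        # and assigns each proxy an id.
        nodes = [p.host for p in proxies]
        for ident, p in enumerate(proxies):
            self.proxies[ident] = p
            # CM treats the reported ports as opaque information
            self.port_table[ident] = p.ports_for(nodes)
        return self.port_table  # broadcast to every proxy

cm = CollectionManager()
table = cm.allocate([Proxy("drp", "drp-node-0"), Proxy("eb", "eb-node-0")])
```

After the broadcast, each proxy would select the entries it is interested in and connect using its CM-assigned id.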
Partition vs. Platform
- one .cnf file corresponds to one platform
- a partition is a subset of the detectors in a platform
- there cannot be two partitions per platform: instead create a second platform with an additional .cnf
Contents of .cnf file
- collection related
- EB use IB or IP (goes to Ric)
- ip of CM (determines network)
- (maybe) map of detector-to-dti-link-to-drp-node-pgp-lane
- monshmserver group IDs (e.g. psana/AMI)
- non-collection
- static detector discovery? (dynamic feels difficult)
- outfile paths
- configdb info
- shmem names
- drp algs (e.g. ROI)
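A .cnf file is Python, so the contents above could look roughly like the following. This is a hypothetical sketch: every field name, host name, and path here is an illustrative assumption, not the real schema.

```python
# Hypothetical .cnf sketch (field names and values are illustrative).
# One .cnf corresponds to one platform.

platform = '0'
cm_host = 'daq-cm-01'            # ip of CM (determines network)

collection = {
    'eb_transport': 'ib',        # EB use IB or IP
    'cm_addr': (cm_host, 10000 + int(platform)),
    'monshm_groups': ['psana', 'ami'],   # monshmserver group IDs
    # (maybe) map of detector-to-dti-link-to-drp-node-pgp-lane
    'dti_map': {'jungfrau0': ('dti-link-3', 'drp-node-0', 'lane-1')},
}

non_collection = {
    'detectors': ['jungfrau0', 'hsd0'],  # static detector discovery
    'outfile_path': '/ffb/out',
    'configdb': 'https://example/configdb',
    'shmem_name': 'shm_psana',
    'drp_algs': {'jungfrau0': 'ROI'},
}
```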
Design Decisions
- one CM per .cnf (long-lived process)
- does collection control the BOS? We believe not, since it doesn't change frequently
- does monshm participate in collection? yes
- how flexible are the DRP executables? Ideally multiple dlopen's so they can be long-lived, but that feels unlikely given constraints
- more seamless restarts? on-the-fly restarts feel very difficult and would complicate the code significantly; better to keep the code simple. AMI can be restarted independently.
- use zmq select to timeout
- use python proxies for C++. Communicate python-created datagrams to C++ via zmq.
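The "use zmq select to timeout" decision corresponds to the `zmq.Poller` idiom in pyzmq, which lets a mainloop wake up periodically instead of blocking forever on a recv. A minimal sketch, using an inproc PAIR pair purely for illustration:

```python
# Sketch of polling with a timeout via zmq.Poller (pyzmq).
import zmq

ctx = zmq.Context()
server = ctx.socket(zmq.PAIR)
server.bind("inproc://demo")        # inproc: bind must precede connect
client = ctx.socket(zmq.PAIR)
client.connect("inproc://demo")

poller = zmq.Poller()
poller.register(server, zmq.POLLIN)

# Nothing sent yet: poll returns no events after the 100 ms timeout,
# giving the mainloop a chance to do housekeeping.
events = dict(poller.poll(timeout=100))
idle = server not in events

client.send_string("PLAT")
events = dict(poller.poll(timeout=100))
msg = server.recv_string() if server in events else None
```

The same pattern would let a proxy's mainloop notice a lost CM or a missing CONNECT_COMPLETE rather than hanging indefinitely.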
Issues
- Will procmgr scale? Can we reduce restart time?
- Does map of detector-to-dti-link-to-drp-node-pgp-lanes live in cnf or database?
- How do we manage the BOS?
Working Meeting (5/18/18)
First-pass Collection Demo from Chris Ford
https://confluence.slac.stanford.edu/display/~caf/June+19+Demo
Automation of DRP/EB Connection Meeting Notes (6/21/18)
- cnf starts drp/eb processes (everyone gets the CM's well-known ipaddr/port via the cnf; the port corresponds to the platform number)
- (needs work) drp/eb nodes find their own IB addr's
- user asks for partition, sends out PLAT msg (says "who exists?")
- all drp/eb nodes reply with their name/hostname
- CM sends ALLOC message to the nodes selected for the partition, e.g. 4 drp, 2 eb (says "who is participating in partition")
- eb/drp nodes open up appropriate ephemeral ports (possibly depending on who else is in the partition) and report them to the CM in response to ALLOC
- CM sends CONNECT messages to eb/drp nodes with assigned pseudo-random ID numbers and connection information. The ID numbers could also be communicated with the ALLOC message.
- eb/drp nodes connect
- (needs work) need to add CONNECT_COMPLETE msg to send to CM
- eb/drp/control.py do their usual mainloop
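The message sequence above can be summarized as a small simulation, with the still-missing CONNECT_COMPLETE ack included. Message names come from the notes; the function and data shapes are illustrative assumptions.

```python
# Hypothetical sketch of the PLAT/ALLOC/CONNECT sequence (data shapes illustrative).

def run_partition(nodes, selected):
    log = []
    # PLAT: "who exists?" -- every drp/eb node replies with name/hostname
    log.append(("PLAT", sorted(nodes)))
    # ALLOC: "who is participating?" -- selected nodes open ephemeral ports
    # and report them back to the CM
    ports = {n: f"{n}:ephemeral" for n in selected}
    log.append(("ALLOC", sorted(ports)))
    # CONNECT: CM distributes assigned ID numbers and connection info
    ids = {n: i for i, n in enumerate(sorted(selected))}
    log.append(("CONNECT", ids))
    # CONNECT_COMPLETE: each node acks back to the CM (the missing message)
    log.append(("CONNECT_COMPLETE", sorted(selected)))
    return log

log = run_partition(["drp0", "drp1", "eb0", "teb0"], ["drp0", "eb0"])
```

Once CONNECT_COMPLETE has arrived from every node, eb/drp/control.py can enter their usual mainloops.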
DTI Collection Information
Collection behavior of the DTI (to be implemented in the as-yet-unnamed "blob" process).
On PLAT:
- epics prefixes
- available upstream/downstream ports
On CONNECT:
- alloc ports and connect upstream/downstream
Some details: the EPICS prefixes are:
- "where" (DAQ:LAB2)
- XPM:crate#
- DTI:crate:slot
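The three prefix parts might be assembled as below. The colon-joined formats are an assumption inferred from the listed parts, not confirmed PV names.

```python
# Sketch of assembling the EPICS prefixes above (formats are assumptions).

def xpm_prefix(where, crate):
    # "where" + XPM:crate#
    return f"{where}:XPM:{crate}"

def dti_prefix(where, crate, slot):
    # "where" + DTI:crate:slot
    return f"{where}:DTI:{crate}:{slot}"

print(xpm_prefix("DAQ:LAB2", 2))     # DAQ:LAB2:XPM:2
print(dti_prefix("DAQ:LAB2", 2, 5))  # DAQ:LAB2:DTI:2:5
```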
DTI port info (UDET devices, e.g. Jungfrau):
- crate/slot/port/device (device auto-detected with PGP)
XPM port info (TDET devices, e.g. hsd, tool, tes):
- crate/port/device
Issues:
- need upstream/downstream ids for detectors/KCUs
- downstream ids are 32-bit and should communicate padder, firmware type, card#
- multiple detector auto detect in 1 TDET-style KCU doesn't work trivially
- KCU firmware is detector specific
- TTOOL/TES don't currently have device detection