Time & date
12:00 noon 7/9/2018 at SLAC
Attendees:
Bebo White, Umar Kalim, Les Cottrell
Discussion
The idea is to reduce/eliminate the dependence on SLAC.
We identified two PingER data sets that could be decentralized:
- The PingER Oracle meta database of host coordinates (NODEDETAILS)
- The actual raw measurement data. It is typically cached on each PingER Measurement Agent (MA) under /usr/local/share/pinger/data. SLAC gathers this data daily (using ping_data.pl) from each active MA and archives it as /nfs/slac/g/net/pinger/pinger2/data/ping-<YYYY>-<MM>.txt. The data flow is described in PingER data flow at SLAC. The data is already publicly available via anonymous FTP.
1). NODEDETAILS
Enabling MAs to update NODEDETAILS independently would allow richer sharing of both Beacons (kept in <Beacons>) and target hosts (kept in <HostList>). Currently, only Beacons are shared. This sharing could be a big advantage to MAs such as SLAC, GZHU and UBRU, which have large local <HostList>s. The amount of data in the database is relatively small. There are about 3500 hosts in the database, including the active hosts (~127 Beacons, 40 MAs) and 2200 disabled (no longer active) hosts. Each host has about 20 columns of information, each of which is up to 100 Bytes long, so a complete copy is at most about 3500 * 20 * 100 Bytes = 7 MBytes. As envisioned, each snapshot would be a complete representation of the database. The database is only updated occasionally, about once a week on average, so the number of snapshots is not large.
This might be a good place to start while one is learning about Blockchains.
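As a starting point for discussion, the idea of complete NODEDETAILS snapshots linked into a chain could be sketched as below. This is a minimal hash-chain illustration only, not a real Blockchain (there is no consensus, signing or networking). The function names, the block fields and the host records (example.org names, made-up coordinates) are all assumptions made for the example and do not reflect the actual NODEDETAILS schema.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def snapshot_digest(hosts):
    """SHA-256 of a canonical JSON encoding of the host records."""
    canonical = json.dumps(hosts, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def block_hash(body):
    """SHA-256 over the block header fields (not the host records themselves)."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_snapshot(chain, hosts, author):
    """Append a complete snapshot of the host table as a new block."""
    body = {
        "index": len(chain),
        "author": author,  # hypothetical: the MA that published the snapshot
        "prev_hash": chain[-1]["hash"] if chain else GENESIS_HASH,
        "snapshot_digest": snapshot_digest(hosts),
    }
    block = dict(body, hash=block_hash(body), hosts=hosts)
    chain.append(block)
    return block


def verify_chain(chain):
    """Return True if every block links to its predecessor and matches its data."""
    prev = GENESIS_HASH
    for i, block in enumerate(chain):
        body = {k: block[k] for k in ("index", "author", "prev_hash", "snapshot_digest")}
        if block["index"] != i or block["prev_hash"] != prev:
            return False
        if block["snapshot_digest"] != snapshot_digest(block["hosts"]):
            return False
        if block["hash"] != block_hash(body):
            return False
        prev = block["hash"]
    return True


# Hypothetical host records; the real table has about 20 columns per host.
week1 = [{"host": "ma1.example.org", "latitude": 37.4, "longitude": -122.2}]
week2 = week1 + [{"host": "target1.example.org", "latitude": 23.1, "longitude": 113.3}]

chain = []
append_snapshot(chain, week1, author="ma1.example.org")
append_snapshot(chain, week2, author="ma2.example.org")
print(verify_chain(chain))
```

Because each block stores the hash of the previous one, changing any host record in an old snapshot makes verification fail for the whole chain, which is the property that would let MAs exchange snapshots without relying on a single trusted copy at SLAC.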
2). Raw measurement data
This is a much larger data space. For the current data storage, just from 2016-01 thru 2018-06 there are about 32 GBytes or ~0.4 GBytes/month or ~0.012 GBytes/day. The data is updated on a daily basis. If each new snapshot is to be complete then this could get huge. For example, if we keep the snapshots going back only for the most recent 12 months, each snapshot is ~12 (months) * 0.4 GBytes = 4.8 GBytes and there are 365 (days in a year) of them, i.e. ~1.8 TBytes/participating MA. If, on the other hand, each snapshot is just the daily measurement then each snapshot is ~0.012 GBytes. In the latter case, the analysis will need to add all these snapshots together. Some thought will be needed to figure out how to save and access the data.
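The two storage options above can be compared with a few lines of arithmetic. The sketch below simply reuses the per-month and per-day figures quoted in these notes; the variable and function names are illustrative.

```python
# Figures quoted in the notes above.
GB_PER_MONTH = 0.4        # approximate raw data volume per month
GB_PER_DAY = 0.012        # approximate raw data volume per day
WINDOW_MONTHS = 12        # how far back each complete snapshot reaches
SNAPSHOTS_PER_YEAR = 365  # one snapshot per day


def full_snapshot_storage_gb(gb_per_month, window_months, snapshots):
    """Storage if every daily snapshot contains the whole rolling window."""
    return gb_per_month * window_months * snapshots


def incremental_storage_gb(gb_per_day, snapshots):
    """Storage if every daily snapshot contains only that day's measurements."""
    return gb_per_day * snapshots


full = full_snapshot_storage_gb(GB_PER_MONTH, WINDOW_MONTHS, SNAPSHOTS_PER_YEAR)
incr = incremental_storage_gb(GB_PER_DAY, SNAPSHOTS_PER_YEAR)

print(f"complete snapshots: {full:.0f} GBytes (~{full / 1000:.1f} TBytes) per MA per year")
print(f"daily increments:   {incr:.1f} GBytes per MA per year")
```

With these inputs, complete snapshots come to roughly 1.8 TBytes per participating MA per year, against about 4.4 GBytes for daily increments, at the cost of having to combine the increments at analysis time.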
Questions
- How complicated is setting up a Blockchain?
- If the effort is too high then we may not have resources to implement
- Do we have the resources?
- The transition cost could be large
- Do all MAs have to participate?
- This is probably not practical since many MAs have little or no resources for this type of effort/transition.
- If only, say, the GZHU, UBRU and SLAC MAs participate, is this sufficient redundancy?
- It might simplify the data deployment
- Tying PingER to Blockchain could increase the interest in, and resources for, the PingER project
- Blockchains are a hot topic today.
- Stanford has created a Blockchain Institute, see https://www.bitrates.com/news/p/stanford-has-announced-their-new-world-class-center-for-blockchain-research
- Many universities are pursuing Blockchain see https://www.accounting-degree.org/college-cryptocurrency-blockchain-courses/
- Get students interested and writing papers
- Get access to MS and PhD students
- What is Saqib's situation (Saqib can you weigh in here):
- Duration at GZHU
- Access to students to work on blockchain for PingER
- Interest in working on Items 1 or 2 (or both) above?
- How to transition from today's implementation, centralized at SLAC, to a more distributed Blockchain implementation
- Will need to continue current PingER while new Blockchain implementation is being developed, made robust and complete
- Will need web interfaces to the data and new mechanisms
- What about the analysis, presentation?