PingER has been running for well over a decade, and its presentation tools are dated, in some cases having been developed in the 1990s. New, modern ways to access, display, and navigate the data would be a big plus. We propose two projects below. If these are of interest, Bebo and Les are willing to work with students who come to SLAC and remotely with students in India. Both areas have great potential for papers in the conference communities with which Bebo is directly involved.

  1. Creative visualization of PingER data including rich interaction.
  2. Publication of PingER data in a database-queryable format. This is a follow-on to Thiago's PingER Warehouse project. Besides providing user access tools via the web, it will also require automated synchronization of the new Hadoop-based warehouse with the current text-based file system.
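
As a starting point for discussion, the synchronization step in project 2 could be sketched roughly as below. This is only an illustration under stated assumptions, not the design of the PingER Warehouse: the function names (`file_digest`, `files_to_sync`, `mark_synced`) and the in-memory manifest are hypothetical, and the actual load into Hadoop is left out. The idea is to record a checksum for every text file already loaded, so that each run only picks up files that are new or have changed.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large measurement files are not held in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def files_to_sync(source_dir: Path, manifest: dict[str, str]) -> list[Path]:
    """List files under source_dir that are new or changed since last sync.

    `manifest` maps a path relative to source_dir to the digest recorded
    when that file was last loaded (a hypothetical bookkeeping structure).
    """
    pending = []
    for path in sorted(source_dir.rglob("*")):
        if not path.is_file():
            continue
        key = str(path.relative_to(source_dir))
        if manifest.get(key) != file_digest(path):
            pending.append(path)
    return pending


def mark_synced(source_dir: Path, paths: list[Path], manifest: dict[str, str]) -> None:
    """Record the current digest of each file after it has been loaded."""
    for path in paths:
        manifest[str(path.relative_to(source_dir))] = file_digest(path)
```

A scheduled job could call `files_to_sync`, load each returned file into the warehouse, and then call `mark_synced`. In practice the manifest would need to be persisted between runs (for example as a small table or JSON file), which this sketch does not cover.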

Skills: HDFS, Perl or Python, RESTful interfaces, databases

Mentors: Les, Bebo, Thiago, Maria

References: 

  • Home page for PingER: http://www-iepm.slac.stanford.edu/pinger/
  • IEEE paper on PingER: http://www-iepm.slac.stanford.edu/paperwork/ieee/ieee.pdf
  • Paper on "Linked Open Data Publication Process: Application in Networking Performance Measurement Data": https://confluence.slac.stanford.edu/download/attachments/17164/P%C3%ACngER%20LOD%20Stanford%20Conference%20Paper_v1%2001a.docx
  • Paper on "Applying Data Warehousing and Big Data Techniques to Analyze Internet Performance" (to be published). Abstract:
    • Measuring the quality of Internet is essential to evaluate the performance of data links around the world and to keep track of how countries have improved their connections throughout the years. Moreover, Internet performance measurements provide understanding for network bottlenecks, trouble-shooting and even insights about the impact of major events such as tsunamis, fiber cuts or social upheavals. For this reason, since 1998, the PingER (Ping End-to-end Reporting) initiative at SLAC National Accelerator Laboratory monitors end-to-end performance of Internet links spread over 160 countries, providing a worldwide history of Internet performance. Data containing network measurements are daily collected from PingER Measurement Agents (MAs) and stored into flat files. As a result, PingER maintains a valuable fine-grained big dataset consisting of Internet performance data around the world. However, due to the large amounts of data, performing sophisticated joint analyses on those files may be so difficult that it becomes unfeasible in some scenarios. In this paper, we apply data warehousing techniques to transform the data on those flat files into structured data using a data model that facilitates complex analyses. We load the transformed data into a big distributed data warehouse that is able to perform complex analytical queries on large volumes of data in seconds. Finally, we show some data analyses correlating Internet performance data to hypothetical real-world scenarios.