Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

  • Anjum reports (6/23/2014) that "the proposal for conference has been submitted for approval and Pinger has been added in the agenda. Travel expenses for Les and Bebo have also been included in the conference proposal. We are awaiting the proposal approval. 8/16/2014: The faculty management at UM just changed and many matters required urgent attention. Abdullah will be able to update us soon once he gets a chance to see the Vice Chancellor. Once the approval is given the venue for the conference can be at UM or UUM.


    As discussed earlier, the only twist here is that Pinger will be seen as a case study for big data. This is good in a sense that people interested in doing research in the domain of big data can deploy pinger monitoring nodes at their respective universities/organizations and in return, play around with the data. We agreed that it looked like the 25th would be a good day for the PingER workshop. Les should be able to make it from Burkina Faso, and Bebo should be able to get back to the US for Thanksgiving.  There would be back to back presentation on how PingER gathers, archives data, what data there is, the data types, how to access etc.  by Les followed by Bebo on Google Tools for Big data.

  • Anjum suggested putting together a paper on metrics provided by PingER for Sigmetrix. The due date is in November.

Renan

Les met with Renan and his superviser (Maria Luiza Campos). The minutes are at: https://confluence.slac.stanford.edu/display/IEPM/20140703+Meeting+between+UFRJ+and+SLAC

Luiza has set up a small project in the UFRJ Reference center to provide big data analysis/mining of PingER multidimensional data

Luiza has proposed three approaches to provide big data analysis/mining of PingER multidimensional data:

  1. Conventional. Utilization of Pentaho environment to handle big multidimensional data, which enables utilization of enhanced user interfaces.
  2. Linked Data. Benchmarking of more sophisticated Triple Stores than the one we use today at PingER LOD (Sesame). Preferably, we should analyze parallel and distributed solutions. CumulusRDF is an example.  
  3. Utilization  Utilization of Greenplum (http://en.wikipedia.org/wiki/Greenplum). This is an intensive high performance database from EMC with many features such as caching. It is partly from the EMC acquisition of Pivotal. There is also a DBMS called Grindplan that explores lots of features using Pivotal.
    1. Renan is investigating an alternative to Hadoop, which utilizes a Scientific Workflow Management System and makes use of Map/Reduce paradigm to help both querying and provenance of the Linked Data (RDF) data.
    2. Ibrahim is investigating an approach that utilizes Hadoop Map/Reduce in a Key/Value store with PingER data in RDF.

Following the last meeting Les will make made available via FTP examples of PingER data. There are two types:

  1. Raw data as gathered daily from all the monitoring hosts. This data is ie measured at 30 minute intervals and is quite dirty.
  2. Analyzed data by metric. This has been cleaned up. Les recommends UFRJ uses the cleaned up data., 

The instructions for the data will  were also be sent to Luiza.  Also see PingER data flow at SLAC.Les will also send Luiza information sent as well as size estimates and information on how PingER data has been used.

Maria and Renan are advancing in some approaches to deal with PingER data, making it easier to be analyzed and integrated. They are planning to meet up this week to discuss some updates and achievements so far.

UM

The ping server at http://pinger.fsktm.um.edu.my/cgi-bin/traceroute.pl?target=www.slac.stanford.edu&function=ping gives ping server busy at the moment. Please try again later. Some one with access to the web servers should look at that (e.g. review the web logs). Maybe it is being hit with a lot of requests simultaneously. If they are coming from SLAC we may want to look at reflector.cgi.

...