Time & date 

Wednesday  Aug 20th,  2014 9:00pm Pacific Standard Time, Thursday Aug 21st 2014 9:00am Pakistan time, Thursday Aug 21st, 2014 12:00 noon Malaysian time, Thursday Aug 21st, 2014 1:00am Rio Standard Time.

Attendees

Invitees:

Anjum-, Hassaan Khaliq, Kashif+, Raja+,  Johari+, Nara, Adnan+, Abdullah, Badrul, Ridzuan, Ibrahim+, Hanan, Saqib+, Adib-, Les+, Renan, Bebo+

+ Confirmed attendance

- Responded but unable to attend

Actual attendees:

Kashif, Raja, Johari, Adnan, Les, Bebo

Administration

  • The connectivity was dreadful: there was a lot of noise and the call had to be restarted several times. Maybe we should try Google Hangouts.

  • Anjum reports (6/23/2014) that "the proposal for conference has been submitted for approval and Pinger has been added in the agenda. Travel expenses for Les and Bebo have also been included in the conference proposal. We are awaiting the proposal approval." 8/16/2014: The faculty management at UM has just changed and many matters require urgent attention. Abdullah will be able to update us once he gets a chance to see the Vice Chancellor. Once approval is given, the venue for the conference can be at UM or UUM.


    As discussed earlier, the only twist here is that PingER will be seen as a case study for big data. This is good in the sense that people interested in doing research in the domain of big data can deploy PingER monitoring nodes at their respective universities/organizations and, in return, play around with the data. We agreed that the 25th looked like a good day for the PingER workshop. Les should be able to make it from Burkina Faso, and Bebo should be able to get back to the US for Thanksgiving. There would be back-to-back presentations: Les on how PingER gathers and archives data, what data there is, the data types, how to access it, etc., followed by Bebo on Google tools for big data.

  • Anjum suggested putting together a paper on the metrics provided by PingER for SIGMETRICS. The due date is in November.

Renan

Following the last meeting Les made available via FTP examples of PingER data. There are two types:

  1. Raw data as gathered daily from all the monitoring hosts. This data is measured at 30-minute intervals and is quite dirty.
  2. Analyzed data by metric. This has been cleaned up. Les recommends UFRJ use the cleaned-up data.

The instructions for the data were also sent as well as size estimates and information on how PingER data has been used.

Maria and Renan are advancing in some approaches to deal with PingER data, making it easier to be analyzed and integrated. In particular they have been busy studying and evaluating alternatives, analyzing results from the latest benchmarks on NoSQL (including RDF and graph based storages) database management, distributed processing and mediated  solutions over relational databases, and also other experiments with multidimensional analyses on Linked Data.  The new students involved are now understanding better the scenario and they have been interacting with Renan regularly. 

They have separated the tasks into 2: 

  1. Quantitative analysis on PingER data
    1. They want to know how PingER has grown from 1998 until today and how it might grow in the coming years. Knowing this, we can focus on technologies suited to scenarios with a profile similar to PingER's.
      1. Two students are working on this.
  2. Approaches to handle PingER current data
    1. Conventional approach – Utilization of Cassandra as back-end database to provide easy crossing of parameters to get PingER data.
      1. One student is working on this.
    2. Distributed and parallel approach – Utilization of a data warehouse on top of a distributed file system to provide low latency responses to complex queries (like the ones that could not be done in Renan's previous work). Additionally, how Scientific Workflow Management Systems may help in the ETL process of transforming PingER data so it can easily be stored in the data warehouse.
      1. Renan is working on this.
    3. Pure RDF approach – Good ways of modeling and natively storing RDF data.
      1. Maria-Luiza is working on this.
    4. NoSQL approaches – How other NoSQL DBMS may be adequate for PingER multidimensional data.
      1. Two students are evaluating existing NoSQL solutions for multidimensional scenarios (such as PingER)
    5. Key-Value storages for PingER data in RDF
      1. This is Ibrahim’s work.

In the end, they want to compare all these approaches.

UM

The ping server at http://pinger.fsktm.um.edu.my/cgi-bin/traceroute.pl?target=www.slac.stanford.edu&function=ping responds with "ping server busy at the moment. Please try again later." Someone with access to the web server should look into that (e.g. review the web logs). Maybe it is being hit with a lot of simultaneous requests. If they are coming from SLAC we may want to look at reflector.cgi.

Badrul (6/23/2014) is still waiting to hear from his student (Abdulrahim Haroun Ali, who is out of the country) about the paper on anomalies in PingER measurements and will update us once the paper is ready. For the moment the paper is not ready. No update 8/20/2014.

Ridzuan has put together a rough proposal to use Hadoop to store and make available PingER data. He registered for the MYREN cloud services last month but has not yet received approval to use them; he will follow up with them again. For the Hadoop implementation, he is considering the Hortonworks Data Platform (HDP2). However, there are problems with the latest installation because UM has adopted IPv6: most of the HDP2 repositories reside on IPv4 servers, which makes it difficult to install correctly on the UM server. He is trying to use another platform or find a way to solve this installation problem. No update 8/20/2014.

Ibrahim Abaker  is planning to work on a topic initially entitled " leveraging pingER big data with a modified pingtable for event-correlation and clustering".  Ibrahim has a proposal, see https://confluence.slac.stanford.edu/download/attachments/17162/leveraging+pingER+big+data+with+a+modified+pingtable+for+event-correlation+and+clustering.docx. Ibrahim reports 7/15/2014 "I have spent the last few months trying to understand the concept of big data storage and its retrieval as well as the traditional approach of storing RDF data. I have integrated a single hadoop cluster in our cloud. but for this project we need multiple clusters, which I have already discussed with Dr. Badrul and he will provide me with big storage for the experiment." No Update 8/20/2014.

"I have come up with initial proposed solution model. This model consists of several parts. The upper parts of the Figure below shows the data source, in which PingER data will be convert into RDF format. Then the data pre-processor will take care of converting RDF/XML into N-triples serialization formats using N-triples convertor module. This N-triple file of an RDF graph will be as an input and stores the triples in storage as a key value pair using MapReduce jobs"

Les forwarded the information from Ibrahim to Renan by email following the meeting.
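The last stage of Ibrahim's proposed model (N-Triples in, key-value pairs out, via map and reduce steps) can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Ibrahim's implementation: it runs in a single process rather than on Hadoop, the choice of the subject as the key is an assumption, the parser only handles simple triples, and the example.org IRIs are placeholders rather than the PingER ontology.

```python
import re
from collections import defaultdict

# Matches one simple N-Triples statement: <subject> <predicate> object .
# Assumption: subjects and predicates are IRIs; objects are IRIs or plain literals.
TRIPLE_RE = re.compile(r'^\s*(<[^>]*>)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def map_triple(line):
    """Map step: emit (subject, (predicate, object)) for one N-Triples line."""
    match = TRIPLE_RE.match(line)
    if match is None:
        return []  # skip blank lines, comments and anything unparseable
    subject, predicate, obj = match.groups()
    return [(subject, (predicate, obj))]


def reduce_by_key(pairs):
    """Reduce step: group all (predicate, object) values under their subject key."""
    store = defaultdict(list)
    for key, value in pairs:
        store[key].append(value)
    return dict(store)


def ntriples_to_key_value(lines):
    """Run the map step over every line, then the reduce step over the results."""
    pairs = [pair for line in lines for pair in map_triple(line)]
    return reduce_by_key(pairs)


# Hypothetical triples for two measurements (placeholder IRIs and values).
SAMPLE = [
    '<http://example.org/m/1> <http://example.org/averageRTT> "196.4" .',
    '<http://example.org/m/1> <http://example.org/fromNode> <http://example.org/node/slac> .',
    '<http://example.org/m/2> <http://example.org/averageRTT> "243.3" .',
    '# a comment line, ignored by the map step',
]

store = ntriples_to_key_value(SAMPLE)
for subject, values in store.items():
    print(subject, values)
```

Keying on the subject keeps everything known about one measurement under one key; keying on the predicate or on a (subject, predicate) pair would be equally easy and may suit other query patterns better.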

UNIMAS

Pinger2 (Raspberry Pi) is working: the ping server, the PingER measurements and the data gathering are all successful. A next step will be to see if it is reliable and if there are significant differences between it and the pinger host at UNIMAS.

The tool to enable synchronizing Malaysian sites: following a request from Saqib, the sites can now be sorted by country. Another page has also been added to view site statistics by country. Issues with the form when inserting and updating records have been investigated and fixed. The new page is available from the following page (two links at the top of the table): http://pinger.unimas.my/pinger/sites.php

Traceroute server: status unsolved. The problem is the same on Pinger2. Johari talked to the network administrator at the centre about this issue, who suggested talking to the security manager to check whether the firewall is blocking the ICMP packets from the traceroute command (on the to-do list).

Custom iso: He can get as far as the boot screen, but is unable to get to the desktop. They will start work on it next week.

Johari has created a shell script to automate the installation of pinger package in Ubuntu/Fedora/Centos Linux distro.  The finalized installation script for pinger software for CentOS/Fedora based distribution is available from the following url (past the step by step tutorial): http://pinger.unimas.my/pinger/install-tutorial.php. Kashif assisted in testing the script.

Research: one student is currently doing a master's by research on the PingER project. Progress is a bit slow since the student lacks the technical and programming skills to implement a potential solution. Another Advanced Project (master's by coursework) will also be supervised this coming September 2014. The plan is to investigate whether the two PingER monitoring hosts differ in terms of data collection (the pinger and pinger2 nodes at UNIMAS).

UTM 

Saqib has talked to MYREN. They say the routers at hops 5 and 6 in the traceroute from UM to UNIMAS are both at UM and the long delay between them is due to congestion. Les is skeptical since hops 6-12 have similar RTTs and hop 12 is near Kuching. Les suggested Saqib run mtr for a day or more from UM to UNIMAS to see if there is any day/night variation in RTT. If the minimum RTT gets down to <2 ms, then the MYREN contact is right. If it is ~50 ms and persists for several days, then it really does not look like congestion (which should vary between day and night as the number of users changes). In that case it appears that hop 6 is physically close to hop 12, since they have the same minimum RTT. Taken together with the email from UM, Les would be very suspicious of the MYREN statement. Is there an update?
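The test Les describes can be written down as a small check over RTT samples collected across a day or more. The sketch below is only an illustration under stated assumptions: the samples are (hour of day, RTT in ms) pairs that someone has already extracted from the mtr output for the hop of interest, the 2 ms threshold is the one quoted above, the 08:00-20:00 "day" window is an arbitrary choice, and the sample values are made up.

```python
from statistics import median


def classify_delay(samples, short_path_ms=2.0, day_hours=range(8, 20)):
    """Summarize (hour_of_day, rtt_ms) samples and apply the minimum-RTT test.

    Queueing delay vanishes when a link is idle, so a short but congested
    path should show a very small minimum RTT at some point in the day.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("no samples")
    day = [rtt for hour, rtt in samples if hour in day_hours]
    night = [rtt for hour, rtt in samples if hour not in day_hours]
    min_rtt = min(rtt for _, rtt in samples)
    summary = {
        "min_rtt_ms": min_rtt,
        "day_median_ms": median(day) if day else None,
        "night_median_ms": median(night) if night else None,
    }
    if min_rtt < short_path_ms:
        summary["verdict"] = "consistent with congestion"
    else:
        summary["verdict"] = "looks like propagation delay"
    return summary


# Made-up samples: a short path that is congested during the day ...
congested_short_path = [(3, 1.4), (4, 1.6), (10, 48.0), (14, 55.2), (22, 6.3)]
# ... and a path whose RTT stays near 50 ms around the clock.
long_path = [(3, 49.8), (4, 50.1), (10, 50.4), (14, 51.0), (22, 50.2)]

print(classify_delay(congested_short_path))
print(classify_delay(long_path))
```

The day and night medians are reported alongside the verdict so that any diurnal variation is visible even when the minimum RTT alone is inconclusive.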

Saqib has updated and re-submitted his proposal to FRGS. Saqib sent a copy to Anjum and Les. Anjum is going to help edit the proposal.

Saqib also reports there was a change in the traceroute from SLAC to UNIMAS this month. Previously, it went via TEIN3. Further, it did not touch the MYREN network. However, the traceroutes from SLAC to UTM and UM remain the same. Johari is going to look at it.

Les did a binary search (using http://www-wanmon.slac.stanford.edu/cgi-wrap/traceroutearchive.cgi?from=www-wanmon.slac.stanford.edu&to=pinger.unimas.my&date1=2014_06_17&date2=2014_06_18&date3=2014_06_19). It appears the change happened on Wed 2014_06_18 (there is actually no traceroute measured that day). Les has no idea of the significance; Saqib may need to check with his contacts in Malaysia. Traceroute from UM to SLAC:

traceroute to 134.79.197.200 (134.79.197.200), 30 hops max, 140 byte packets
 1  ip254.fsktm.um.edu.my (202.185.107.254)  0.762 ms  0.668 ms  0.588 ms
 2  10.94.253.249 (10.94.253.249)  1.147 ms  1.080 ms  0.963 ms
 3  172.20.2.254 (172.20.2.254)  0.723 ms  0.602 ms  0.531 ms
 4  161.142.24.129 (161.142.24.129)  1.263 ms  1.081 ms  1.009 ms
 5  161.142.5.249 (161.142.5.249)  4.149 ms  4.222 ms  4.146 ms
 6  ix-10-3-4-2011.tcore1.HK2-Hong-Kong.as6453.net (180.87.112.97)  36.242 ms  36.169 ms  36.461 ms
 7  if-3-2.tcore1.TV2-Tokyo.as6453.net (180.87.112.6)  191.692 ms  190.497 ms  204.063 ms
 8  if-9-2.tcore2.PDI-Palo-Alto.as6453.net (180.87.180.17)  202.157 ms  201.441 ms  201.804 ms
 9  if-5-2.tcore2.SQN-San-Jose.as6453.net (64.86.21.1)  189.191 ms  189.030 ms  190.501 ms
10  eqx-sj-tata.es.net (198.129.44.53)  198.634 ms  198.559 ms  198.415 ms
11  * * *
12  slacmr2-ip-b-sunncr5.es.net (134.55.40.14)  196.457 ms  197.480 ms  196.380 ms
13  rtr-border1-p2p-slac-mr2.slac.stanford.edu (192.68.191.246)  198.314 ms  195.649 ms  197.740 ms
14  * * *

The route from SLAC to UNIMAS before (20140617) is:

1 134.79.197.131 (134.79.197.131) [AS3671] 0.435 ms
2 rtr-core2-p2p-serv01-02.slac.stanford.edu (134.79.254.61) [AS3671] 0.344 ms
3 rtr-border1-p2p-core1.slac.stanford.edu (134.79.252.133) [AS3671] 0.406 ms
4 slac-mr2-p2p-rtr-border1.slac.stanford.edu (192.68.191.245) [AS3671/AS38621] 30.412 ms
5 *
6 transpac-1-is-jmb-780.lsanca.pacificwave.net (207.231.246.136) [*] 8.672 ms
7 tokyo-losa-tp2.transpac.org (192.203.116.146) [*] 124.037 ms
8 kote-dc-gm1-xe2-2-1-4005.jp.apan.net (203.181.248.249) [AS7660] 125.611 ms
9 sg-xe-01-v4.bb.tein3.net (202.179.249.77) [AS24489] 192.946 ms
10 my-pr-v4.bb.tein3.net (202.179.249.70) [AS24489] 200.499 ms
11 (203.80.23.61) [AS24514/AS4788] 242.756 ms
12 (203.80.22.158) [AS24514/AS4788] 243.353 ms
13 (203.80.23.94) [AS24514/AS4788] 241.138 ms
14 (203.80.18.182) [AS24514/AS4788] 241.086 ms

The route after (20140618) is 

1 rtr-servcore1-serv01-webserv.slac.stanford.edu (134.79.197.130) [AS3671] 0.423 ms
2 rtr-core2-p2p-serv01-01.slac.stanford.edu (134.79.254.65) [AS3671] 0.293 ms
3 rtr-border1-p2p-core2.slac.stanford.edu (134.79.252.137) [AS3671] 0.443 ms
4 slac-mr2-p2p-rtr-border1.slac.stanford.edu (192.68.191.245) [AS3671/AS38621] 0.232 ms
5 *
6 eqxsjrt1-ip-a-sunncr5.es.net (134.55.38.146) [AS293] 24.495 ms
7 sj-igw01.tm.net.my (206.223.116.120) [AS4637] 2.668 ms
8 *
9 58.26.240.62 (58.26.240.62) [AS4788] 215.915 ms
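One quick way to see what changed between the two SLAC to UNIMAS routes is to compare the sequence of AS numbers each one passes through. The sketch below is a rough illustration that assumes the archive line format shown above (hop number, optional name, IP in parentheses, AS tag in square brackets); the helper names are made up, and only a few hops from each route are copied in to keep it short.

```python
import re

AS_TAG_RE = re.compile(r'\[([^\]]*)\]')  # the "[AS3671/AS38621]" part of a hop line


def as_path(route_text):
    """Return the ordered list of distinct AS numbers seen along a route."""
    path = []
    for line in route_text.strip().splitlines():
        tag = AS_TAG_RE.search(line)
        if tag is None:
            continue  # unresponsive hop such as "5 *"
        for asn in re.findall(r'AS\d+', tag.group(1)):
            if asn not in path:
                path.append(asn)
    return path


# A few hops copied from the route before the change (20140617).
BEFORE = """
4 slac-mr2-p2p-rtr-border1.slac.stanford.edu (192.68.191.245) [AS3671/AS38621] 30.412 ms
5 *
6 transpac-1-is-jmb-780.lsanca.pacificwave.net (207.231.246.136) [*] 8.672 ms
9 sg-xe-01-v4.bb.tein3.net (202.179.249.77) [AS24489] 192.946 ms
11 (203.80.23.61) [AS24514/AS4788] 242.756 ms
"""

# A few hops copied from the route after the change (20140618).
AFTER = """
4 slac-mr2-p2p-rtr-border1.slac.stanford.edu (192.68.191.245) [AS3671/AS38621] 0.232 ms
5 *
6 eqxsjrt1-ip-a-sunncr5.es.net (134.55.38.146) [AS293] 24.495 ms
7 sj-igw01.tm.net.my (206.223.116.120) [AS4637] 2.668 ms
9 58.26.240.62 (58.26.240.62) [AS4788] 215.915 ms
"""

before_path = as_path(BEFORE)
after_path = as_path(AFTER)
dropped = [asn for asn in before_path if asn not in after_path]
added = [asn for asn in after_path if asn not in before_path]

print("before:", before_path)
print("after: ", after_path)
print("no longer on the path:", dropped)
print("new on the path:      ", added)
```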

BTW you can examine the geography of the above routes by cutting and pasting them into http://www-wanmon.slac.stanford.edu/cgi-wrap/reflector.cgi?function=vtrace
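The binary search Les used to date the route change can be sketched as follows. It is a minimal illustration that assumes a single, persistent change inside the window being searched. The `signature_for` function is hypothetical: in practice it would look up the archived traceroute for a day and reduce it to something comparable (for example its AS path), whereas here a stand-in with made-up values plays that role.

```python
from datetime import date, timedelta


def first_changed_day(days, signature_for):
    """Binary search for the first day whose route signature differs from days[0].

    Assumes one persistent change in the window; returns None if the last
    day still matches the first.
    """
    baseline = signature_for(days[0])
    if signature_for(days[-1]) == baseline:
        return None
    lo, hi = 0, len(days) - 1  # invariant: days[lo] matches baseline, days[hi] differs
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if signature_for(days[mid]) == baseline:
            lo = mid
        else:
            hi = mid
    return days[hi]


# Stand-in for the archive: a made-up signature that changes on 2014-06-18.
lookups = []


def fake_signature(day):
    lookups.append(day)
    return "via TEIN3" if day < date(2014, 6, 18) else "via TM"


june = [date(2014, 6, 1) + timedelta(days=n) for n in range(30)]
changed_on = first_changed_day(june, fake_signature)
print(changed_on, "found with", len(lookups), "lookups instead of", len(june))
```

A real lookup would also need a policy for days with no measurement, such as falling back to the nearest day that has one.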

UUM

Adib reports they completed the installation of the PingER server at UUM. However, the second-hand (old) machine is not working properly: it suddenly shuts down or restarts.

NUST

Installation is in progress for the Bahawalpur site. The install is complete but needs approval from the head.

The following are now up and running:

  • buitms.seecs.edu.pk (the electrical problem appears solved and the host is working; the load shedding problem is much better now. The server time was also incorrect, with almost 24 hours' difference, and has now been corrected).

Kashif is working on:

  • pinger.kohat.edu.pk: still trying to find a motherboard for the Dell Optiplex 760. The system is old and the motherboard is hard to find; Kashif hopes to solve this soon. Currently the name is not resolving.

  • The host name pinger.nwfpuet.edu.pk is also working now.

  • There was no data this month from pingerfsbd.pern.edu.pk and the host was unpingable from SLAC; it is working now.

  • sau.seecs.edu.pk was ungatherable (and unpingable) since Aug 7th, and unreliable before that; it is working now.

  • www.upesh.edu.pk: the problem has been resolved by replacing files with new copies and giving full rights. Ping to the host is OK and data is also being collected now; however, ping from the host is giving a permission denied error.

Raja

Raja is back in Pakistan.

PingER at SLAC

Les requested an update from Yahoo about TULIP's geolocation. They answered "We are very much interested in getting IP triangulation at internet scale, we will have internal sync-up on how we can leverage this initiative if there is rate limit and get back. Regarding opening up yahoo sites for deploying ping server requires some more time to discuss this with relevant stake holders with in yahoo." No word, sent a reminder 5/19/2014. No response 6/4/2014, 6/25/2014.

Les sent email to Google as follows: "I would like to bring to your attention that we have developed a geolocation tool using delay based (using RTTs from known ping server landmarks) distance estimates to triangulate the location of an IP host target. The tool is accessible at: http://www-wanmon.slac.stanford.edu/cgi-wrap/reflex.cgi. We have identified that the accuracy of the geolocation is directly related to the landmark density (e.g. # of landmarks/ million sq km). The higher the density the smaller the error and the fall off is exponential. We currently have over 1000 registered landmarks, of which at any given time ~300 are working. The tool not only finds the location of the target, it also gives an estimated error. To the best of our knowledge it is the only freely available delay based measurement geolocation service publicly available today. A drawback (compared to database methods such as those based on GeoMind) is the time taken to make the measurements. We have worked on this from many directions including parallelization of the ping requests, caching, tiering to get the rough location (i.e. region of the world) then zooming in using all landmarks in the region. We are putting together a publication on this." Les sent an update to his contact at Google 6/23/2014, stressing the applicability to traceroute visualization. No response 7/13/2014.
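To make the idea in the email concrete, here is a minimal sketch of delay-based geolocation: each landmark's minimum RTT is turned into a distance estimate, and a two-tier grid search (coarse over the globe, then zoomed in) finds the point that best fits all the estimates. This is not the TULIP algorithm. The conversion factor of 100 km per millisecond of one-way delay is an assumption, the landmark coordinates are approximate city locations chosen for illustration, and the RTTs are synthetic values generated from a known target so the result can be checked.

```python
import math

EARTH_RADIUS_KM = 6371.0
KM_PER_MS_ONE_WAY = 100.0  # assumed effective speed; a real tool would calibrate this


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def rtt_to_km(min_rtt_ms):
    """Convert a minimum round-trip time into an estimated one-way distance."""
    return (min_rtt_ms / 2.0) * KM_PER_MS_ONE_WAY


def misfit(lat, lon, landmarks):
    """Sum of squared differences between geometric and RTT-derived distances."""
    return sum((haversine_km(lat, lon, la, lo) - rtt_to_km(rtt)) ** 2
               for la, lo, rtt in landmarks)


def grid_search(landmarks, lat_c, lon_c, lat_half, lon_half, step):
    """Return (cost, lat, lon) of the best grid point around (lat_c, lon_c)."""
    best = None
    for i in range(-int(round(lat_half / step)), int(round(lat_half / step)) + 1):
        lat = max(-90.0, min(90.0, lat_c + i * step))
        for j in range(-int(round(lon_half / step)), int(round(lon_half / step)) + 1):
            lon = lon_c + j * step
            cost = misfit(lat, lon, landmarks)
            if best is None or cost < best[0]:
                best = (cost, lat, lon)
    return best


def locate(landmarks):
    """Two tiers: a 5-degree global grid, then a 0.5-degree grid around its best cell."""
    _, lat, lon = grid_search(landmarks, 0.0, 0.0, 90.0, 180.0, 5.0)
    cost, lat, lon = grid_search(landmarks, lat, lon, 5.0, 5.0, 0.5)
    rms_km = math.sqrt(cost / len(landmarks))
    return lat, lon, rms_km


# Synthetic check: RTTs are generated from a known target near Kuala Lumpur.
target = (3.1, 101.7)
sites = [(1.35, 103.8), (13.75, 100.5), (-6.2, 106.8), (22.3, 114.2)]
landmarks = [(la, lo, 2 * haversine_km(target[0], target[1], la, lo) / KM_PER_MS_ONE_WAY)
             for la, lo in sites]

est_lat, est_lon, rms_km = locate(landmarks)
error_km = haversine_km(est_lat, est_lon, target[0], target[1])
print(f"estimate ({est_lat:.1f}, {est_lon:.1f}), error {error_km:.0f} km, rms misfit {rms_km:.0f} km")
```

With real measurements the RTT-derived distances overestimate the true distance (routes are indirect and queues add delay), so the residual reported here is only a crude stand-in for the estimated error the tool provides.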

Old Items

Linked Open Data

Renan finished the new pingerlod web site. The new feature is that it should now be much easier to modify the info texts: Renan put the texts into a separate file. The new version has been loaded on the server and some text added to describe how to use the map. However, there is a bug that prevents the map from executing. Renan reports that the bugs should be easy to fix. He has talked to his professor, who suggested trying RDF Owlink, which should respond faster to queries. Renan will research this. It will probably mean reloading the PingER data, so it is a lot of work; hopefully this will improve performance. Before the rebuild he will make the fixes and provide a new WAR for us to load on pingerlod.slac.stanford.edu. He is also working on his thesis and on documentation. He has finished the ontology and has a nice interactive tool for visualizing it; since the ontology is the core of the data model of our semantic solution, this will be very helpful for anyone who uses our system, whether as a developer or as a possible user. Bebo pointed out that to get publicity and for people to know about the data, we will need to add pingerlod to lod.org.

Things he will soon do regarding documentation:

  1. A task/process flow describing all the Java classes involved in the batch jobs;
  2. A Javadoc <http://www.oracle.com/technetwork/java/javase/documentation/index-jsp-135444.html> which will explain all classes and how they are used.

For the Linked Open Data / RDF work, which is in its pre-alpha days, you can go to http://pingerlod.slac.stanford.edu. As can be seen, this page is not ready for prime time. However, the demos work as long as one carefully selects what to look at:

  • Click on Visualizations, there are two choices:
    • Multiple Network Metrics: Click on the image, which gives a form. Choose Node pinger.slac.stanford.edu pinging to www.ihep.ac.cn, time parameters yearly, 2006 to 2012, metrics throughput, Average RTT and Packet loss, and display format Plot graph, then click on submit. In a few seconds a time series graph should come up. Mouse over it to see details of the values at each x value (year).
    • A mashup of network metrics x university metrics: Click on the image, which gives another form. Choose pinging from pinger.slac.stanford.edu, School metric number of students, time metric years 2006 to 2012, display format plot graph, then click on submit. After a longer wait of about 35 seconds a Google map should show up. Click on "Click for help." Area of dots = number of students, darkness of dots = throughput (lighter is better), inscribing circle color gives university type (public, private etc.). Click on a circle for information on the university etc.
  • Renan will be working on providing documentation on the programs, in particular the install guide for the repository and web site etc. This will assist the person who takes this over. 

Renan is using OWLIM as RDF Repository. He is using an evaluation version right now. Renan looked into the price for OWLIM (that excellent RDF Database Management System he told us about). It would cost 1200EUR minimum  (~ 1620 USD, according to Google's rate for today) for a one time eternal license. It seems too expensive. No wonder it is so good. Anyhow, he heard about a different free alternative. Just not sure how good it would be for our PingER data. He will try it out and evaluate. He will also get a new evaluation of the free OWLIM lite.  

He has also made some modifications on the ontology of the project (under supervision of his professor in Rio) hence he  will have to modify the code to load the data accordingly.

Renan has provided a 4 page Appendix on PingERLOD to the ICFA report.  This is also available at PingER LOD Overview

Raspberry Pi

A quick comparison of the performance of the two hosts (raspberry pi and regular UNIMAS host) without statistical quantification is available at https://confluence.slac.stanford.edu/display/IEPM/Comparison+of+PinGER+RTTs+from+UNIMAS+monitors+N4+and+RASPBERRY.  A page has been created to compare the hardware spec between the pinger.unimas.my node (Intel architecture) and the pinger2.unimas.my node (Raspberry Pi ARM architecture), available from the unimas pinger website at http://pinger.unimas.my/pinger/hardware.php. There is a link to hardware.php in the Comparison+of+PinGER+RTTs+from+UNIMAS+monitors+N4+and+RASPBERRY web page.

NUST

At the Connect Asia Pacific Summit in Bangkok in January, Shahryar saw the project "Mapping the pan Asia Pacific information Superhighway and closing gaps in infrastructure connectivity" and found it very much related to the work in the PingER project. So Shahryar sent email to a UN agency about a possible collaboration with them on the PingER project. He has heard nothing, so he will write a detailed proposal and then contact them again. No update 2/5/2014, 3/5/2014.

Tulip
Follow up from workshop
  • Hossein Javedani of UTM is interested in anomalous event detection with PingER data. Information on this is available at https://confluence.slac.stanford.edu/display/IEPM/Event+Detection. We have sent him a couple of papers and how to access the PingER data. Hossein and Badrul have been put in contact. Is there an update Badrul?

The next step in funding is to go for bigger research funding, such as LRGS or eScience. Such proposals must lead to publications in high quality journals. They will need an infrastructure such as the one we are building. We can use the upcoming workshop (one specific session) to brainstorm and come up with such a proposal. We need to do some groundwork before that as well. Johari will take the lead in putting together 1/2 page descriptions of the potential research projects.

  1. Need to identify a few key areas of research related to the PingER Malaysia Initiative, which can be shared/publicized through the website. These might include using the infrastructure and data for: anomaly detection; correlation of performance across multiple routes; and GeoLocation. The future projects Les listed in Confluence at https://confluence.slac.stanford.edu/display/IEPM/Future+Projects can also be a good start, as can Bebo's suggestion.
  2. Need to synchronize and share research proposals so as not to duplicate research work. How to share? Maybe not through the website, or maybe we can create a members-only section of the website to share sensitive data such as research proposals.

Anjum suggested Saqib, Badrul and Johari put together a paper on user experiences with using the Internet in Malaysia as seen from Malaysian universities, covering round trip time, losses, jitter, reliability, routing/peering, anomalies in particular, and the impact on VoIP, throughput etc. It would be good to engage someone from MYREN.

Potential projects

See list of Projects

Future meeting  - Les

Next meeting Wednesday September 17th  2014 9:00pm Pacific Standard Time, Thursday September 18th 2014 9:00am Pakistan time, Thursday September 18th, 2014 noon Malaysian time, Thursday  September 18th, 2014 01:00am Rio Standard Time.

Coordinates of team members:

See: http://pinger.unimas.my/pinger/contact.php
