Time & date 

Wednesday, July 16th, 2014, 9:00pm Pacific Daylight Time; Thursday, July 17th, 2014, 9:00am Pakistan time; Thursday, July 17th, 2014, 12:00 noon Malaysian time; Thursday, July 17th, 2014, 1:00am Rio Standard Time.

Attendees

Invitees:

Anjum, Hassaan Khaliq, Kashif+, Raja+,  Johari-, Nara, Adnan+, Abdullah, Badrul, Ridzuan+, Ibrahim+, Hanan, Saqib+, Adib, Les+, Renan, Bebo+

+ Confirmed attendance

- Responded but unable to attend

Actual attendees:

Anjum, Kashif, Raja, Adnan, Ibrahim, Saqib, Les, Bebo

Administration

  • Anjum reports (6/23/2014) that "the proposal for conference has been submitted for approval and Pinger has been added in the agenda. Travel expenses for Les and Bebo have also been included in the conference proposal. We are awaiting the proposal approval." Prof. Abdullah Gani says we should get the approval soon (7/16/2014). Once approval is given, the venue for the conference can be UM or UUM.

    As discussed earlier, the only twist here is that PingER will be seen as a case study for big data. This is good in the sense that people interested in doing research in the domain of big data can deploy PingER monitoring nodes at their respective universities/organizations and, in return, play around with the data. We agreed that the 25th looks like a good day for the PingER workshop. Les should be able to make it from Burkina Faso, and Bebo should be able to get back to the US for Thanksgiving. There would be back-to-back presentations: Les on how PingER gathers and archives data, what data there is, the data types, how to access it etc., followed by Bebo on Google Tools for Big Data.

  • Anjum suggested putting together a paper on metrics provided by PingER for SIGMETRICS. The due date is in November.

Renan

Les met with Renan and his supervisor (Maria Luiza Campos). The minutes are at: https://confluence.slac.stanford.edu/display/IEPM/20140703+Meeting+between+UFRJ+and+SLAC

Luiza has set up a small project in the UFRJ Reference center to provide big data analysis/mining of PingER multidimensional data.

Luiza has proposed three approaches to provide big data analysis/mining of PingER multidimensional data:

  1. Conventional. Utilization of Pentaho environment to handle big multidimensional data, which enables utilization of enhanced user interfaces.
  2. Linked Data. Benchmarking of more sophisticated Triple Stores than the one we use today at PingER LOD (Sesame). Preferably, we should analyze parallel and distributed solutions. CumulusRDF is an example.  
    1. Renan is investigating an alternative to Hadoop, which utilizes a Scientific Workflow Management System and makes use of the Map/Reduce paradigm to help both querying and provenance of the Linked Data (RDF).
    2. Ibrahim is investigating an approach that utilizes Hadoop Map/Reduce in a Key/Value store with PingER data in RDF.
  3. Utilization of Greenplum (http://en.wikipedia.org/wiki/Greenplum). This is an intensive high performance database from EMC with many features such as caching. It is partly from the EMC acquisition of Pivotal. There is also a DBMS called Grindplan that explores lots of features using Pivotal.

Les will make available via FTP examples of PingER data. There are two types:

  1. Raw data as gathered daily from all the monitoring hosts. This data is measured at 30-minute intervals and is quite dirty.
  2. Analyzed data by metric. This has been cleaned up. Les recommends that UFRJ use the cleaned-up data.

The instructions for the data will also be sent to Luiza.  Also see PingER data flow at SLAC.

Les will also send Luiza information on how PingER data has been used.

UM

Badrul (6/23/2014) is still waiting to hear from his student (Abdulrahim Haroun Ali, who is out of the country) about the paper on anomalies in PingER measurements and will update us once the paper is ready. For the moment the paper is not ready.

Ridzuan has put together a rough proposal to use Hadoop to store and make available PingER data. He registered for the MYREN cloud services last month but has not yet received approval to use them; he will follow up with them again. For the Hadoop implementation, he is considering the Hortonworks Data Platform (HDP2); however, there are some problems with the latest installation because UM has adopted IPv6. Most of the HDP2 repositories reside on IPv4 servers, which makes it difficult to install correctly on the UM server. He is trying another platform or looking for a way to solve this installation problem.

Ibrahim Abaker  is planning to work on a topic initially entitled " leveraging pingER big data with a modified pingtable for event-correlation and clustering".  Ibrahim has a proposal, see https://confluence.slac.stanford.edu/download/attachments/17162/leveraging+pingER+big+data+with+a+modified+pingtable+for+event-correlation+and+clustering.docx. Ibrahim reports 7/15/2014 "I have spent the last few months trying to understand the concept of big data storage and its retrieval as well as the traditional approach of storing RDF data. I have integrated a single hadoop cluster in our cloud. but for this project we need multiple clusters, which I have already discussed with Dr. Badrul and he will provide me with big storage for the experiment."

"I have come up with initial proposed solution model. This model consists of several parts. The upper parts of the Figure below shows the data source, in which PingER data will be convert into RDF format. Then the data pre-processor will take care of converting RDF/XML into N-triples serialization formats using N-triples convertor module. This N-triple file of an RDF graph will be as an input and stores the triples in storage as a key value pair using MapReduce jobs"
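The storage step in Ibrahim's model can be pictured with a minimal sketch. It assumes simple N-Triples lines and uses an in-memory dictionary in place of a real Hadoop key/value store; the function names and the example URIs are hypothetical, and this is not Ibrahim's actual implementation.

```python
import re
from collections import defaultdict

# Matches simple N-Triples lines of the form: <subject> <predicate> object .
# (Assumption: no escaped '>' inside IRIs; blank-node subjects are allowed.)
TRIPLE_RE = re.compile(r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def map_triple(line):
    """Map step: emit (subject, (predicate, object)) for one N-Triples line."""
    match = TRIPLE_RE.match(line)
    if match is None:
        return []  # skip malformed lines rather than failing the whole job
    subject, predicate, obj = match.groups()
    return [(subject, (predicate, obj))]


def reduce_by_key(pairs):
    """Reduce step: group all (predicate, object) values under their subject key."""
    store = defaultdict(list)
    for key, value in pairs:
        store[key].append(value)
    return dict(store)


def build_store(lines):
    """Run the map and reduce steps over an iterable of N-Triples lines."""
    pairs = []
    for line in lines:
        stripped = line.strip()
        if stripped and not stripped.startswith('#'):  # ignore blanks and comments
            pairs.extend(map_triple(line))
    return reduce_by_key(pairs)
```

Keying on the subject is only one possible choice; a real deployment might key on predicate or on subject/predicate pairs, depending on the queries to be supported.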

Les forwarded the information from Ibrahim to Renan by email following the meeting.

UNIMAS

Johari is unable to attend this Skype meeting. Dr. Adnan Shahid Khan, who recently joined UNIMAS, will represent UNIMAS. Adnan is coming up to speed. Adnan is on the pinger-my email list as of April 25, 2014. Adnan met yesterday with Johari.

Johari says there is no progress on the following; the student may take up some of these issues after Ramadan:

The Raspberry Pi is at the data centre and has a public IP address. It was working last week, until the UPS failed over the weekend. It did not reboot itself. Johari will look at the problem. 

The tool to enable synchronizing Malaysian monitors is completed. It has been tested by Saqib. Saqib requested to add sorting the HostList by country, Johari will add this.

The traceroute server at http://pinger2.unimas.my/cgi-bin/traceroute.pl has the same problem as before. They know (sort of) what the problem is but haven't had the chance to rectify it (a mapping for the NAT address needs to be added). There is no progress 12/4/2013, 1/8/2013, 1/22/2014, 2/5/2014, 3/26/2014, 4/8/2014, 4/23/2014, 6/4/2014. Now that the historical traceroutes are working for UM (see below) there is an extra incentive to get the reverse traceroute working at UTM and UNIMAS.

Custom ISO: He can get as far as the boot screen, but is unable to get to the desktop. No progress 2/5/2014, 4/23/2014, 6/4/2014.

Johari has created a shell script to automate the installation of the PingER package on Ubuntu/Linux distros. He has finished the Fedora and CentOS implementations. He will give it to Kashif to test. Johari has added a page to the pinger.unimas.my/pinger website on using the shell script to automate the installation steps for the PingER package. It is available at http://pinger.unimas.my/pinger/install-tutorial.php.

Johari has a research student who finalized a proposal in order to officially apply for his masters. He will start in February. He is currently working on threshold/anomaly detection, and will extend this to correlating performance over multiple routes. He will share the proposal with Les and others in April. No progress 6/4/2014.

UTM 

Saqib has talked to MYREN. They say the routers at hops 5 and 6 in the traceroute from UM to UNIMAS are both at UM and the long delay between them is due to congestion. I am skeptical, since hops 6-12 have similar RTTs and hop 12 is near Kuching. I suggest Saqib run mtr for a day or more from UM to UNIMAS to see if there is any day/night variation in RTT. If the minimum RTT gets down to <2 ms, then the MYREN guy is right. If it is ~50 ms and persists for several days, then it really does not look like congestion (which should vary between day and night as the number of users changes). In that case it appears that hop 6 is physically close to hop 12, since they have the same minimum RTT. Taken together with the email from UM, I would be very suspicious of the MYREN statement.
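The check suggested above could be applied to the collected measurements along these lines. This is only a sketch: it assumes the mtr results have already been reduced to (hour of day, RTT in ms) pairs for the hop of interest, the 08:00-20:00 "day" window is an arbitrary assumption, and the function name is hypothetical. The 2 ms threshold is the one quoted above.

```python
def classify_delay(samples, congestion_floor_ms=2.0, day_hours=range(8, 20)):
    """Summarize RTT samples and test the congestion hypothesis.

    samples: iterable of (hour_of_day, rtt_ms) pairs gathered over a day or more.
    The congestion explanation is only plausible if the minimum RTT ever
    drops below congestion_floor_ms, i.e. the delay goes away when the link is idle.
    """
    samples = list(samples)
    if not samples:
        raise ValueError("need at least one RTT sample")
    day = [rtt for hour, rtt in samples if hour in day_hours]
    night = [rtt for hour, rtt in samples if hour not in day_hours]
    min_rtt = min(rtt for _, rtt in samples)
    return {
        "min_rtt_ms": min_rtt,
        "day_min_ms": min(day) if day else None,
        "night_min_ms": min(night) if night else None,
        "congestion_plausible": min_rtt < congestion_floor_ms,
    }
```

A minimum RTT that stays near 50 ms in both the day and night buckets would point to path length rather than queuing.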

Saqib met with MYREN who have made many topology changes. Saqib will also incorporate these into the Malaysian case study. He is seeing anomalously long delays between mainland Malaysia and Sarawak. It does not appear to be due to congestion. We need to understand the routing and which undersea cables are being used. Saqib will send more details after the meeting. He will also contact MYREN.

Saqib's proposal is almost ready; however, we have not yet identified a funding agency to submit it to. The next round of the FRGS may be the next opportunity. Anjum and Saqib will discuss where best to fit Saqib's proposal and Anjum will help edit the proposal.

UUM

Regarding the monitoring host at UUM, Adib has assigned one student to prepare the configuration/installation plan, including how to secure the host from attack. He has a public IP address. He needs to complete the DNS registration by Sunday 25th May or Monday. He is in the last stage of working with the Computer Center. Adib asked Johari to share the UNIMAS settings so it is easier for the student to follow. No update 6/5/2014, 6/25/2014.

The UUM PingER host is almost ready. Adib has got a public IP address together with a DNS name. Once this is settled, traceroute.pl will follow. This will increase the number of landmarks in mainland Malaysia by 50% and improve geolocation. Adib plans to get to this next week when he returns from vacation.

NUST

Installation is in progress for the Bahawalpur site. The install is complete but needs approval from the head; hopefully it will be up on Monday.

The following are now up and running:

  • sau.seecs.edu.pk, Solved, UP and Running.

  • pingerkhi-uok.pern.edu.pk, Solved, UP and Running.

There are also several sites that seem to have power problems and are often not available at the normal early morning (Pacific time) gathering time, in particular buitms.seecs.edu.pk and www.upesh.edu.pk, which can be down for days at a time. I do have a script that will ping the site at regular intervals and, when/if it finally responds, try to gather the data. However, we do not have a satisfactory solution to gathering data from these sites. Buitms has an electrical problem that they hope to solve in a month. Upesh only pings within the country; this is not understood at the moment.
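The ping-then-gather approach mentioned above can be sketched roughly as follows. This is not the script in use at SLAC: the function names, the 10-minute interval, the retry limit and the Linux-style `ping -c 1 -W 2` flags are all assumptions, and `gather` stands for whatever routine actually fetches the data.

```python
import subprocess
import time


def host_responds(host):
    """Return True if a single ICMP echo gets a reply (Linux ping flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def gather_when_up(host, gather, is_up=host_responds,
                   interval_s=600, max_tries=144, sleep=time.sleep):
    """Poll host at regular intervals; call gather(host) once it responds.

    Returns the result of gather(host), or None if the host never responded
    within max_tries attempts.
    """
    for attempt in range(1, max_tries + 1):
        if is_up(host):
            return gather(host)
        if attempt < max_tries:
            sleep(interval_s)  # wait before the next probe
    return None
```

Passing `is_up` and `sleep` as parameters keeps the loop testable without touching the network; with the defaults above a host would be probed for about a day before giving up.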

The following sites have been dropped. 

  • duhs.seecs.edu.pk, Dropped

  • uaf.seecs.edu.pk,  Dropped

Kashif is working on:

  • pinger.kohat.edu.pk: still trying to find a motherboard for the Dell Optiplex 760. The system is old and the motherboard is hard to find; he hopes to solve this soon.

Raja

Raja has added an optional feature to exclude water areas from the acceptable area. This reduces the error (proportional to area), but sometimes leads to a less accurate centroid (of the US sites, only 11 had water; the centroid improved for 5 and got worse for 6). Currently it is only available for N. America. Here is a confluence page showing a few examples: TULIP AIG with water exclusion

The number of working, usable landmarks is now up to 340. 

Non-responding beacons have been replaced with working hosts. The table below shows the changes made:

 

Beacon not working | Country | Possible Replacement (in node details) | Replaced
AM.SCI.N1 | Armenia | AM.HRAPARAK.N1 | Yes
CM.CAMNET.N1 | Cameroon | CM.MINEPAT.GOV.WWW | Yes
DZ.UNIV-SBA | Algeria | None (all nodes are down) |
EC.IMPSAT.NET.N1 | Ecuador | EC.FDE.N1 | Yes
GOV.FNAL.N1 | US | EDU.BU.N1 | Yes
IL.WEIZMANN.AC.N1 | Israel | IL.TAU.AC.N1 | Yes
JO.SESAME.ORG.N1 | Jordan | (M)JO.ASPU.EDU.N1 | Fixed
JP.KEK.N1 | Japan | JP.APAN.NET.N2 | Disabled (already have another beacon)
KH.CAMNET.COM.N1 | Cambodia | KH.BELTEI.EDU.N1 | Yes
NA.ADSL-ISP.COM.N1 | Namibia | NA.AGRINAMIBIA.COM.N1 | Yes
NZ.WAIKATO.AC.N3 | New Zealand | NZ.AIC.AC.N1 | Yes
RW.KIST.AC.N1 | Rwanda | None (all nodes are down) |
SE.SU.N1 | Sweden | None |
ZM.AISHA.AC.N1 | Zambia | ZM.ZCUNI.EDU | Yes


The Jordan monitor had been down since 12th June. It has been fixed after emailing the contact.

Raja leaves SLAC early August to return to Pakistan.

PingER at SLAC

Les requested an update from Yahoo about TULIP's geolocation. They answered "We are very much interested in getting IP triangulation at internet scale, we will have internal sync-up on how we can leverage this initiative if there is rate limit and get back. Regarding opening up yahoo sites for deploying ping server requires some more time to discuss this with relevant stake holders with in yahoo." No word since; a reminder was sent 5/19/2014. No response 6/4/2014, 6/25/2014.

Les sent email to Google as follows: "I would like to bring to your attention that we have developed a geolocation tool using delay based (using RTTs from known ping server landmarks) distance estimates to triangulate the location of an IP host target. The tool is accessible at: http://www-wanmon.slac.stanford.edu/cgi-wrap/reflex.cgi. We have identified that the accuracy of the geolocation is directly related to the landmark density (e.g. # of landmarks/ million sq km). The higher the density the smaller the error and the fall off is exponential. We currently have over 1000 registered landmarks, of which at any given time ~300 are working. The tool not only finds the location of the target, it also gives an estimated error. To the best of our knowledge it is the only freely available delay based measurement geolocation service publicly available today. A drawback (compared to database methods such as those based on GeoMind) is the time taken to make the measurements. We have worked on this from many directions including parallelization of the ping requests, caching, tiering to get the rough location (i.e. region of the world) then zooming in using all landmarks in the region. We are putting together a publication on this." Les sent an update to his contact at Google 6/23/2014, stressing the applicability to traceroute visualization. No response 7/13/2014.
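As a rough illustration of the delay-based idea described in the email (not TULIP's actual code), the sketch below converts a minimum RTT into an upper bound on distance and picks a region from a first tier of landmarks before zooming in. It assumes signals travel at about 200 km per millisecond in fiber (roughly two thirds of the speed of light in vacuum); the function names and the region and landmark labels are hypothetical.

```python
FIBER_KM_PER_MS = 200.0  # assumed propagation speed in fiber, ~2/3 of c


def max_distance_km(min_rtt_ms):
    """Upper bound on landmark-to-target distance implied by a minimum RTT.

    The one-way delay is at most half the RTT; real paths are longer than
    the great-circle distance, so this is a bound rather than an estimate.
    """
    return (min_rtt_ms / 2.0) * FIBER_KM_PER_MS


def pick_region(tier0_rtts):
    """Tiering step: choose the region whose first-tier landmark has the lowest RTT."""
    return min(tier0_rtts, key=tier0_rtts.get)


def distance_constraints(landmark_rtts):
    """Map each landmark in the chosen region to its distance bound in km."""
    return {name: max_distance_km(rtt) for name, rtt in landmark_rtts.items()}
```

The target must lie inside the intersection of the circles given by these bounds, which is why a higher landmark density shrinks the error.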

Old Items

Linked Open Data

Renan finished the new pingerlod web site. The main change is that the info texts are now kept in a separate file, so they should be much easier to modify. The new version has been loaded on the server and some text added to describe how to use the map. However, there is a bug that prevents it from executing the map; Renan reports that the bugs should be easy to fix. He has talked to his professor, who suggested trying RDF Owlink, which should give faster responses to queries. Renan will research this. It will probably mean reloading the PingER data, so it is a lot of work, but hopefully it will improve performance. Before the rebuild he will make the fixes and provide a new WAR for us to load on pingerlod.slac.stanford.edu. He is also working on his thesis and on documentation: he has finished the ontology and has a nice interactive tool for visualizing it. Since the ontology is the core of the data model of our semantic solution, this will be very helpful for anyone who uses our system, whether as a developer or as a possible user. Bebo pointed out that to get publicity and for people to know about the data, we will need to add pingerlod to lod.org.

Things he will soon do regarding documentation:

  1. A task/process flow describing all Java classes involved in all those batch jobs;
  2. A Javadoc <http://www.oracle.com/technetwork/java/javase/documentation/index-jsp-135444.html> which will explain all classes and how they are used.

For the Linked Open Data / RDF, which is in pre-alpha days, you can go to http://pingerlod.slac.stanford.edu. As can be seen, this page is not ready for prime time. However, the demos work as long as one carefully selects what to look at:

  • Click on Visualizations, there are two choices:
    • Multiple Network Metrics: Click on the image: gives a form, choose from Node pinger.slac.stanford.edu pinging to www.ihep.ac.cn, time parameters yearly, 2006 2012, metrics throughput, Average RTT, Packet loss and display format Plot graph, then click on submit. In a few seconds a time series graph should come up. Mouse over to see details of values at each x value (year).
    • A mashup of network metrics x university metrics Click on image: gives another form, pinging from pinger.slac.stanford.edu, School metric number of students, time metric years 2006 2012, display format plot graph, click on submit. Longer wait, after about 35 seconds a google map should show up. Click on "Click for help." Area of dots = number of students, darkness of dots = throughput (lighter is better), inscribing circle color gives university type (public, private etc.) Click on circle for information on university etc.
  • Renan will be working on providing documentation on the programs, in particular the install guide for the repository and web site etc. This will assist the person who takes this over. 

Renan is using OWLIM as the RDF repository. He is using an evaluation version right now. Renan looked into the price for OWLIM (that excellent RDF Database Management System he told us about). It would cost a minimum of 1200 EUR (~1620 USD, according to Google's rate for today) for a one-time perpetual license. It seems too expensive. No wonder it is so good. Anyhow, he heard about a free alternative, but is not sure how good it would be for our PingER data. He will try it out and evaluate it. He will also get a new evaluation of the free OWLIM Lite.

He has also made some modifications on the ontology of the project (under supervision of his professor in Rio) hence he  will have to modify the code to load the data accordingly.

Renan has provided a 4 page Appendix on PingERLOD to the ICFA report.  This is also available at PingER LOD Overview

Raspberry Pi

A quick comparison of the performance of the two hosts (raspberry pi and regular UNIMAS host) without statistical quantification is available at https://confluence.slac.stanford.edu/display/IEPM/Comparison+of+PinGER+RTTs+from+UNIMAS+monitors+N4+and+RASPBERRY.  A page has been created to compare the hardware spec between the pinger.unimas.my node (Intel architecture) and the pinger2.unimas.my node (Raspberry Pi ARM architecture), available from the unimas pinger website at http://pinger.unimas.my/pinger/hardware.php. There is a link to hardware.php in the Comparison+of+PinGER+RTTs+from+UNIMAS+monitors+N4+and+RASPBERRY web page.

NUST

At the Connect Asia Pacific Summit in Bangkok in January, Shahryar saw the project "Mapping the pan Asia Pacific information Superhighway and closing gaps in infrastructure connectivity" and found it very much related to the work in the PingER project. So Shahryar sent email to a UN agency about a possible collaboration with them on the PingER project. He has heard nothing, so he will write a detailed proposal and then contact them again. No update 2/5/2014, 3/5/2014.

Tulip
Follow up from workshop
  • Hossein Javedani of UTM is interested in anomalous event detection with PingER data. Information on this is available at https://confluence.slac.stanford.edu/display/IEPM/Event+Detection. We have sent him a couple of papers and how to access the PingER data. Hossein and Badrul have been put in contact. Is there an update Badrul?

The next step in funding is to go for bigger research funding, such as LRGS or eScience. Such proposals must lead to publications in high quality journals. They will need an infrastructure such as the one we are building. We can use the upcoming workshop (1 specific session) to brainstorm and come up with such a proposal. We need to do some groundwork before that as well. Johari will take the lead in putting together 1/2 page descriptions of the potential research projects.

  1. Need to identify a few key areas of research related to the PingER Malaysia Initiative; these can be shared/publicized through the website. These might include using the infrastructure and data for: anomaly detection; correlation of performance across multiple routes; and GeoLocation. The future projects Les listed in Confluence here: https://confluence.slac.stanford.edu/display/IEPM/Future+Projects can also be a good start, as can Bebo's suggestion.
  2. Need to synchronize and share research proposals so as not to duplicate research work. How to share? Maybe not through the website, or maybe we can create a members-only section of the website to share sensitive data such as research proposals.

Anjum suggested Saqib,  Badrul and Johari put together a paper on user experiences with using the Internet in Malaysia as seen from Malaysian universities. In particular round trip time, losses, jitter, reliability, routing/peering, in particular anomalies, and the impact on VoIP, throughput etc.  It would be good to engage someone from MYREN.

Potential projects

See list of Projects

Future meeting  - Les

Next meeting: Wednesday, August 20th, 2014, 9:00pm Pacific Daylight Time; Thursday, August 21st, 2014, 9:00am Pakistan time; Thursday, August 21st, 2014, 12:00 noon Malaysian time; Thursday, August 21st, 2014, 1:00am Rio Standard Time.

Coordinates of team members:

See: http://pinger.unimas.my/pinger/contact.php
