Wednesday  Aug 20th,  2014 9:00pm Pacific Standard Time, Thursday Aug 21st 2014 9:00am Pakistan time, Thursday Aug 21st, 2014 12:00 noon Malaysian time, Thursday Aug 21st, 2014 1:00am Rio Standard Time.

Attendees

Invitees:

Anjum-, Hassaan Khaliq, Kashif+, Raja+,  Johari+, Nara, Adnan+, Abdullah, Badrul, Ridzuan, Ibrahim+, Hanan, Saqib+, Adib-, Les+, Renan, Bebo+

+ Confirmed attendance

- Responded but  Unable to attend: 

Actual attendees:

Kashif, Raja, Johari, Adnan, Les, Bebo

Administration

  • The connectivity was dreadful, lots of noise, restarted several times.  Maybe we should try Google hangouts.

  • Anjum reports (6/23/2014) that "the proposal for conference has been submitted for approval and Pinger has been added in the agenda. Travel expenses for Les and Bebo have also been included in the conference proposal. We are awaiting the proposal approval." 8/16/2014: The faculty management at UM just changed and many matters required urgent attention. Abdullah will be able to update us once he gets a chance to see the Vice Chancellor. Once the approval is given, the venue for the conference can be at UM or UUM.


    As discussed earlier, the only twist here is that PingER will be seen as a case study for big data. This is good in the sense that people interested in doing research in the domain of big data can deploy PingER monitoring nodes at their respective universities/organizations and, in return, play around with the data. We agreed that the 25th looked like a good day for the PingER workshop. Les should be able to make it from Burkina Faso, and Bebo should be able to get back to the US for Thanksgiving. There would be back-to-back presentations: Les on how PingER gathers and archives data, what data there is, the data types, how to access it, etc., followed by Bebo on Google Tools for Big Data.

  • Anjum suggested putting together a paper on metrics provided by PingER for SIGMETRICS. The due date is in November.

Renan

Les met with Renan and his supervisor (Maria Luiza Campos). The minutes are at: https://confluence.slac.stanford.edu/display/IEPM/20140703+Meeting+between+UFRJ+and+SLAC

Luiza has set up a small project in the UFRJ Reference center to provide big data analysis/mining of PingER multidimensional data

Luiza has proposed three approaches to provide big data analysis/mining of PingER multidimensional data:

  1. Conventional. Utilization of Pentaho environment to handle big multidimensional data, which enables utilization of enhanced user interfaces.
  2. Linked Data. Benchmarking of more sophisticated Triple Stores than the one we use today at PingER LOD (Sesame). Preferably, we should analyze parallel and distributed solutions. CumulusRDF is an example.  Utilization of Greenplum (http://en.wikipedia.org/wiki/Greenplum). This is an intensive high performance database from EMC with many features such as caching. It is partly from the EMC acquisition of Pivotal. There is also a DBMS called Grindplan that explores lots of features using Pivotal.
    1. Renan is investigating an alternative to Hadoop, which utilizes a Scientific Workflow Management System and makes use of Map/Reduce paradigm to help both querying and provenance of the Linked Data (RDF) data.
    2. Ibrahim is investigating an approach that utilizes Hadoop Map/Reduce in a Key/Value store with PingER data in RDF.

Following the last meeting, Les made examples of PingER data available via FTP. There are two types:

  1. Raw data as gathered daily from all the monitoring hosts. This data is measured at 30-minute intervals and is quite dirty.
  2. Analyzed data by metric. This has been cleaned up. Les recommends UFRJ use the cleaned-up data.

The instructions for the data were also sent, as well as size estimates and information on how PingER data has been used. Also see PingER data flow at SLAC.

Maria and Renan are advancing in some approaches to deal with PingER data, making it easier to be analyzed and integrated. In particular they have been busy studying and evaluating alternatives, analyzing results from the latest benchmarks on NoSQL (including RDF and graph based storages) database management, distributed processing and mediated  solutions over relational databases, and also other experiments with multidimensional analyses on Linked Data.  The new students involved are now understanding better the scenario and they have been interacting with Renan regularly. 

They have separated the tasks into 2: 

  1. Quantitative analysis on PingER data
    1. They want to know how PingER has grown, since 1998 until today and how it might be in the next years. By doing this, we may focus on more suitable technologies that deal with scenarios that have a similar profile with PingER.
      1. Two students are working on this.
  2. Approaches to handle PingER current data
    1. Conventional approach – Utilization of Cassandra as back-end database to provide easy crossing of parameters to get PingER data.
      1. One student is working on this.
    2. Distributed and parallel approach – Utilization of a data warehouse on top of a distributed file system to provide low latency response to complex queries (like the ones that could not be done in Renan's previous work). Additionally, how Scientific Workflow Management Systems may help in the ETL process of transforming PingER data so it can easily be stored in the data warehouse.
      1. Renan is working on this.
    3. Pure RDF approach – Good ways of modeling and natively storing RDF data.
      1. Maria-Luiza is working on this.
    4. NoSQL approaches – How other NoSQL DBMS may be adequate for PingER multidimensional data.
      1. Two students are evaluating existing NoSQL solutions for multidimensional scenarios (such as PingER)
    5. Key-Value storages for PingER data in RDF
      1. This is Ibrahim’s work.

In the end, they want to compare all these approaches.

UM

The ping server at http://pinger.fsktm.um.edu.my/cgi-bin/traceroute.pl?target=www.slac.stanford.edu&function=ping gives "ping server busy at the moment. Please try again later." Someone with access to the web servers should look at that (e.g. review the web logs). Maybe it is being hit with a lot of simultaneous requests. If they are coming from SLAC we may want to look at reflector.cgi.

Badrul (6/23/2014) is still waiting to hear from his student (Abdulrahim Haroun Ali, who is out of the country) on the paper on anomalies in PingER measurements and will update later once the paper is ready. For the moment the paper is not ready. No update 8/20/2014.

Ridzuan has put together a rough proposal to use Hadoop to store and make available PingER data. He registered for the MYREN cloud services last month but has still not received approval to use them. He will follow up with them again. For the Hadoop implementation, he is considering the Hortonworks Data Platform (HDP2); however, there are some problems with the latest installation because UM adopted IPv6. Most of the HDP2 repositories reside on IPv4 servers, which makes it difficult to install correctly on the UM server. He is trying to use another platform or find a way to solve this installation problem. No update 8/20/2014.

Ibrahim Abaker  is planning to work on a topic initially entitled " leveraging pingER big data with a modified pingtable for event-correlation and clustering".  Ibrahim has a proposal, see https://confluence.slac.stanford.edu/download/attachments/17162/leveraging+pingER+big+data+with+a+modified+pingtable+for+event-correlation+and+clustering.docx. Ibrahim reports 7/15/2014 "I have spent the last few months trying to understand the concept of big data storage and its retrieval as well as the traditional approach of storing RDF data. I have integrated a single hadoop cluster in our cloud. but for this project we need multiple clusters, which I have already discussed with Dr. Badrul and he will provide me with big storage for the experiment." No Update 8/20/2014.

"I have come up with initial proposed solution model. This model consists of several parts. The upper parts of the Figure below shows the data source, in which PingER data will be convert into RDF format. Then the data pre-processor will take care of converting RDF/XML into N-triples serialization formats using N-triples convertor module. This N-triple file of an RDF graph will be as an input and stores the triples in storage as a key value pair using MapReduce jobs"

[Figure: Ibrahim's proposed solution model; image not included]

Les forwarded by email the information from Ibrahim to Renan following the meeting.
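The last stage of the model Ibrahim describes (N-Triples in, key-value pairs out, via map and reduce steps) might be sketched roughly as follows. This is only an illustration in plain Python, not Ibrahim's implementation and not a Hadoop job: the function names, the choice of the subject as the key, and the sample triples are all assumptions, and the parser handles only simple N-Triples lines whose subject is an IRI.

```python
import re
from collections import defaultdict

# Matches simple N-Triples lines of the form: <subject> <predicate> object .
# Simplification (assumption): subjects are IRIs, blank nodes are not handled.
TRIPLE_RE = re.compile(r'^\s*(<[^>]*>)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def map_triple(line):
    """Map step: emit (key, value) pairs, keyed by the triple's subject."""
    match = TRIPLE_RE.match(line)
    if match is None:
        return []  # skip blank lines, comments and unsupported syntax
    subject, predicate, obj = match.groups()
    return [(subject, (predicate, obj))]


def reduce_pairs(pairs):
    """Reduce step: group all (predicate, object) values under their key."""
    store = defaultdict(list)
    for key, value in pairs:
        store[key].append(value)
    return dict(store)


def build_store(lines):
    """Run the map step over every line, then reduce into one key-value store."""
    pairs = []
    for line in lines:
        pairs.extend(map_triple(line))
    return reduce_pairs(pairs)


# Hypothetical sample triples, not real PingER data.
SAMPLE = [
    '<http://example.org/m/1> <http://example.org/p/rtt> "215.9" .',
    '<http://example.org/m/1> <http://example.org/p/from> <http://example.org/node/slac> .',
    '# a comment line',
    '<http://example.org/m/2> <http://example.org/p/rtt> "24.5" .',
]
store = build_store(SAMPLE)
```

Keying on the subject keeps all properties of one measurement together; a different key (for example subject plus predicate) would suit different queries.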

UNIMAS

Johari is unable to attend this Skype meeting. Dr. Adnan Shahid Khan, who recently joined UNIMAS, will represent UNIMAS. Adnan is coming up to speed. Adnan is on the pinger-my email list as of April 25, 2014. Adnan met yesterday with Johari.

Johari says there is no progress on the following, the student may take up some of these issues after Ramadan:

...

Pinger 2 (Raspberry Pi) is working: the ping server, making PingER measurements and gathering data are all successful. A next step will be to see if it is reliable and if there are significant differences between it and the pinger host at UNIMAS.

The tool to enable synchronizing Malaysian

...

sites: added the request from Saqib to sort the sites by country. Also added another page to view statistics of sites by country. Completed troubleshooting and fixed issues with the form when inserting and updating records. The new page is available from the following page (two links on top of the table): http://pinger.unimas.my/pinger/sites.php

Traceroute server: Status unsolved. The problem is the same on Pinger2. Johari talked to the network administrator at the centre about this issue and he suggested talking to the security manager to check whether the firewall is blocking the ICMP packets from the traceroute command (to do list).

...

The traceroute server at http://pinger2.unimas.my/cgi-bin/traceroute.pl has the same problem as before. They know (sort of) the problem but haven't had the chance to rectify it (a NAT address mapping needs to be added). There is no progress 12/4/2013, 1/8/2013, 1/22/2014, 2/5/2014, 3/26/2014, 4/8/2014, 4/23/2014, 6/4/2014. Now that the historical traceroutes are working for UM (see below) there is an extra incentive to get the reverse traceroute working at UTM and UNIMAS.

Custom ISO: He can get as far as the boot screen, but is unable to get to the desktop. No progress 2/5/2014, 4/23/2014, 6/4/2014. They will start work on it next week.

Johari has created a shell script to automate the installation of the PingER package on Ubuntu/Fedora/CentOS Linux distros. The finalized installation script for the PingER software on CentOS/Fedora-based distributions is available from the following URL (past the step-by-step tutorial): http://pinger.unimas.my/pinger/install-tutorial.php. Kashif assisted in testing the script.

Johari has a research student who finalized a proposal in order to officially apply for his master's. He will start in February. He is currently working on threshold/anomaly detection, and will extend to correlating performance over multiple routes. He will share the proposal with Les and others in April. No progress 6/4/2014.

Research: one student is currently doing a master's by research on the PingER project. Progress is a bit slow since the student lacks the technical and programming skills to implement a potential solution. Will also supervise another Advanced Project (master's by coursework) this coming September 2014. Planning to investigate whether the two PingER monitoring hosts (the pinger and pinger2 nodes at UNIMAS) differ in their data collection.

UTM 

Saqib met with MYREN, who have made many topology changes. Saqib will also incorporate these into the Malaysian case study. He is seeing anomalously long delays between mainland Malaysia and Sarawak. It does not appear to be due to congestion. We need to understand the routing and which undersea cables are being used. Saqib will send more details after the meeting. He will also contact MYREN.

Saqib has talked to MYREN. They say the routers at hops 5 and 6 in the traceroute from UM to UNIMAS are both at UM and the long delay between them is due to congestion. Les is skeptical since hops 6-12 have similar RTTs and hop 12 is near Kuching. Les suggested Saqib run mtr for a day or more from UM to UNIMAS to see if there is any day/night variation in RTT. If the minimum RTT gets down to <2 ms, then MYREN is right. If it stays at ~50 ms and persists for several days, then it does not look like congestion (which should vary between day and night as the number of users changes). In that case it appears hop 6 is physically close to hop 12, since they have the same minimum RTT. Taken together with the email from UM, Les would be very suspicious of the MYREN statement. Is there an update?
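The day/night check suggested above could be sketched along these lines. This is a minimal illustration, not a parser for mtr output: it assumes the per-hop RTT samples have already been extracted into (hour, hop, rtt_ms) tuples, the 08:00 to 20:00 daytime window is an arbitrary assumption, and the sample values are made up. The 2 ms threshold is the one mentioned in the discussion.

```python
from collections import defaultdict


def min_rtt_by_period(samples, day_hours=range(8, 20)):
    """Return {hop: {"day": min_rtt, "night": min_rtt}} from (hour, hop, rtt_ms) samples.

    day_hours is an assumed daytime window (local hours 8..19).
    """
    result = defaultdict(dict)
    for hour, hop, rtt_ms in samples:
        period = "day" if hour in day_hours else "night"
        previous = result[hop].get(period)
        if previous is None or rtt_ms < previous:
            result[hop][period] = rtt_ms
    return dict(result)


def looks_like_congestion(day_min, night_min, floor_ms=2.0):
    """Congestion is plausible only if the minimum RTT drops below floor_ms
    at some time of day; a minimum that never drops points to path length."""
    return min(day_min, night_min) < floor_ms


# Made-up samples for illustration only: (hour of day, hop number, RTT in ms).
SAMPLES = [
    (10, 5, 1.4), (14, 5, 3.0), (2, 5, 1.2),
    (10, 6, 52.1), (14, 6, 50.3), (2, 6, 50.1), (3, 6, 51.0),
]
mins = min_rtt_by_period(SAMPLES)
hop6_congested = looks_like_congestion(mins[6]["day"], mins[6]["night"])
```

With samples like these, hop 6 never gets near 2 ms by day or by night, which is the pattern that would argue against the congestion explanation.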

Saqib's proposal to FRGS received a requested revision, and Anjum was going to help edit it. Saqib has updated and re-submitted the proposal to FRGS and sent a copy to Anjum and Les.

Saqib also reports there is a change this month in the traceroute from SLAC to UNIMAS. Previously, it went via TEIN3. Further, it did not touch the MYREN network. However, the traceroutes from SLAC to UTM and UM remain the same. Johari is going to look at it.

Les did a binary search (using http://www-wanmon.slac.stanford.edu/cgi-wrap/traceroutearchive.cgi?from=www-wanmon.slac.stanford.edu&to=pinger.unimas.my&date1=2014_06_17&date2=2014_06_18&date3=2014_06_19). It appears the change happened on Wed 2014_06_18 (there is actually no traceroute measured that day). Les has no idea of the significance; Saqib may need to check with his contacts in Malaysia. Traceroute from UM to SLAC:

...

6 eqxsjrt1-ip-a-sunncr5.es.net (134.55.38.146) [AS293] 24.495 ms
7 sj-igw01.tm.net.my (206.223.116.120) [AS4637] 2.668 ms
8 *
9 58.26.240.62 (58.26.240.62) [AS4788] 215.915 ms

BTW you can examine the geography of the above routes by cutting and pasting them into http://www-wanmon.slac.stanford.edu/cgi-wrap/reflector.cgi?function=vtrace
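The binary search over dated traceroutes that Les describes above could look roughly like the sketch below. It assumes the archived routes have already been fetched into a dictionary keyed by date (days with no measurement are simply absent) and that the route changed once and stayed changed. The function name and the hop labels are hypothetical placeholders, not real archive data.

```python
from datetime import date


def find_route_change(routes):
    """Binary-search a {date: hop_list} archive for a single route change.

    Returns (last_day_with_old_route, first_day_with_new_route), or None
    when the first and last measured routes are identical.
    Assumes the route changed once and did not change back.
    """
    days = sorted(routes)
    baseline = routes[days[0]]
    if routes[days[-1]] == baseline:
        return None
    lo, hi = 0, len(days) - 1  # invariant: days[lo] is old route, days[hi] is new
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if routes[days[mid]] == baseline:
            lo = mid
        else:
            hi = mid
    return days[lo], days[hi]


# Hypothetical hop labels for illustration; 2014-06-18 has no measurement.
OLD_ROUTE = ["hop-a", "hop-b", "old-transit"]
NEW_ROUTE = ["hop-a", "hop-b", "new-transit"]
ROUTES = {
    date(2014, 6, 15): OLD_ROUTE,
    date(2014, 6, 16): OLD_ROUTE,
    date(2014, 6, 17): OLD_ROUTE,
    date(2014, 6, 19): NEW_ROUTE,
    date(2014, 6, 20): NEW_ROUTE,
}
change = find_route_change(ROUTES)
```

When the two returned days are not adjacent, as here, the change can only be bracketed to the gap between them, which matches the situation where no traceroute was measured on the day in between.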

UUM

Regarding the monitoring host in UUM, Adib has assigned one student to prepare the configuration/installation plan including how to secure their host from attack. He has a public IP address. He needs to do the DNS registration by Sunday 25th May or Monday. He is in the last stage of working with the Computer Center. Adib requested Johari to share the UNIMAS settings so it is easier for the student to follow. No update 6/5/2014, 6/25/2014.

...

Adib reports they completed the installation of the PingER server at UUM. However, the second-hand (old) machine is not working properly: it suddenly shuts down or restarts.

NUST

Installation is in progress for the Bahawalpur site. The install is complete but needs approval from the head; hopefully it will be up on Monday.

The following are now up and running:

  • buitms.seecs.edu.pk is working (the load shedding problem is much better now). The server time was also incorrect (almost 24 hrs difference), which has now been corrected.

Kashif is working on:

  • pinger.kohat.edu.pk: still trying to find a motherboard for the Dell Optiplex 760. The system is old and the motherboard is hard to find; hope to solve soon. Currently the name is not resolving.

  • host pinger.nwfpuet.edu.pk, whose name was not resolving, is working now.

  • there was no data this month from pingerfsbd.pern.edu.pk and the host was unpingable from SLAC; it is working now.

  • sau.seecs.edu.pk was ungatherable (and unpingable) since Aug 7th, and prior to that it was unreliable; it is working now.

  • www.upesh.edu.pk was pingable from SLAC but data could not be gathered from SLAC. The problem has been resolved by replacing the files with new ones and by giving full rights. Ping to is OK and data is also being collected now; however, ping from is giving a permission denied error.

Raja

Raja is back in Pakistan.

...

Anjum suggested Saqib,  Badrul and Johari put together a paper on user experiences with using the Internet in Malaysia as seen from Malaysian universities. In particular round trip time, losses, jitter, reliability, routing/peering, in particular anomalies, and the impact on VoIP, throughput etc.  It would be good to engage someone from MYREN.

Potential projects

See list of Projects

Future meeting  - Les

Next meeting Wednesday September 17th, 2014 9:00pm Pacific Standard Time, Thursday September 18th, 2014 9:00am Pakistan time, Thursday September 18th, 2014 12:00 noon Malaysian time, Thursday September 18th, 2014 1:00am Rio Standard Time.

Coordinates of team members:

...