Time & date 

Wednesday Oct 15th, 2014 9:00pm Pacific Standard Time, Thursday Oct 16th, 2014 9:00am Pakistan time, Thursday Oct 16th, 2014 12:00 noon Malaysian time, Thursday Oct 16th, 2014 1:00am Rio Standard Time.

Attendees

Invitees:

Anjum+, Hassaan Khaliq, Kashif, Raja,  Johari, Nara, Adnan, Abdullah, Badrul, Ridzuan, Ibrahim, Hanan, Saqib+, Adib+, Les+, Renan, Bebo+

+ Confirmed attendance

- Responded but  Unable to attend: 

Actual attendees:

Adib, Saqib, Raja, Les.

The quality of the connections was very poor, with loud background noise at times and many attempts to set up the call or add people. Not all attendees were on at the same time. Les had the wrong Skype address for Anjum, so Anjum never got a call.

After the meeting we added Samad at SEECS to the list of invitees.

Les gathered more updates by email after the meeting and included them in the minutes.

Administration

  • With regard to the PingER workshop, UUM has proposed to host a PingER workshop in conjunction with the MYREN Seminar to be held on the 26th of November at Berjaya Times Square. Our workshop will be hosted by the MYREN InterNetWorks SIG, and this is an opportunity to attract participants from all MYREN SIGs.
  • Anjum suggested putting together a paper on metrics provided by PingER for SIGMETRICS. The due date is in November. Does someone want to take the lead - Anjum?
  • Anjum and Raja have been working on a paper on Geolocation as developed for TULIP.

UUM

The RTTs from SLAC to UUM show plateaux which look like route changes. As of 10/13/2014 pinger.uum.edu.my was not pingable from SLAC. Adib agreed to fix after the meeting.

Renan

Renan and Christiane have analyzed the data in terms of the number of files, volume, missing files, files with no data etc.  They also have the breakdown by year.

UM

Badrul (6/23/2014) is still waiting to hear from his student (Abdulrahim Haroun Ali, who is out of the country) about the paper on anomalies in PingER measurements, and will update later once the paper is ready. For the moment the paper is not ready. No update 8/20/2014, 10/16/2014.

Ridzuan and Ibrahim both requested additional space. We require an infrastructure update for the cloud because the existing storage space is not enough, given the number of other users. The 5 terabyte SAS along with some blade servers all have minimum 1Gig interfaces. We are in the process of purchasing the switch to support the networking. Once it is available, we shall be able to provide the capacity for one instance of SLAC data.

UNIMAS

Pinger 2 (Raspberry Pi) has been running successfully since Sept 2nd. See  ePingER project Malaysia for recent plots of data from pinger and pinger2.unimas.my (Raspberry Pi).  

Traceroute server: Status unsolved. The problem is the same on Pinger2. Johari talked to the network administrator at the centre about this issue, and he suggested talking to the security manager to check whether the firewall is blocking the ICMP packets from the traceroute command (to do list). No progress 9/17/2014.

Custom ISO: He can get as far as the boot screen, but is unable to get to the desktop. They have started work on it, but the student is still unable to boot the ISO (9/17/2014).

Research: one student is currently doing a Master by research on the PingER project. Progress is a bit slow since the student lacks the sound technical and programming skills needed to implement a potential solution. Johari will also supervise another Advanced Project (Master by coursework) this coming Sep 2014. Planning to investigate whether the two PingER monitoring hosts differ in terms of data collection (pinger and pinger2 nodes in UNIMAS).
They looked at the potential projects and selected two. They are putting together a framework for anomaly detection. They are interested to know of any more projects.

UTM

After revision the FRGS proposal was submitted to RMC. It is under review, expect feedback at the end of October.

Saqib is updating the case study from time to time. It just needs some formatting. He will send it out in the coming week. This is needed for the workshop.

pinger.fsktm.utm.my can't traceroute.pl or ping_data.pl

Saqib will see if he can modify the PingER monitoring at UM to monitor hop 6 of the traceroute from UM to UNIMAS. He does have access to the PingER monitor at UM. We will be looking for any patterns in the big increases in RTT/congestion.
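One way to look for such patterns, once RTT samples for the hop are available, is to group them by hour of day and see which hours have a high median RTT. The following is only a minimal sketch under stated assumptions: the function names, the 20 ms threshold and the sample values are hypothetical and are not taken from PingER or from real measurements.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import median


def median_rtt_by_hour(samples):
    """Group (unix_timestamp, rtt_ms) samples by UTC hour of day.

    Returns a dict {hour: median RTT in ms}, ordered by hour.
    """
    buckets = defaultdict(list)
    for ts, rtt_ms in samples:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        buckets[hour].append(rtt_ms)
    return {hour: median(rtts) for hour, rtts in sorted(buckets.items())}


def congested_hours(samples, threshold_ms):
    """Return the UTC hours whose median RTT exceeds threshold_ms."""
    return [h for h, m in median_rtt_by_hour(samples).items() if m > threshold_ms]


# Hypothetical samples: (Unix time, RTT in ms). Not real measurements.
samples = [
    (0, 2.0), (600, 3.0),                        # around 00:00 UTC
    (3600 * 9, 45.0), (3600 * 9 + 600, 49.0),    # around 09:00 UTC
]
print(median_rtt_by_hour(samples))
print(congested_hours(samples, threshold_ms=20.0))
```

The median is used rather than the mean so that a few very slow replies do not dominate an hour's value. The hours are in UTC and would need shifting to Malaysian time for interpretation.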

NUST

Dr. Arshad Ali (the Director General of SEECS) and the Rector of NUST visited SLAC. We discussed several issues including a graduate student for 6 months to work on high performance data transfer together with a start up company.

We are unable to gather data from cemb (can't ping), uetaxila (can't ping), pingerisl (can't ping) and upesh (can ping). Samad responded that cemb would be up in a day (this was successfully completed 10/17/2014). Pingerisl and uetaxila need a fresh installation, but there is some issue with their access. He will fix them as soon as possible.

PingER at SLAC

The format of GeoIPTools (this provides country, city, and latitude/longitude information for hosts) changed, breaking some of the PingER analysis scripts. This has been fixed.

Need to prepare  presentations for Burkina Faso, in particular updating presentations for Africa.

Need to update information for Anjum for the workshop.

Future meeting  - Les

Next meeting: Les is away from Nov 5 - Nov 24 getting back from Africa just in time for Thanksgiving. Wednesday December 10th  2014 9:00pm Pacific Standard Time, Thursday December 11th  2014 9:00am Pakistan time, Thursday December 11th 2014 noon Malaysian time, Thursday  December 11th, 2014 01:00am Rio Standard Time.

Old Items

Traceroute from UM to UNIMAS

According to Professor Francis Lee, SingAREN SLIX core router is a key node for international research and education networks – including APAN, GLORIAD, Internet2, and TEIN – and peers directly with Australia’s AARNet and Japan’s NII and NICT networks. The first 100 GbE international connection is likely to be made within the next year as a result of a US funding call for a 100 Gbps research network link to Asia.  PingER may be very valuable for seeing the impact.

Saqib has run mtr for more than 3 days from UM to UNIMAS, UTM to UM, and UTM to UNIMAS. From UM to UNIMAS, the best RTT for hops 6 to 9 gets down to ~2ms, from ~49ms in the worst case. However, for hop 10 it stayed at ~42ms, as shown in the attached figure (UM-UNIMAS). Thus it appears that the MYREN guy is right that there is much congestion at hop 6, and possibly at hops 7-9, for most of the time. Below are the UM-UNIMAS mtr results.

It sounds like the MYREN guy is right that there is significant congestion, at least at hop 6. I wonder if we can get a better handle on this by monitoring hop 6 from UM and seeing the time periods when the congestion occurs. Someone would need to add to pinger.xml <HostList> something like:

<Host>
  <Alarm>
    <TimeOfFirstFailure>1410091899</TimeOfFirstFailure>
  </Alarm>
  <DnsLastChecked>1410982759</DnsLastChecked>
  <IP>203.80.23.73</IP>
  <Name>te-0-3-0-0.drc96.jaring.my</Name>
</Host>

It is possible it may not respond to pings which will make this not useful. Check that first. I notice from SLAC:

 249cottrell@pinger:~$ping 203.80.23.73

PING 203.80.23.73 (203.80.23.73) 56(84) bytes of data.
^C
--- 203.80.23.73 ping statistics ---
55 packets transmitted, 0 received, 100% packet loss, time 54850ms

 However: 

 250cottrell@pinger:~$ping te-0-3-0-0.drc96.jaring.my
PING te-0-3-0-0.drc96.jaring.my (61.6.51.2) 56(84) bytes of data.
64 bytes from te-0-3-0-0.drc96.jaring.my (61.6.51.2): icmp_seq=1 ttl=242 time=192 ms
64 bytes from te-0-3-0-0.drc96.jaring.my (61.6.51.2): icmp_seq=2 ttl=242 time=194 ms
64 bytes from te-0-3-0-0.drc96.jaring.my (61.6.51.2): icmp_seq=3 ttl=242 time=193 ms
^C
--- te-0-3-0-0.drc96.jaring.my ping statistics ---
4 packets transmitted, 3 received, 25% packet loss, time 3193ms
rtt min/avg/max/mdev = 192.982/193.542/194.490/0.763 ms

I.e. the name refers to a different IP address.
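Before adding a host to pinger.xml, the mismatch between the configured IP and the address the name resolves to could be checked automatically. The following is a minimal sketch, not part of PingER: the function name check_host_entry is hypothetical, and a stub resolver holding the lookup result quoted above is used so that the example runs without network access.

```python
import socket


def check_host_entry(name, configured_ip, resolver=socket.gethostbyname):
    """Compare the IP configured for a host with what its name resolves to.

    Returns (matches, resolved_ip); resolved_ip is None if the lookup fails.
    """
    try:
        resolved_ip = resolver(name.strip())
    except OSError:  # socket.gaierror is a subclass of OSError
        return False, None
    return resolved_ip == configured_ip, resolved_ip


def stub_resolver(name):
    """Stand-in for DNS, holding only the result seen from SLAC in these minutes."""
    observed = {"te-0-3-0-0.drc96.jaring.my": "61.6.51.2"}
    if name not in observed:
        raise socket.gaierror("name not known to the stub")
    return observed[name]


matches, resolved = check_host_entry(
    "te-0-3-0-0.drc96.jaring.my", "203.80.23.73", resolver=stub_resolver
)
print(matches, resolved)
```

Omitting the resolver argument makes the function do a real DNS lookup through socket.gethostbyname. A mismatch only indicates that the name and the IP need reconciling; it does not say which of the two is the router interface seen at hop 6.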

Linked Open Data

Renan finished the new pingerlod web site. The main change is that the info texts are now in a separate file, so they should be much easier to modify. The new version has been loaded on the server and some text added to describe how to use the map. However, there is a bug that prevents the map from executing. Renan reports that the bugs should be easy to fix. He has talked to his professor, who suggested trying RDF Owlink, which should respond faster to queries. Renan will research this. It will probably mean reloading the PingER data, so it is a lot of work, but hopefully it will improve performance. Before the rebuild he will make the fixes and provide a new WAR for us to load on pingerlod.slac.stanford.edu. He is also working on his thesis and on documentation: he has finished the ontology and has a nice interactive tool for visualizing it. Since the ontology is the core of the data model of our semantic solution, this will be very helpful for anyone who uses our system, whether a developer or a possible user. Bebo pointed out that to get publicity and for people to know about the data, we will need to add pingerlod to lod.org.

Things he will soon do regarding documentation:

  1. A task/process flow describing all Java classes involved in the batch jobs;
  2. A Javadoc <http://www.oracle.com/technetwork/java/javase/documentation/index-jsp-135444.html> which will explain all classes and how they are used.

For the Linked Open Data / RDF work, which is in pre-alpha days, you can go to http://pingerlod.slac.stanford.edu. As can be seen, this page is not ready for prime time. However, the demos work as long as one carefully selects what to look at:

  • Click on Visualizations, there are two choices:
    • Multiple Network Metrics: Click on the image: gives a form, choose from Node pinger.slac.stanford.edu pinging to www.ihep.ac.cn, time parameters yearly, 2006 2012, metrics throughput, Average RTT Packet loss and display format Plot graph, then click on submit. In a few seconds time series graph should come up. Mouse over to see details of values at each x value (year).
    • A mashup of network metrics x university metrics Click on image: gives another form, pinging from pinger.slac.stanford.edu, School metric number of students, time metric years 2006 2012, display format plot graph, click on submit. Longer wait, after about 35 seconds a google map should show up. Click on "Click for help." Area of dots = number of students, darkness of dots = throughput (lighter is better), inscribing circle color gives university type (public, private etc.) Click on circle for information on university etc.
  • Renan will be working on providing documentation on the programs, in particular the install guide for the repository and web site etc. This will assist the person who takes this over. 

Renan is using OWLIM as the RDF repository. He is using an evaluation version right now. Renan looked into the price for OWLIM (that excellent RDF Database Management System he told us about). It would cost 1200 EUR minimum (~1620 USD, according to Google's rate for today) for a one-time perpetual license. That seems too expensive. No wonder it is so good. Anyhow, he heard about a different, free alternative, but is not sure how good it would be for our PingER data. He will try it out and evaluate it. He will also get a new evaluation of the free OWLIM lite.

He has also made some modifications on the ontology of the project (under supervision of his professor in Rio) hence he  will have to modify the code to load the data accordingly.

Maria and Renan are advancing in some approaches to deal with PingER data, making it easier to be analyzed and integrated. In particular they have been busy studying and evaluating alternatives, analyzing results from the latest benchmarks on NoSQL (including RDF and graph based storages) database management, distributed processing and mediated  solutions over relational databases, and also other experiments with multidimensional analyses on Linked Data.  The new students involved are now understanding better the scenario and they have been interacting with Renan regularly. 

They have separated the tasks into 2: 

  1. Quantitative analysis on PingER data
    1. They want to know how PingER has grown, since 1998 until today and how it might be in the next years. By doing this, we may focus on more suitable technologies that deal with scenarios that have a similar profile with PingER.
      1. Two students are working on this.
  2. Approaches to handle PingER current data
    1. Conventional approach – Utilization of Cassandra as back-end database to provide easy crossing of parameters to get PingER data.
      1. One student is working on this.
    2. Distributed and parallel approach – Utilization of a data warehouse on top of a distributed file system to provide low latency response to complex queries (like the ones we were not able to do on my previous work). Additionally, how Scientific Workflow Management Systems may help in the ETL process of transforming PingER so it can easily be stored on the data warehouse.
      1. Renan is working on this.
    3. Pure RDF approach – Good ways of modeling and natively storing RDF data.
      1. Maria-Luiza is working on this.
    4. NoSQL approaches – How other NoSQL DBMS may be adequate for PingER multidimensional data.
      1. Two students are evaluating existing NoSQL solutions for multidimensional scenarios (such as PingER)
    5. Key-Value storages for PingER data in RDF
      1. This is Ibrahim’s work.

In the end, they want to compare all these approaches.

NUST

At the Connect Asia Pacific Summit in Bangkok in January, Shahryar saw the project "Mapping the pan Asia Pacific information Superhighway and closing gaps in infrastructure connectivity" and found it very much related to the work in the PingER project. So Shahryar sent email to a UN agency about a possible collaboration with them on the PingER project. He has heard nothing, so he will write a detailed proposal and then contact them again. No update 2/5/2014, 3/5/2014.

Tulip

Follow up from workshop

  • Hossein Javedani of UTM is interested in anomalous event detection with PingER data. Information on this is available at https://confluence.slac.stanford.edu/display/IEPM/Event+Detection. We have sent him a couple of papers and how to access the PingER data. Hossein and Badrul have been put in contact. Is there an update Badrul?

The next step in funding is to go for bigger research funding, such as LRGS or eScience. Such proposals must lead to publications in high quality journals. They will need an infrastructure such as the one we are building. We can use the upcoming workshop (1 specific session) to brainstorm and come up with such a proposal. We need to do some groundwork before that as well. Johari will take the lead in putting together 1/2 page descriptions of the potential research projects.

  1. Need to identify a few key areas of research related to the PingER Malaysia Initiative; these can be shared/publicized through the website. These might include using the infrastructure and data for: anomaly detection; correlation of performance across multiple routes; and GeoLocation. The future projects Les listed in Confluence at https://confluence.slac.stanford.edu/display/IEPM/Future+Projects can also be a good start, as can Bebo's suggestion.
  2. Need to synchronize and share research proposals so as not to duplicate research work. How to share? Maybe not through the website, or maybe we can create a members-only section of the website to share sensitive data such as research proposals?

Anjum suggested Saqib,  Badrul and Johari put together a paper on user experiences with using the Internet in Malaysia as seen from Malaysian universities. In particular round trip time, losses, jitter, reliability, routing/peering, in particular anomalies, and the impact on VoIP, throughput etc.  It would be good to engage someone from MYREN.

Ibrahim

Ibrahim Abaker is planning to work on a topic initially entitled "leveraging pingER big data with a modified pingtable for event-correlation and clustering". Ibrahim has a proposal, see https://confluence.slac.stanford.edu/download/attachments/17162/leveraging+pingER+big+data+with+a+modified+pingtable+for+event-correlation+and+clustering.docx. Ibrahim reports 7/15/2014 "I have spent the last few months trying to understand the concept of big data storage and its retrieval as well as the traditional approach of storing RDF data. I have integrated a single Hadoop cluster in our cloud, but for this project we need multiple clusters, which I have already discussed with Dr. Badrul and he will provide me with big storage for the experiment." No update 8/20/2014.

"I have come up with an initial proposed solution model. This model consists of several parts. The upper part of the Figure below shows the data source, in which PingER data will be converted into RDF format. Then the data pre-processor will take care of converting RDF/XML into the N-Triples serialization format using an N-Triples converter module. The N-Triples file of an RDF graph will be the input, and the triples will be stored as key-value pairs using MapReduce jobs."
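The last step of this model, turning N-Triples lines into key-value pairs, can be illustrated in plain Python without Hadoop. This is only a sketch of the general idea and not Ibrahim's implementation: the choice of the subject as the key, the function names and the example.org URIs are all assumptions, and the regular expression handles only the simplest triple form.

```python
import re
from collections import defaultdict

# Matches only the simplest N-Triples form: <subject> <predicate> object .
# The subject and predicate must be IRIs; the object is kept as raw text.
# Blank-node subjects and other cases are not handled in this sketch.
TRIPLE_RE = re.compile(r'^\s*<([^>]*)>\s+<([^>]*)>\s+(.+?)\s*\.\s*$')


def map_triple(line):
    """Map step: one N-Triples line -> (subject, (predicate, object))."""
    m = TRIPLE_RE.match(line)
    if m is None:
        return None  # comment, blank line, or unsupported form
    subject, predicate, obj = m.groups()
    return subject, (predicate, obj)


def reduce_by_subject(pairs):
    """Reduce step: collect all (predicate, object) values under each subject."""
    store = defaultdict(list)
    for key, value in pairs:
        store[key].append(value)
    return dict(store)


# Hypothetical triples using example.org URIs, not the PingER ontology.
lines = [
    '<http://example.org/node/a> <http://example.org/rtt> "192.9" .',
    '<http://example.org/node/a> <http://example.org/pings> <http://example.org/node/b> .',
    '# a comment line',
]
pairs = [p for p in (map_triple(line) for line in lines) if p is not None]
store = reduce_by_subject(pairs)
print(store)
```

In a real MapReduce job the map and reduce functions would run on separate workers and the grouping by key would be done by the framework; here a dictionary stands in for both the shuffle and the key-value store.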

Potential projects

See list of Projects

Coordinates of team members:

See: http://pinger.unimas.my/pinger/contact.php
