Time & date
Wednesday Apr 8th 2015 9:00pm Pacific Standard Time (note change of time for California due to daylight savings), Thursday Apr 9th 2015 9:00am Pakistan time, Thursday Apr 9th 2015 noon Malaysian time, Thursday Apr 9th, 2015 02:00am Rio Standard Time.
Attendees
Invitees:
Hassaan Khaliq, Kashif, Raja, Samad Riaz- (SEECS); Johari-, Nara, Adnan Khan- (UNIMAS); Abdullah, Badrul, Anjum+, Ridzuan, Ibrahim+ (UM); Hanan, Saqib- (UTM); Adib+, Fatima+ (UUM); Fizi Jalil+ (MYREN); Les+, Bebo+ (SLAC)
+ Confirmed attendance
- Responded but Unable to attend:
Actual attendees:
Anjum, Adib, Fatima, Hafizi, Bebo, Les. Tried calling Ibrahim but he did not appear to be online.
Administration
- Membership of pinger-my in https://groups.google.com
How are the measurements, analysis and paper on GeoLocation coming along?
He has been looking at the alpha (directivity) behavior; there was an exponential behavior, but it was unclear how to take advantage of it. Anjum cut N. America into regions to help improve the accuracy of the alpha prediction. He believes this will improve things.
We are not sure Fatima is on the pinger-my email list. Adib will look at this. According to Membership of pinger-my Fatima has access as of Feb 27, 2015. Les sent out a test 4/9/2015.
Johari has got the OK from the conference organizing committee to hold a co-located PingER/BigData workshop on August 3rd, the day before CITA 2015 (see http://www.cita.my/), an International Conference, 4th - 6th August 2015, on transforming Big Data into Knowledge. Johari will provide relevant information to Bebo. Bebo will be able to make a presentation. Les has sent Bebo some relevant slide decks. Johari is awaiting an abstract from Bebo. Also Ridzuan or Ibrahim or Renan have interest in submitting a full paper by April 2nd 2015. Any update? Anjum reports that Ibrahim has submitted at least one paper. Apparently there are 3 papers from UM based on PingER. Anjum will follow up with Dr. Abdullah to see if more information is available.
UFRJ
Maria Luiza reported for the last meeting:
' I had a meeting with the students last week, trying to get some feedback just after they had returned from the Carnival holidays. We have still not solved the infrastructure problem (it does not depend on us, as we have the money but not the autonomy to hire someone to work on the network and electrical installations), but we are now trying to move to a different location.
I asked them for a specific report on some of the issues you mentioned and we should get back with this to you till Tuesday.
I thought we could make an offer to students that are already in the US with a Science without Border grant, for a period at SLAC working on one of the current solutions, just like Renan did. I asked Renan if there is a list where we could try to make such announcement, but I am also trying to contact the person in charge for UFRJ Computing students abroad. Of course, we cannot guarantee it will be another dedicated "Renan", but we would try to select a student with some background on the subject.'
Cristiane has studied the PingER data and how to cast it into Linked Open Data form. The size of the PingER hourly data for 1998-Sep 2014 archived via FTP in text form amounts to ~ 5.12GB and this corresponds to 15.66*10^9 (billion) triples. Then using 5 triples for each measurement and using Turtle without compression gives us 685 Gbytes or an inflation factor of ~ 200.
I do not know if it is reasonable to compress the data and then only uncompress it before use, or whether this breaks the search or makes it incredibly slow. If it is reasonable to compress then it might be interesting to know how well the data compresses (actually it's probably useful to know anyway for infrequent activities such as archiving).
Ibrahim says "It is reasonable to compress for transmission purpose, however, it should uncompressed in the local or cloud storage, so that load and indexing for processing can be easy".
Bebo is not concerned about the bloat; it is to be expected.
Cristiane's report is at: Size Inflation of PingER Data for use in PingER LOD
UUM
pinger.uum.edu.my is down Jan 31st, Feb 1st. Adib reports:
"Unfortunately, UUM pinger is still down. I have contacted UUM computer center staff several times and they have promised to fix the issue from their side, yet nothing!
We have semester break next week, I will go and meet them in person. Hopefully, can help to fix this issue."
Adib has been discussing potential PingER projects with Anjum. Adib has a master's student. In particular they are interested in providing more flexible access to PingER data than the limited time windows pingtable.pl provides. This may be done by providing database access rather than using flat files. Les has provided:
- Several emails exchanged between Les and the student at UUM (Fatima Binta Adamu) to clarify access and the format of the PingER data. I think it is clear now. Documentation has been updated.
- Any update?
Fatima Binta Adamu of UUM has been added to the list. Fatima has submitted her proposal. Her defence is scheduled for 20 April. She is proposing an indexing strategy which will enable data to be retrieved based on monitoring sites and date. She was initially looking at Hadoop/MapReduce but is moving to sparq.
She would appreciate any suggestions from the team. It may also be useful for Fatima to share her presentation with others and with the UFRJ people (mluiza.campos@gmail.com, Cristiane Ceia (cristianeceia@gmail.com), Renan F. Souza (renanfs@cos.ufrj.br)). Cristiane's report is at Size Inflation of PingER Data for use in PingER LOD. After the meeting Les sent Fatima a link to Cristiane's report and also introduced Cristiane and Fatima to one another.
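As a rough illustration of retrieval by monitoring site and date, the sketch below builds an in-memory index keyed on (monitor, day). It is not Fatima's design: the record layout is hypothetical, the RTT values are made up, and a real implementation would sit on top of whatever storage she selects.

```python
from collections import defaultdict
from datetime import date

# Hypothetical record layout: (monitor, remote, day, avg_rtt_ms).
# The values are invented for illustration; real PingER records carry more fields.
records = [
    ("pinger.slac.stanford.edu", "www.ihep.ac.cn", date(2015, 4, 1), 180.2),
    ("pinger.slac.stanford.edu", "pinger.uum.edu.my", date(2015, 4, 1), 210.7),
    ("pinger.unimas.my", "pinger.uum.edu.my", date(2015, 4, 1), 35.1),
    ("pinger.slac.stanford.edu", "www.ihep.ac.cn", date(2015, 4, 2), 179.8),
]

def build_index(rows):
    """Map (monitor, day) to the positions of the matching rows."""
    index = defaultdict(list)
    for pos, (monitor, _remote, day, _rtt) in enumerate(rows):
        index[(monitor, day)].append(pos)
    return index

def lookup(index, rows, monitor, day):
    """Return all rows for one monitoring site on one day without scanning every row."""
    return [rows[pos] for pos in index.get((monitor, day), [])]

index = build_index(records)
hits = lookup(index, records, "pinger.slac.stanford.edu", date(2015, 4, 1))
```

The same key could be extended (for example with the remote host) if queries need to be narrower than one site and one day.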
Adib is involved in NETAPPS 2015. It is a forum for scientists, researchers, students, and practitioners from all over the world to present their latest research results, ideas, and developments in the area of Future Internet and discuss advancement of next generation networks. More details about NETAPPS2015 can be found at the following link: http://internetworks.my/netapps2015/v2/index.php. Included in the advisory committee are: Bebo, Anjum, Les. The conference is in December in KL. There was a discussion of how to engage PingER. There is a track on Internet protocols and service. This was agreed to be a good place to present some PingER papers on Internet measurements and a Malaysian Case Study. Saqib started a case study. It is at: https://drive.google.com/folderview?id=0B-NEKleLll79ZFNmUnhiVGJ0Nmc&usp=sharing_eid. It is incomplete and needs updating. Anjum will find a new PhD student to look at this.
UM
Ibrahim is conducting experiments on using Big Data techniques with PingER data and writing a conference paper. He planned to complete it by the middle of March. Did it get completed? He will send a copy for review before submitting. He now has a cloud of 4 computers at UM running Hadoop and 15 at MYREN. The installs are done, he has downloaded the PingER data from the SLAC ftp site and is about to start on the mining. They are trying to get around Hadoop MapReduce. He has 3 or 4 systems and is trying to convert from the PingER text format to PDI format.
UNIMAS
Johari had no updates 2/4/2015, 3/4/2015, 4/8/2015 except that he will be presenting a poster on the PingER project next week at the UNIMAS R&D Expo.
His top priorities are:
- Reviving the Raspberry Pi. Johari went to the computer center but could not figure out what was wrong with the Raspberry Pi. It may be a firewall issue or he may need to replace the SD card.
- Getting the research student going on anomalous behaviour detection methods. He is working on a Conference paper.
Johari still has to uncover the problem with the traceroute from UNIMAS. UDP has been unblocked. The MYREN host works fine and shares most of the hops. Thus the problem must be in the first few hops.
From previous meetings
The two major issues with the Raspberry Pi would be:
- are the results statistically the same as for the other monitor at UNIMAS (e.g. use the Kolmogorov-Smirnov test)? There is an Advanced Project (Master by coursework) student working on the statistics of the data from the Raspberry Pi and the production PingER monitor at UNIMAS to see how much they differ.
- is it reliable/robust, and is it clear what to do to debug problems remotely (e.g. if it is at Bario)? Looking at the monitoring data, I have been unable to collect any from it since Oct 20th (it is pingable, and port 80 responds, however the remote traceroute and ping_data.pl are not working), which does not sound promising. We will need to evaluate the robustness of the unit by simulating various events such as power failure, hard and cold reboot, etc. Johari will need access to the computer center to verify it comes up correctly after reboot etc.
- Johari will go to the computer center the coming weekend and look at improving the auto re-start.
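For the statistical comparison mentioned in the first item, a two-sample Kolmogorov-Smirnov check can be sketched with the standard library alone. The RTT samples below are made up for illustration, the function names are ours, and the critical value uses the usual large-sample approximation, which is only rough for small samples.

```python
import math
from bisect import bisect_right

def ks_statistic(a, b):
    """Largest gap between the empirical CDFs of samples a and b."""
    xs, ys = sorted(a), sorted(b)
    d = 0.0
    for v in xs + ys:
        cdf_a = bisect_right(xs, v) / len(xs)  # fraction of a that is <= v
        cdf_b = bisect_right(ys, v) / len(ys)  # fraction of b that is <= v
        d = max(d, abs(cdf_a - cdf_b))
    return d

def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value: reject 'same distribution' if D exceeds it."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))

# Made-up RTT samples in ms (not measured data).
pinger_rtt = [48.0, 51.5, 49.2, 60.1, 55.3, 47.8, 52.0, 58.4]
raspi_rtt = [50.1, 54.0, 52.3, 63.0, 57.9, 49.5, 55.2, 61.0]

d = ks_statistic(pinger_rtt, raspi_rtt)
differs = d > ks_critical(len(pinger_rtt), len(raspi_rtt))
```

If SciPy is available, `scipy.stats.ks_2samp` performs the same test and also returns a p-value.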
If/when it works it would be instructive to look at the data from pinger and the Raspberry Pi to Malaysia since the distances are shorter and the differences may show up better. For Sep-Oct 2014 when there was data measured from both Oct-Nov the averages for 20 paths were 52+-21ms (from pinger.unimas.my to 20 other Malaysian hosts) and 56+-21ms for the Raspberry Pi to 20 other Malaysian hosts.
UTM
Saqib was unable to attend 4/8/2015. He will send an update by email.
Saqib has updated the case study and it is available in Google drive as a "Shared-PingER" document for review at https://drive.google.com/folderview?id=0B-NEKleLll79ZFNmUnhiVGJ0Nmc&usp=sharing_eid
The traceroute problem regarding maximum reachable hops (i.e. 11 hops) may arise because the Unix/Linux/OSX traceroute uses UDP to send the requests. The first request is sent to a particular port (33434), with a ttl to tell it how many hops to go. The ttl starts at 1 and is incremented as it tries the next hop; the port is also incremented (up to 33465). It looks like the first few UDP ports are enabled and then they are blocked. The Windows traceroute uses ICMP to send the probes so does not see the problem.
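The port arithmetic behind this hypothesis can be sketched as follows. It assumes three probes per hop and one destination port per probe, which is a common traceroute default but not confirmed for the hosts involved. Treating 33465 as the highest port that gets through is also an assumption for illustration.

```python
BASE_PORT = 33434      # first destination port, as described in the text
PROBES_PER_HOP = 3     # assumption: common traceroute default

def probe_ports(hop, base=BASE_PORT, probes=PROBES_PER_HOP):
    """Destination UDP ports for the probes sent with TTL == hop (1-based)."""
    first = base + (hop - 1) * probes
    return list(range(first, first + probes))

def open_probes(hop, highest_open_port, base=BASE_PORT, probes=PROBES_PER_HOP):
    """How many probes of this hop use a port at or below highest_open_port."""
    return sum(1 for port in probe_ports(hop, base, probes) if port <= highest_open_port)

def last_fully_reachable_hop(highest_open_port, base=BASE_PORT, probes=PROBES_PER_HOP):
    """Largest TTL for which every probe port is at or below highest_open_port."""
    return max(0, (highest_open_port - base + 1) // probes)

# Under these assumptions, list the hops that would still get at least one probe through.
visible_hops = [hop for hop in range(1, 31) if open_probes(hop, 33465) > 0]
```

Comparing `visible_hops` with the hop count seen in practice is one way to check whether a port-range filter explains the cutoff.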
MYREN
Fizi reports that the MYREN PingER host at UTM was reported down on 22 March; it came back up again the next day.
He has been reading up about the PingER project.
- A goal would be to compare and contrast the benefits of perfSONAR and PingER.
NUST
The following hosts are sometimes unable to be found via DNS when using ping:
Host | Time PDT |
---|---|
pinger.uob.edu.pk | Apr 8, 11:01am, Apr 11, 11:00am |
pinger.usindh.edu.pk | Apr 6, 1:04am, Apr 5, 1:05am, Apr 4, 1:02am |
pinger.isra.edu.pk | Apr 6, 1:04am, Apr 4, 1:02am |
pinger.kohat.edu.pk | Apr 4, 1:02am |
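Intermittent lookup failures like these can be logged with a small periodic check. The sketch below is not part of the PingER tooling. The function name and the injectable `resolve` parameter are our own, the latter added so the logic can be exercised without network access.

```python
import socket

def check_dns(hosts, resolve=socket.gethostbyname):
    """Return {host: IPv4 address, or None if the name could not be resolved}."""
    results = {}
    for host in hosts:
        try:
            results[host] = resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            results[host] = None
    return results

def unresolved(results):
    """Hosts whose lookup failed, in sorted order."""
    return sorted(host for host, addr in results.items() if addr is None)

# Hosts from the table above.
HOSTS = [
    "pinger.uob.edu.pk",
    "pinger.usindh.edu.pk",
    "pinger.isra.edu.pk",
    "pinger.kohat.edu.pk",
]
```

Calling `unresolved(check_dns(HOSTS))` from a scheduled job and recording the time of each failure would give a table like the one above.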
So far this month we have been unable to gather any data from the following Pakistani hosts:
- airuniversity.seecs.edu.pk
- cae.seecs.edu.pk
- ns3.pieas.edu.pk
- nuisb.seecs.edu.pk
- pinger-ncp.ncp.edu.pk
- pinger.cemv.edu.pk
- pinger.lcwu.edu.pk
- pinger.nca.edu.pk
- pinger.numl.edu.pk
- pinger.pern.edu.pk
- pingerisl-air.pern.edu.pk
- pingerisl-fjwu.pern.edu.pk
- pingerlhr.pern.edu.pk
- pingerqta.pern.edu.pk
- sau.seecs.edu.pk
- www.upesh.edu.pk
At the last meeting Samad was working on:
- www.upesh.edu.pk
- pingerlhr.pern.edu.pk
- pinger.pern.edu.pk
- ns3.pieas.edu.pk
- pinger.cemb.edu.pk
- pinger.nca.edu.pk
Anjum proposed to Samad that he contact Dr Adnan Iqbal at the Namal node in Balochistan to get PingER installed there. After the previous meeting an email was sent to Adnan, who is happy to do this but will need to get upper management approval. Dr. Adnan Iqbal contacted Samad about the PingER installation on 4/8/2015. According to him, the person concerned will provide Samad with access to the PingER server on 4/19/2015. Samad will install it on 4/19/2015 if he gets access to that node and will update us on the status. Les sent email to Adnan, who said Samad and the Namal staff are in contact. Samad responded that he is hoping the person concerned will contact him.
For the latest see: http://www-iepm.slac.stanford.edu/monitoring/checkdata/Jan.htm
Pinger at SLAC
Working on the following hosts to be able to gather data:
Host | State | last seen | traceroute.pl |
---|---|---|---|
hunnas.learn.ac.lk | emailed 2/26/2015 | Nov 13, 2014 | Works |
web.hepgrid.uerj.edu | emails 12/2/2014, 12/8/2014, 2/26/2015 | Oct 23, 2014 | Works |
www.umss.edu.bo | emails 8/30/2014, 9/12/2014, 11/27/2014, 2/27/2015 | Jul 6, 2014 | No response |
pinger.sesame.jo | email 3/14/2015. Fixed 3/16/2015. | Mar 4, 2015 | Works |
pinger.stanford.edu | email 3/14/2015 | Feb 18, 2015 | Works |
pinger.fnal.gov | email 3/21/2015 | Mar 18, 2015 | Talked with Phil Demar of FNAL 5/3/2015. |
pingersonar-utm.myren.net.my | email 3/21/2015. Fixed 3/23/2015. | Mar 9, 2015 | No response |
The virtual machine pinger2.pl measurement agent at SLAC now has a fixed IP address. As a consequence it can only monitor SLAC hosts. It is running successfully and gathering data. It appears to respond in the same way as the floating IP VM, i.e. the RTT from bare metal to VM fixed address = 0.045+-0.02ms > VM fixed address to bare metal. This is within the error bars of the floating address result. See PingER VM Comparative analysis of significant statistical difference with non VM. This has been reported to the VM developer (Nebula). The Nebula startup has now folded.
Bebo arranged a meeting with the Colombia RENATA NREN folks and the minister of IT to discuss the use of PingER in Colombia. There is a web page at: Colombia. Les has sent an email asking them to install pinger2.pl on at least one site in Colombia. Sent a reminder email 2/27/2015. Bebo will send a gentle reminder to the RENATA people in Colombia to see whether they continue to be interested and need a meeting.
Next meeting
Next meeting: Wednesday May 6th 2015 9:00pm Pacific Standard Time, Thursday May 7th 2015 9:00am Pakistan time, Thursday May 7th 2015 noon Malaysian time, Thursday May 7th, 2015 02:00am Rio Standard Time.
Old Items
Linked Open Data
Feb 2015
The plan is still the one seen before (see project proposal): experimenting with those alternatives. So far, they have managed to triplify the data according to a new ontology that takes advantage of a combination of a current standard for multidimensional data (called data cube vocabulary) and a revised version of Renan's Moment ontology adaptation. With this we expect to have a better data organization than the previous solution.
They are now preparing a test plan (like a small benchmark) to be used on all alternatives so that we can compare the results accordingly.
Aug 2014
Renan finished the new pingerlod web site. The main change is that it should now be much easier to modify the info texts, since Renan has put the texts into a separate file. The new version has been loaded on the server and some text added to describe how to use the map. However there is a bug that prevents it from executing the map. Renan reports that the bugs should be easy to fix. He has talked to his professor, who suggested trying RDF Owlink, which should respond faster to queries. Renan will research this. It will probably mean reloading the PingER data, which is a lot of work, but hopefully it will improve performance. Before the rebuild he will make the fixes and provide a new WAR for us to load on pingerlod.slac.stanford.edu. He is also working on his thesis and on documentation. He has finished the ontology and has a nice interactive tool for visualizing it; since the ontology is the core of the data model of our semantic solution, this will be very helpful for anyone who uses our system, whether a developer or a possible user. Bebo pointed out that to get publicity and for people to know about the data, we will need to add pingerlod to lod.org.
Things he will soon do regarding documentation:
- A task/process flow describing all the Java classes involved in those batch jobs;
- A Javadoc <http://www.oracle.com/technetwork/java/javase/documentation/index-jsp-135444.html> which will explain all classes and how they are used.
For the Linked Open Data / RDF which is in pre-alpha days, you can go to http://pingerlod.slac.stanford.edu. As can be seen this page is not ready for prime time. However the demos work as long as one carefully selects what to look at:
- Click on Visualizations, there are two choices:
  - Multiple Network Metrics: Click on the image, which gives a form. Choose Node pinger.slac.stanford.edu pinging to www.ihep.ac.cn, time parameters yearly, 2006 2012, metrics throughput, Average RTT, Packet loss, and display format Plot graph, then click on submit. In a few seconds a time series graph should come up. Mouse over to see details of values at each x value (year).
  - A mashup of network metrics x university metrics: Click on the image, which gives another form. Choose pinging from pinger.slac.stanford.edu, School metric number of students, time metric years 2006 2012, display format plot graph, then click on submit. There is a longer wait; after about 35 seconds a Google map should show up. Click on "Click for help." Area of dots = number of students, darkness of dots = throughput (lighter is better), inscribing circle color gives university type (public, private etc.). Click on a circle for information on the university etc.
- Renan will be working on providing documentation on the programs, in particular the install guide for the repository and web site etc. This will assist the person who takes this over.
Renan is using OWLIM as the RDF Repository. He is using an evaluation version right now. Renan looked into the price for OWLIM (that excellent RDF Database Management System he told us about). It would cost 1200EUR minimum (~ 1620 USD, according to Google's rate for today) for a one-time perpetual license. It seems too expensive. No wonder it is so good. Anyhow, he heard about a different free alternative. He is just not sure how good it would be for our PingER data. He will try it out and evaluate. He will also get a new evaluation of the free OWLIM lite.
He has also made some modifications on the ontology of the project (under supervision of his professor in Rio) hence he will have to modify the code to load the data accordingly.
Maria and Renan are advancing in some approaches to deal with PingER data, making it easier to be analyzed and integrated. In particular they have been busy studying and evaluating alternatives, analyzing results from the latest benchmarks on NoSQL (including RDF and graph based storages) database management, distributed processing and mediated solutions over relational databases, and also other experiments with multidimensional analyses on Linked Data. The new students involved are now understanding better the scenario and they have been interacting with Renan regularly.
UM
Moved here 3/4/2015:
Ibrahim has set up distributed Hadoop clusters. He has 2TB of disk space. Les has provided information on getting a subset of PingER data by anonymous ftp via ftp://ftp.slac.stanford.edu/users/cottrell. It was put there last September. Information on how the data was put together is at https://confluence.slac.stanford.edu/display/IEPM/Archiving+PingER+data+by+tar+for+retrieval+by+anonymous+ftp. There is information on formatting etc at http://www-iepm.slac.stanford.edu/pinger/tools/retrievedata.html and some on the dataflows at https://confluence.slac.stanford.edu/display/IEPM/PingER+data+flow+at+SLAC. Renan at UFRJ has successfully used this data, he has also characterized the data in terms of bytes/metric per year etc.
Ibrahim has started downloading all zip files to the local machines. 6 weeks ago he downloaded 2 GB of weather data to test his cluster of nodes. He wrote a simple Java program (Map, Reduce) to find the average and it was working fine.
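For readers unfamiliar with the pattern, computing an average with a map step and a reduce step can be sketched as below. This is not Ibrahim's Java program. The "key,value" input format and the sample lines are hypothetical.

```python
from collections import defaultdict

def map_phase(lines):
    """Emit (key, (value, 1)) for each 'key,value' input line."""
    for line in lines:
        key, value = line.split(",")
        yield key.strip(), (float(value), 1)

def reduce_phase(pairs):
    """Sum the values and counts per key, then divide to get the average."""
    totals = defaultdict(lambda: (0.0, 0))
    for key, (value, count) in pairs:
        running_sum, running_count = totals[key]
        totals[key] = (running_sum + value, running_count + count)
    return {key: s / c for key, (s, c) in totals.items()}

# Hypothetical input: station identifier and a temperature reading.
sample_lines = ["stationA,20.0", "stationA,24.0", "stationB,10.0"]
averages = reduce_phase(map_phase(sample_lines))
```

Emitting (sum, count) pairs rather than averages keeps the reduce step associative, so partial results from different nodes can be combined safely.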
NUST
The following is from Samad 2/24/2015.
- buitms.seecs.edu.pk #We have to disable gathering data from this host because the person still don't want to continue with us as i have tried once again to convince him but the answer is same. Les has disabled from SLAC.
- nukhimain.seecs.edu.pk # We were unable to gather data since 20th November, 2014 and now the Node is working fine and collecting data as well.
- pinger.uettaxila.edu.pk #The node is working fine from last two weeks.
- sau.seecs.edu.pk. #This Node is working fine now.
- pingerjms.pern.edu.pk #This node is working now.
- pinger.uet.edu.pk # this was also not working from so many days. and now its working fine and collecting data as well.
- pinger.isra.edu.pk # This node is also working fine now.
- pingerlhr-pu.pern.edu.pk # This is also working fine now.
- pinger.kohat.edu.pk # Collecting data now.
The IP of "pingerqta.pern.edu.pk" has been changed. Les has updated the database at SLAC with the following:
Old IP: 121.52.157.157
New IP: 121.52.157.148
Tulip
Follow up from workshop
- Hossein Javedani of UTM is interested in anomalous event detection with PingER data. Information on this is available at https://confluence.slac.stanford.edu/display/IEPM/Event+Detection. We have sent him a couple of papers and how to access the PingER data. Hossein and Badrul have been put in contact. Is there an update Badrul?
The next step in funding is to go for bigger research funding, such as LRGS or eScience. Such proposals must lead to publications in high quality journals. They will need an infrastructure such as the one we are building. We can use the upcoming workshop (1 specific session) to brainstorm and come up with such a proposal. We need to do some groundwork before that as well. Johari will take the lead in putting together 1/2 page descriptions of the potential research projects.
- Need to identify a few key areas of research related to the PingER Malaysia Initiative; these can be shared/publicized through the website. These might include using the infrastructure and data for: anomaly detection; correlation of performance across multiple routes; and for GeoLocation. The future projects Les listed in Confluence here: https://confluence.slac.stanford.edu/display/IEPM/Future+Projects can also be a good start, as can Bebo's suggestion.
- Need to synchronize and share research proposals so as not to duplicate research work. How should they be shared? Maybe not through the website, or maybe we can create a member-only section of the website to share sensitive data such as research proposals?
Anjum suggested Saqib, Badrul and Johari put together a paper on user experiences with using the Internet in Malaysia as seen from Malaysian universities. In particular round trip time, losses, jitter, reliability, routing/peering, in particular anomalies, and the impact on VoIP, throughput etc. It would be good to engage someone from MYREN.
Ibrahim
Ibrahim Abaker is planning to work on a topic initially entitled "leveraging pingER big data with a modified pingtable for event-correlation and clustering". Ibrahim has a proposal, see https://confluence.slac.stanford.edu/download/attachments/17162/leveraging+pingER+big+data+with+a+modified+pingtable+for+event-correlation+and+clustering.docx. Ibrahim reports 7/15/2014 "I have spent the last few months trying to understand the concept of big data storage and its retrieval as well as the traditional approach of storing RDF data. I have integrated a single hadoop cluster in our cloud. but for this project we need multiple clusters, which I have already discussed with Dr. Badrul and he will provide me with big storage for the experiment." No Update 8/20/2014.
"I have come up with initial proposed solution model. This model consists of several parts. The upper parts of the Figure below shows the data source, in which PingER data will be convert into RDF format. Then the data pre-processor will take care of converting RDF/XML into N-triples serialization formats using N-triples convertor module. This N-triple file of an RDF graph will be as an input and stores the triples in storage as a key value pair using MapReduce jobs"
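The last step of the quoted model, storing N-Triples as key-value pairs, might look roughly like the sketch below. Keying on the subject is our assumption, since the proposal text does not say which key is used. The example.org URIs are placeholders, and the parser handles only simple one-line triples rather than the full N-Triples grammar.

```python
from collections import defaultdict

def parse_ntriple(line):
    """Split one simple N-Triples line into (subject, predicate, object)."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # skip blank lines and comments
    subject, predicate, rest = line.split(None, 2)
    obj = rest.rstrip()
    if obj.endswith("."):
        obj = obj[:-1].rstrip()  # drop the terminating ' .'
    return subject, predicate, obj

def map_triples(lines):
    """Map step: emit (subject, (predicate, object)) pairs."""
    for line in lines:
        triple = parse_ntriple(line)
        if triple is not None:
            subject, predicate, obj = triple
            yield subject, (predicate, obj)

def reduce_by_subject(pairs):
    """Reduce step: collect all (predicate, object) values under each subject key."""
    store = defaultdict(list)
    for key, value in pairs:
        store[key].append(value)
    return dict(store)

# Placeholder triples; the URIs do not come from the PingER ontology.
sample = [
    '<http://example.org/m/1> <http://example.org/hasValue> "180.2" .',
    '<http://example.org/m/1> <http://example.org/hasMetric> <http://example.org/metric/rtt> .',
    '<http://example.org/m/2> <http://example.org/hasValue> "35.1" .',
]
store = reduce_by_subject(map_triples(sample))
```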