SLAC SEECS Status report June 12th 2012

Attendees

Anjum, Umar, Asad, Sadia, Zafar, and Les. Ghulam and Amber were not present.

General

Asad will be joining us from now on. He is finishing his course work in a couple of months. Les has sent information to Asad, who has filled out the SLAC Users form and now has a SLAC ID (SID). Next step is to fill out the forms for a computer account, scan them and send to Les. Then we will need to get the privileges (Confluence, AFS, NFS etc.) set up. Les has also sent Asad a list of documents to look over.

Maggie was not working because it had run out of disk space. Anjum gave the go-ahead to clean /tmp/. Joun was away; he will see if he can get it going again (5/29/2012). No update.

Anjum has 2 permanent places. He may use one of them to take over support for the new database PingER archive. More in a couple of weeks. Sadia stressed the need for someone to work with her, to whom she can pass on the information. This seems to conflict with "Sadia proposed to hire another student to complete the work of Bilal and Ghulam. Anjum says, after June he can choose an undergraduate student who can take this as his FYP project." This is probably a note-taker error; Anjum, please resolve.

Zafar will be in Pakistan until August. At that time he will be going to Sweden for a year for his Rasmussen scholarship MS.

It appears everyone has UPS so we do not have to worry too much about power outages in our meetings. We agreed to set up a regular schedule for meetings. I believe the regular time is Wednesdays at 8pm Pacific time.

PingER

In general the history looks good. The changes from satellite to terrestrial can be seen. The rerouting of the Japanese paths to the W. Coast US can also be seen (2000-2001). There is a slight improvement overall with time. For TULIP it would be good to measure the optimum (median) alpha for intra-regional paths such as Europe, N. America and Pakistan, so we could use it instead of the single alpha value we use today. For example, a quick look at Europe to Europe in pingtable directivity gives alpha = 0.344 (0.235, 0.439), N. America 0.491 (0.407, 0.558), and Pakistan 0.163 (0.075, 0.338). The numbers in parentheses are the 25th and 75th percentiles. Some filtering of the data may be needed to remove anomalous values, such as hosts with wrong lat/longs and hosts measuring themselves (or measuring over very small distances, where the RTT is mainly network device delays).
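As a rough illustration of the measurement and filtering suggested above, the following Python sketch computes the median and quartiles of alpha for a set of host pairs. It uses the relation distance[km] = alpha * min_RTT[ms] * 100[km/ms] quoted in the TULIP section. The record field names, the 50 km minimum distance, and the rule that alpha above 1.0 indicates bad coordinates are all assumptions made for the example, not values agreed at the meeting.

```python
import math
import statistics

KM_PER_MS = 100.0  # from distance[km] = alpha * min_RTT[ms] * 100[km/ms]


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km on a spherical Earth (radius 6371 km)."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def valid_coords(lat, lon):
    """Reject coordinates that cannot be real latitudes/longitudes."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def alpha_summary(pairs, min_km=50.0, max_alpha=1.0):
    """Return count, 25th percentile, median and 75th percentile of alpha.

    `pairs` is a list of dicts with hypothetical keys: src, dst, src_lat,
    src_lon, dst_lat, dst_lon, min_rtt_ms. The thresholds are assumptions.
    """
    alphas = []
    for p in pairs:
        if p["src"] == p["dst"]:
            continue  # host measuring itself
        if not (valid_coords(p["src_lat"], p["src_lon"])
                and valid_coords(p["dst_lat"], p["dst_lon"])):
            continue  # impossible lat/long
        if p["min_rtt_ms"] <= 0:
            continue  # no usable RTT
        dist_km = great_circle_km(p["src_lat"], p["src_lon"],
                                  p["dst_lat"], p["dst_lon"])
        if dist_km < min_km:
            continue  # very short path: RTT is mostly device delay
        alpha = dist_km / (p["min_rtt_ms"] * KM_PER_MS)
        if alpha > max_alpha:
            continue  # assumed to indicate wrong coordinates
        alphas.append(alpha)
    if len(alphas) < 2:
        raise ValueError("need at least two usable host pairs")
    q25, _, q75 = statistics.quantiles(alphas, n=4)
    return {"n": len(alphas), "q25": q25,
            "median": statistics.median(alphas), "q75": q75}
```

Running `alpha_summary` once per region (e.g. Europe to Europe) would give one median alpha per region, with the quartiles showing how much spread remains after filtering.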

PingER Archive Site - Sadia, Zafar

Zafar had met with Ghulam before the meeting. They have modified the primary key of the metadata table to add the ping size to ensure uniqueness. We defined the goal as being able to use modified PingER graphical presentations to report on perfSONAR data. We agreed to focus on the schema for the aggregated shards, given the size of the database (monthly or weekly shards look like they are needed), possibly leaving the raw data in flat files for the meantime. We agreed it is important to keep a copy of the raw data, especially if we have to re-analyze the data for, say, a new metric, or to correct an analysis etc. Zafar shared a modified schema. The additions needed to the perfSONAR database to accommodate PingER are identified in italics. The perfSONAR-only metrics are identified in bold face. The RTTs and sequence numbers are each in a single column with the individual entries separated by commas.
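To illustrate how a row with comma-separated RTTs and sequence numbers might be unpacked for analysis, here is a minimal Python sketch. The function name, the argument names, and the idea that the number of pings sent is available alongside the row are assumptions for the example; the actual column names are those in Zafar's schema.

```python
def parse_ping_row(rtts_csv, seqs_csv, sent):
    """Unpack one measurement row into simple summary statistics.

    rtts_csv: comma-separated RTTs in ms, e.g. "10.5,11.0,12.5"
    seqs_csv: comma-separated sequence numbers of the replies, e.g. "0,1,3"
    sent:     number of pings sent (assumed to be stored with the row)
    """
    rtts = [float(x) for x in rtts_csv.split(",") if x.strip()]
    seqs = [int(x) for x in seqs_csv.split(",") if x.strip()]
    if len(rtts) != len(seqs):
        raise ValueError("RTT and sequence-number lists differ in length")
    if sent <= 0:
        raise ValueError("sent must be positive")
    received = len(rtts)
    return {
        "received": received,
        "loss_pct": 100.0 * (sent - received) / sent,
        "min_rtt": min(rtts) if rtts else None,
        "avg_rtt": sum(rtts) / received if rtts else None,
        "max_rtt": max(rtts) if rtts else None,
        "seqs": seqs,
    }
```

Keeping the individual entries in one column keeps the row count down, at the cost of having to split the strings in code like this whenever per-ping values are needed.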

The last work done by Ghulam was to save raw pings as in the old PingER schema, i.e. concatenating the pings for one hour. However, we had decided to store all 48 half-hourly sets of pings per day, so this is still left to be done. Sadia will prepare documentation and upload all information provided by Ghulam. Sadia has asked Ghulam to comment his code.
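One way to picture the 48 half-hourly sets per day is to index each measurement by its day and half-hour slot instead of concatenating an hour's worth together. The sketch below is only an illustration; the function names and the (timestamp, RTT list) input shape are hypothetical.

```python
from datetime import datetime


def half_hour_slot(ts):
    """Return the half-hour slot of the day, 0..47, for a datetime."""
    return ts.hour * 2 + ts.minute // 30


def group_by_slot(measurements):
    """Group (timestamp, rtts) tuples by (date, half-hour slot).

    Each slot keeps its own list of ping sets, so the two half-hourly
    measurements within one hour stay separate.
    """
    slots = {}
    for ts, rtts in measurements:
        key = (ts.date(), half_hour_slot(ts))
        slots.setdefault(key, []).append(rtts)
    return slots
```

For example, a measurement taken at 13:30 falls in slot 27 and one taken at 13:00 in slot 26, so they are stored as two separate entries for that day.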

HEC Report - Anjum, Amber and Imdad

Amber is working on this report. The basic structure is the same as that of the last six-monthly report. She will be sharing the report in a few days. Amber was unable to attend the meeting so there was no update.

Pakistani Hosts

There are some Pakistani nodes that are recorded as working from SEECS while they are not working from SLAC. This mismatch was recorded by Amber and Joun. Amber and Joun will get together and see if the problem still exists or not. Did anything happen?

  • Les looked at which Pakistani monitoring hosts SLAC is unable to gather data from since the start of May. They are as follows:
    • hu.seecs.edu.pk aka PK.HU.EDU.N2 does not appear as a monitoring host in SEECS pingtable.pl. Also SEECS pingtable.pl has no data this month from AIOU to hu.seecs
    • lse.seecs.edu.pk aka PK.LSE.EDU.N3 does not appear as a monitoring host in SEECS pingtable.pl. Also SEECS pingtable.pl has no data this month from AIOU to lse
    • pinger-itc.pu.edu.pk aka PK.PU.EDU.N2 does not appear in the SEECS pingtable.pl as a monitoring host
    • pinger.giki.edu.pk aka PK.GIKI.EDU.N1 does not appear as a monitoring host in SEECS pingtable.pl
    • pinger.uaar.edu.pk aka PK.UAAR.EDU.N1 does not appear as a monitoring host in SEECS pingtable.pl
    • pinger.ustb.edu.pk aka PK.USTB.EDU.N2 does not appear as a monitoring host in SEECS pingtable.pl
    • sbkwu.seecs.edu.pk aka PK.SBKWU.SEECS.EDU does not appear as a monitoring host in SEECS pingtable.pl
  • Should they be designated as Monitoring hosts?
  • Amber reports: Looking at this week's status report on the wiki, the nodes that are monitoring nodes at SLAC but not in the SEECS pingtable are the usual problematic nodes, and have been for months. It is possible that the SEECS team (who manage the SEECS pingtable) have changed these nodes to remote nodes to avoid hassle. She will confirm this with Kashif tomorrow and report back. If this turns out to be true, she will change these nodes to remote nodes in the SLAC pingtable as well.

The FSBD POP has high unreachability values, which is not acceptable. They are looking into it. The backhaul network is currently leased from PTCL; however, in 3-4 months they will replace it with their own network. There would be no commercial traffic on it. As a result it is expected that RTT and losses will improve drastically, so the next 6 months are important for observing the network performance.

Amber was not at the meeting so there was no update.

TULIP - Bilal

Asia as a whole looks bad; Pakistan looks good (i.e. CBG is way better than GeoIP).

For South Asia, quest.seecs.edu.pk has a high RTT and a distance of 816km for Islamabad nodes, which is because the node is in Nawabshah Karachi but has the DNS entry of SEECS. Similarly, sbkwu.seecs.edu.pk has a high error and a large distance because the node is in Quetta but its DNS entry is that of SEECS. As Anjum can see, all the nodes that are showing bad results are the ones that have unstable behavior (i.e. either they are unreachable most of the time or they have high RTTs).

Bilal needs to add the number of landmarks in the region to his Excel spreadsheet. This will be useful for his paper, i.e. for reporting typical accuracy as a function of the number of landmarks in the region.

Another interesting study would be whether using different alphas (in distance[km]=alpha*min_RTT[ms]*100[km/ms]) based on the alphas found in PingER for the various regions (see for example http://www-wanmon.slac.stanford.edu/cgi-wrap/pingtable.pl?file=alpha&by=by-node&size=100&tick=daily&year=2012&month=03&from=United+States&to=United+States&ex=none&only=all&dataset=new&percentage=any) provides much benefit compared to the single current value of alpha. To facilitate this we have added PingER groups for N.AMERICA, EUROPE, AUSTRALASIA, S.ASIA, S.AMERICA.
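The comparison could be sketched along the following lines in Python. The regional values are the quick-look medians quoted in the PingER section of these notes (Europe 0.344, N. America 0.491, Pakistan 0.163); the dictionary keys and function names are hypothetical, and the single current alpha is passed in as a parameter because its value is not recorded here.

```python
KM_PER_MS = 100.0  # from distance[km] = alpha * min_RTT[ms] * 100[km/ms]

# Quick-look median alphas from the PingER section of these notes.
# The keys are illustrative labels, not necessarily the PingER group names.
REGIONAL_ALPHA = {
    "EUROPE": 0.344,
    "N.AMERICA": 0.491,
    "PAKISTAN": 0.163,
}


def estimate_distance_km(min_rtt_ms, alpha):
    """Distance estimate from the minimum RTT for a given alpha."""
    return alpha * min_rtt_ms * KM_PER_MS


def estimate_for_region(min_rtt_ms, region, default_alpha):
    """Use the regional alpha if one is known, else the single default."""
    alpha = REGIONAL_ALPHA.get(region, default_alpha)
    return estimate_distance_km(min_rtt_ms, alpha)
```

For a landmark with a 20 ms minimum RTT, the European alpha gives about 688 km while the Pakistani alpha gives about 326 km, which shows how much the estimated distance can shift when a regional alpha replaces a single global value.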

Bilal will send the TULIP draft paper before he leaves the job.

Bilal will be submitting a document on what he did and what needs to be done in his work. Sadia will then review it.

Bilal will rerun the stress testing of North America using new reflex. He has submitted it. Did this happen or is it dead?

Possible projects

  • There can be a paper about PingER if we could just find the right conference. MCN, ICC and Globecom do cover network monitoring topics. It could talk of the various metrics and their importance (in particular MOS, alpha, max RTT, min RTT), the lessons learnt from running such a worldwide infrastructure, the uses of the data etc.
  • We can talk of geolocation experiences. For example, within Pakistan it works fine; however, as we move to regions or continents it gets worse. We can publish some stats on that, for example. We can add the impact of changing alpha. We can also indicate the importance of landmark proximity.
  • See https://confluence.slac.stanford.edu/display/IEPM/Future+Projects.
  • Extend the NODEDETAILS database to support an entry for whether the host is currently pingable.
  • Improve the PingER2 installation procedures to make them more robust. This might be something for the person(s) in Pakistan who are responsible for installing PingER2 at the Pakistani monitoring sites. They probably have found where the failures occur. Also look at the FAQ and at ping_data.pl, which has been improved to assist in debugging; could it be further improved (e.g. provide access to the httpd.conf file so one can see if it is properly configured)? There are 2 students working on the PingER archive. Is this something they could work on?
  • Fix the PingER archiving/analysis package to be IPv6 conformant (see the IEPM page "Make PingER IPV6 compliant"). A proposal will be built for an IPv6 testbed, in which they will try various transition techniques. A proposal has been prepared and submitted to PTA. Adnan is a co-PI. It is being evaluated today. A small testbed has been established in SEECS and the plan is to shift some of the network to IPv6. Bilal is one of 3 students involved with PingER and they will be involved with IPv6. They are porting the PingER archive site to use a database. They have redeveloped the archive site using Umar's documentation. They have set up a small test archive site. It covers gathering, archiving and analysis. They will design a new database. They will also try a port of PingER to IPv6.
  • Look at RRD event detection based on thresholds and how to extend it, maybe by adding the plateau algorithm. Umar's algorithm did not work in a predictable manner.
  • Provide near-realtime plots of current PingER data using getdata_all.pl/wget. It will work as a CGI script with a form to select the host, the ping size, and the time frame to plot. It will use wget or getdata_all.pl to get the relevant data and possibly RRD/smokeping to display the data.

Future meeting time - Les

Next meeting will be on Wednesday 20th June, 2012 at 8pm in US and Thursday 21st June, 2012 at 8am in Pakistan.
