Agenda of SLAC SEECS Meeting March 27th, 2012

General

Amber and Sadia's last day at SLAC is Tuesday 3/26/2012. The next meeting will probably be Tuesday April 24th, since Amber & Sadia will not be back in Pakistan until around April 18th, and Les will be on vacation in England from April 8th-22nd.

Anjum proposed that he will try to find some MS students who might convert CBG from MATLAB to Mathematica as their MS thesis. Students have to finalize their thesis by 6th April. By the second week of April, we will know whether any student is doing this.

Maggie is not working because there is no disk space. This might be a reason for traceroutes not working.

IPv6 - Anjum and Ghulam

Les has the passwords for monitor and the IPv6 host at SEECS. He has successfully logged on.  When he tries to execute a traceroute to ipv6.google.com he gets:

Executing exec(traceroute6, -m 30 -q 3, 2404:6800:4009:800::1014, 140)
traceroute to 2404:6800:4009:800::1014 (2404:6800:4009:800::1014), 30 hops max, 140 byte packets
1  2001:4538:101:3::14 (2001:4538:101:3::14)  3021.987 ms !H  3021.983 ms !H  3021.971 ms !H

Les is unclear if he should see an external (to SEECS) host and has sent email.  He has had no response. He needs to retest, but has had no time.
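For the retest, it may help to recall that in standard traceroute output the `!H` annotation marks a host-unreachable ICMP reply, and that a `!H` on the very first hop usually suggests the probe did not get past the local network. The following is a minimal Python sketch, assuming output in the format shown above, that flags such hops automatically. The function name `unreachable_hops` is hypothetical and not part of any existing PingER script.

```python
import re

# Matches a hop line such as:
#  " 1  2001:4538:101:3::14 (2001:4538:101:3::14)  3021.987 ms !H ..."
# The header line ("traceroute to ...") does not start with a hop number,
# so it is skipped.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\S+)\)(.*)$")


def unreachable_hops(traceroute_output):
    """Return (hop_number, address) for every hop annotated with !H."""
    hits = []
    for line in traceroute_output.splitlines():
        match = HOP_RE.match(line)
        if match and "!H" in match.group(4):
            hits.append((int(match.group(1)), match.group(3)))
    return hits


sample = (
    "traceroute to 2404:6800:4009:800::1014 (2404:6800:4009:800::1014), "
    "30 hops max, 140 byte packets\n"
    " 1  2001:4538:101:3::14 (2001:4538:101:3::14)  "
    "3021.987 ms !H  3021.983 ms !H  3021.971 ms !H\n"
)
print(unreachable_hops(sample))
```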

PingER Updates

ping_data_plot.pl and frequency.pl now work. The former provides time series of min, avg, max and losses to selected hosts seen from SLAC for the last week (including today) and also frequency distributions. They also provide forms to select what to look at and how. See for example http://www-wanmon.slac.stanford.edu/cgi-wrap/ping_data_plot.pl?sites=www.multinet.af&begin_day=20&begin_month=3&begin_year=2012&end_day=26&end_month=3&end_year=2012
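The example URL shows which query parameters the form submits. As a sketch only, a link for another site or date range could be assembled as below. The parameter names are taken from the example URL; the helper name `ping_data_plot_url` is hypothetical, and the script may accept further parameters not shown here.

```python
from datetime import date
from urllib.parse import urlencode

BASE_URL = "http://www-wanmon.slac.stanford.edu/cgi-wrap/ping_data_plot.pl"


def ping_data_plot_url(site, begin, end):
    """Build a ping_data_plot.pl URL for one site and an inclusive date range."""
    params = [
        ("sites", site),
        ("begin_day", begin.day),
        ("begin_month", begin.month),
        ("begin_year", begin.year),
        ("end_day", end.day),
        ("end_month", end.month),
        ("end_year", end.year),
    ]
    return BASE_URL + "?" + urlencode(params)


# Reproduces the example link for 20-26 March 2012.
print(ping_data_plot_url("www.multinet.af", date(2012, 3, 20), date(2012, 3, 26)))
```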

HEC Report - Anjum, Amber and Imdad

The report is in Case studies at PERN Six monthly report.
Imdadullah will be writing an executive summary of the report which needs to be submitted before the end of this month. Now the report goes to other universities as well as to the head of HEC. We now need to be very careful in writing the report.
The report was supposed to be submitted in the last week of February. A few changes were made in the conclusions and an executive summary. Updates?

Anjum is making final changes to the summary report. He will be sending it to HEC tomorrow. He will also send us a copy.

Pakistani Hosts

The Pakistani IPs which start with 111.68.96.xxx are routed in such a way that data can go out but cannot come back to Pakistan. They will change the IPs of all POP nodes. This arrangement was made for NCP; only 0.5 MB of that link is provided to other universities. Any node with a 111.68.96.xxx address will be given a new IP.

There were 12 nodes with this problem which will be solved in the next week. SEECS will give them the IP addresses, and then HEC will change the IP of these nodes.
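To check a node list for affected addresses, something like the following sketch could be used. It assumes that "111.68.96.xxx" means the whole 111.68.96.0/24 block; the node names and addresses are hypothetical placeholders, not the real POP nodes.

```python
import ipaddress

# Assumption: "111.68.96.xxx" covers the entire 111.68.96.0/24 block.
AFFECTED_BLOCK = ipaddress.ip_network("111.68.96.0/24")


def needs_new_ip(address):
    """Return True if the address lies in the affected block."""
    return ipaddress.ip_address(address) in AFFECTED_BLOCK


# Hypothetical nodes, for illustration only.
nodes = {
    "pop-a.example": "111.68.96.10",
    "pop-b.example": "111.68.97.10",
}
to_renumber = sorted(name for name, ip in nodes.items() if needs_new_ip(ip))
print(to_renumber)
```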

FSBD and MTN POP have high unreachability values, which is not acceptable. They are looking into it. The backhaul network is currently leased from PTCL; however, in 3-4 months they will replace it with their own network, which will carry no commercial traffic. As a result, RTT and losses are expected to improve drastically, so the next 6 months are important for observing the network performance.

HEC is changing the DNS names of these hosts. We expected to hear from them before this meeting, but have not.

PingER Archive Site - Ghulam 

Current Schema : see [here|IEPM:Pinger+PerfSonar schema].

Ghulam will modify the script to gather data every half hour and put the analyzed data in a separate SQL table. Any progress?
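One way the half-hourly step could look is sketched below with SQLite. This is not the archive site's schema: the table names `raw_ping` and `analyzed_ping`, their columns, and the sample rows are all hypothetical, and the sketch assumes that "half hourly" means summarising raw samples into 30-minute buckets.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE raw_ping (src TEXT, dst TEXT, ts INTEGER, rtt_ms REAL);
CREATE TABLE analyzed_ping (
    src TEXT, dst TEXT, bucket_start INTEGER,
    min_rtt REAL, avg_rtt REAL, max_rtt REAL, samples INTEGER
);
""")

# Hypothetical raw samples: (source, destination, Unix time, RTT in ms).
rows = [
    ("a.example", "b.example", 0, 10.0),
    ("a.example", "b.example", 600, 20.0),
    ("a.example", "b.example", 1800, 30.0),
]
conn.executemany("INSERT INTO raw_ping VALUES (?, ?, ?, ?)", rows)

# Integer division maps each timestamp to its 1800-second (half-hour) bucket.
conn.execute("""
INSERT INTO analyzed_ping
SELECT src, dst, (ts / 1800) * 1800,
       MIN(rtt_ms), AVG(rtt_ms), MAX(rtt_ms), COUNT(*)
FROM raw_ping
GROUP BY src, dst, ts / 1800
""")

summary = conn.execute(
    "SELECT bucket_start, min_rtt, avg_rtt, max_rtt, samples "
    "FROM analyzed_ping ORDER BY bucket_start"
).fetchall()
print(summary)
```

Keeping the analyzed rows in their own table means queries for plots and tables never have to scan the raw samples.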

Sadia is working on getdata.pl to shift the data from SLAC files to database.

Future concerns (to be considered once the performance of the above monthly aggregated data is observed). We await a working version before this can start.

  1. How to store raw data for one year
  2. How it should be sharded
  3. How long data should stay in the database
  4. Ghulam thinks it will speed up our work if we remove the unused columns (metrics) of raw data from the pingtable. Only those fields will be left that are in perfsonar.

Sadia: Adding max RTT, MOS and alpha to pingtable.pl and the analyze scripts

In pingtable.pl at SEECS, the alpha value for some links was 200. Sadia will fix this when she returns.

Sadia has modified analyze hourly, analyze daily and analyze-monthly, analyze-yearly for Max RTT, MOS & alpha. She has run the jobs in batch for all days back to 1998. She has copied the new metrics from the new daily and hourly files to the master aggregated files in the /hep folder. She has updated pinger.new.cf to add the new metrics for pingtable.

The output looks good. In particular, the alpha values are very interesting, and we should use these in TULIP (see below). The results may also be useful for identifying pairs of sites in Pakistan that have very indirect connections. We do have groups for KARACHI_REGION etc. that should assist in this.

Ghulam suggests removing metrics that are not required from the raw data files of pingtable.pl. Sadia thinks it won't make any difference unless the number of lines being read is huge. This is not resolved.

Ghulam thinks null values in the data table are responsible for getdata.pl not working. We are unable to see any extra data in the data table. Ghulam will look into getdata.pl again to get it working. Les sent him the alpha subroutine in case he needs it. Les also sent Sadia the latest version of pingtable.pl, which she will send to Ghulam so that he can make it work with getdata.pl. Ghulam sent Sadia his version of getdata.pl.

TULIP - Bilal

Trying stress testing with reflector instead of reflex. Results are available at [Target Data for reflector tier all.|https://confluence.slac.stanford.edu/display/IEPM/Target+Data+using+reflector+with+Tier+All] The results also have a comparison of reflex and reflector error in terms of distance. Repeat this for Europe. Europe region stress testing is completed.

Bilal seems to be confused about how reflex.cgi uses the tier 0 landmarks to find the region for the target.

There are many more landmarks, e.g. there are now 54 working landmarks in Europe (50 are PlanetLabs). Sadia and Les have found more landmarks in the perfSONAR database. In particular there look to be very useful landmarks in 9 countries (JP, BR, CN, TW, TH, QU, IN, Paraguay, KR) with 2 landmarks/country. Sadia has added these.

Bilal needs to add the number of landmarks in the region to his Excel spreadsheet. This will be useful for his paper, i.e. for reporting typical accuracy as a function of the number of landmarks in the region.

Another interesting study would be whether using different alphas (in distance[km]=alpha*min_RTT[ms]*100[km/ms]) based on the alphas found in PingER for the various regions (see for example http://www-wanmon.slac.stanford.edu/cgi-wrap/pingtable.pl?file=alpha&by=by-node&size=100&tick=daily&year=2012&month=03&from=United+States&to=United+States&ex=none&only=all&dataset=new&percentage=any) provides much benefit compared to the single current value of alpha. To facilitate this we have added PingER groups for N.AMERICA, EUROPE, AUSTRALASIA, S.ASIA, S.AMERICA.
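The comparison could be prototyped along the lines of the sketch below, which applies the relation distance[km] = alpha * min_RTT[ms] * 100[km/ms] with either a per-region alpha or a single default. The alpha values here are hypothetical placeholders chosen for illustration; real values would have to come from the PingER alpha tables, and the function name `estimate_distance_km` is not part of TULIP.

```python
KM_PER_MS = 100.0  # constant from distance[km] = alpha * min_RTT[ms] * 100[km/ms]

# Hypothetical per-region alphas, for illustration only.
REGION_ALPHA = {"N.AMERICA": 0.75, "EUROPE": 0.25}
DEFAULT_ALPHA = 0.5  # hypothetical stand-in for the single current value


def estimate_distance_km(min_rtt_ms, region=None):
    """Estimate distance from minimum RTT, using a regional alpha if known."""
    alpha = REGION_ALPHA.get(region, DEFAULT_ALPHA)
    return alpha * min_rtt_ms * KM_PER_MS


# Same 20 ms minimum RTT, estimated with the regional and the single alpha.
regional = estimate_distance_km(20.0, "N.AMERICA")
single = estimate_distance_km(20.0)
print(regional, single)
```

Running both estimates against targets of known location would show how much of the error is due to the choice of alpha.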

Bilal will be sending the TULIP draft paper by the end of February. This was delayed by a family emergency.

Sadia has made sure that TULIP at SLAC works without CBG.

Bilal will rerun the stress testing of North America using new reflex.

Bilal is writing a report with the stress testing results for all of the regions. He will share it when it is complete, probably in 2-3 days.

PingER landmarks are down in South Asia, so he is not including South Asia in the report. He will also look into why there are fewer landmarks in South Asia.

Possible projects

  • There could be a paper about PingER if we could just find the right conference. MCN, ICC and Globecom do cover network monitoring topics. It could discuss the various metrics and their importance (in particular MOS, alpha, max RTT and min RTT), the lessons learnt from running such a worldwide infrastructure, the uses of the data etc.
  • We can talk about geolocation experiences. For example, within Pakistan it works fine; however, as we go to regions or continents it gets worse. We can publish some stats on that, for example. We can add the impact of changing alpha. We can also indicate the importance of landmark proximity.
  • See [https://confluence.slac.stanford.edu/display/IEPM/Future+Projects].
  • Extend the NODEDETAILS database to support an entry for whether the host is currently pingable.
  • Improve the PingER2 installation procedures to make them more robust. This might be something for the person(s) in Pakistan who are responsible for installing PingER2 at the Pakistani monitoring sites. They have probably found where the failures occur. Also look at the FAQ and at ping_data.pl, which has been improved to assist in debugging. Could it be further improved (e.g. by providing access to the httpd.conf file so one can see if it is properly configured)? There are 2 students working on the PingER archive. Is this something they could work on?
  • [Fix PingER archiving/analysis package to be IPv6 conformant|IEPM:Make PingER IPV6 compliant]. Will build a proposal for an IPv6 testbed and try various transition techniques. A proposal has been prepared and submitted to PTA. Adnan is a co-PI. It is being evaluated today. A small testbed has been established in SEECS and the plan is to shift some of the network to IPv6. Bilal is one of 3 students involved with PingER, and they will be involved with IPv6. They are porting the PingER archive site to use a database. They have redeveloped the archive site using Umar's documentation and have set up a small test archive site with gathering, archiving and analysis. They will design a new database. They will also try a port of PingER to IPv6.
  • Look at RRD event detection based on thresholds and how to extend it, maybe by adding the plateau algorithm. Umar's algorithm did not work in a predictable manner.
  • Provide near realtime plots of current pinger data using getdata_all.pl/wget. It will work as a CGI script with a form to select the host, the ping size, and the time frame to plot. It will use wget or getdata_all.pl to get the relevant data and possibly RRD/smokeping to display the data.

Smokeping Realtime graphs - Amber

The task involves three steps:

  1. Saving the latest data of the request sent. This required changes in connectivity.pl and getdata-all.pl.
  2. Converting this latest data file to an RRD file. This required changes in SelectConvSrcDest1.pl and PingERtoSmokeping.pm located at /afs/slac.stanford.edu/package/pinger/smokeping. Amber created a new script named PingERtoSmokeping-Amber.pl for this.
  3. Generating the smokeping graph from the RRD file. This required changes to graph.pl. Amber created a new file named graph-Amber.pl for this.

Currently, Amber is unable to test whether the .rrd file generated by this script is in the right format. She is looking into it.

Fahad worked with Amber last week to make smokeping graphs for the latest data. They tested the code and found that the .rrd file is not generated correctly. They don't know how to test the .rrd file.

Les asked them to leave this for now because it is taking too much time.
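If this is picked up again, one possible check is to run `rrdtool info` on the generated file and on a known-good Smokeping RRD, then compare the data sources the two report. The sketch below assumes rrdtool is installed and on the PATH, and that its output contains lines of the form `ds[name].type = "TYPE"`. The helper names and the sample excerpt are hypothetical.

```python
import re
import subprocess


def rrd_info(path):
    """Run `rrdtool info` on a file and return its text output."""
    result = subprocess.run(
        ["rrdtool", "info", path], capture_output=True, text=True, check=True
    )
    return result.stdout


def data_sources(info_text):
    """Map each data-source name to its type, parsed from `rrdtool info` text."""
    pattern = re.compile(r'^ds\[(\w+)\]\.type = "(\w+)"$', re.MULTILINE)
    return dict(pattern.findall(info_text))


# Hypothetical excerpt of `rrdtool info` output, for illustration only.
sample = 'step = 300\nds[loss].type = "GAUGE"\nds[median].type = "GAUGE"\n'
print(data_sources(sample))
```

Comparing `data_sources(rrd_info(generated))` with `data_sources(rrd_info(reference))` would show at a glance whether the converter produced the data sources Smokeping expects.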

Amber is now working on ping_data_plot.pl to generate gnuplots for the latest data.

Future meeting time - Les

The next meeting is on Tuesday 24th April, 2012 at 8:00 pm in the US, which is Wednesday 25th April, 2012 at 8:00 am in Pakistan.
