Authors: R. Les Cottrell SLAC; Sadia Rehman, Amber Zeb NUST; Zafar Gilani SLAC & NUST; Umar Kalim Virginia Tech

Deployment

Monitoring Hosts

In 2010, following a series of workshops and site visits, the team at NUST and SLAC worked with Pakistan’s Education and Research Network (PERN) and Pakistani universities to put together an end-to-end (E2E) network monitoring infrastructure for PERN-connected higher education sites. By the end of 2010 they had installed the PingER monitoring tools and started gathering data at 18 sites in Pakistan. This includes 4 sites (NUST, COMSATS, PERN and NCP/Quaid-i-Azam) which have been in place for a longer time. In addition they are working on a further 8 monitoring sites. Over 2010 the number of monitoring-host/remote-host pairs (both in Pakistan) increased from about 30 to over 220.

Pakistan is divided into five regions: Islamabad, Lahore, Peshawar, Quetta and Karachi. These regions are chosen on a geographical basis and the hosts included in each region are all in close proximity. It is expected that hosts in one region show a similar behavior while accessing hosts in another region. This helps in analyzing which region has better infrastructure and connectivity.

As the map below shows, Pakistan’s backbone is laid out from North to South and West to East. The Peshawar region connects directly to Islamabad; apart from Peshawar, Islamabad connects directly only to Lahore; Lahore accesses Karachi via Faisalabad; and Quetta can only access the rest of the country via Karachi.


Map of the PERN backbone. Source: the Higher Education Commission, Pakistan.

Map of PERN2. Source: http://pern.edu.pk/images/home/download/hld.png

Map of PERN2 for the Islamabad region. Source: http://pern.edu.pk/images/lapopsislamabad.jpg

Map of Islamabad area with rough guesses at locations of the PoPs. F=FJWU (chaklala 3), B= NUST, C=AIOU, D=NDC, E=QAU. The distances are: FJWU-NUST=20km, NUST-AIOU=9.3km, AIOU-QAU=17km, QAU-FJWU=30km. It is assumed the fibres roughly follow the roads.

The difficulty of the installations has varied from site to site. The technical installation of the software has been simple and has not resulted in delays. The delays between deciding to install a monitoring host and gathering measurements from the host have mostly been due to: machine availability; getting administrative approval within the university; getting access to the concerned local people; and delays in making the DNS record entry. Note that we still do not have a DNS entry for the Lahore School of Economics, and had to enhance our software tools to accommodate this. Problems once a host starts taking data are poor power availability, lack of backup power, and lack of access to the site when there are problems that need physical access. The table below shows the history of the installations, with remarks concerning the difficulty of installation and the reliability of the host:

Table of Pakistani PingER monitoring sites: the URLs of the monitoring hosts, the dates of installation and first data gathering, the location, and remarks on installation and reliability.

| Monitoring Site | URLs | Installed (Month/year) | 1st Data Gathered (Month/year) | City | State | Remarks |
|---|---|---|---|---|---|---|
| School of Electrical Engineering and Computer Sciences, NUST | monitor.niit.edu.pk | 1/2005 | 2/2005 | Islamabad | Punjab | Reliable |
| COMSATS University | pinger.comsats.edu.pk | 2/2007 | 3/2007 | Islamabad | Punjab | Had Temporary power problems, but now OK |
| Pakistan Education Research Network PERN, Islamabad | pinger.pern.edu.pk | 4/2007 | 5/2007 | Islamabad | Punjab | Reliable |
| NCP, Quaid-e-Azam University | pinger-ncp.ncp.edu.pk | 4/2007 | 5/2007 | Islamabad | Punjab | Reliable |
| UET at Lahore | pinger.uet.edu.pk | 7/2009 | 8/2009 | Lahore | Punjab | Reliable |
| International Islamic University at Islamabad | vle.iiu.edu.pk | 4/2010 | 4/2010 | Islamabad | Punjab | Reliable |
| Lahore School of Economics | 111.68.102.40 (we still do not have a DNS registration) | 4/2010 | 8/2010 | Lahore | Punjab | Reliable, but unable to get DNS |
| University of Balochistan | pinger.uob.edu.pk | 4/2010 | 4/2010 | Quetta | Balochistan | Power problems |
| University of Arid Agriculture at Rawalpindi | pinger.uaar.edu.pk | 6/2010 | 9/2010 | Islamabad | Punjab | Lack of understanding |
| UET at Taxila | pinger.uettaxila.edu.pk | 6/2010 | 6/2010 | Islamabad | Punjab | Power for PingER machine, lack of understanding |
| Agriculture University of Peshawar | pinger.aup.edu.pk | 6/2010 | 6/2010 | Peshawar | Khyber Pakhtunkhwa | Power & backup problem |
| UET at Peshawar | pinger.nwfpuet.edu.pk | 6/2010 | 6/2010 | Peshawar | Khyber Pakhtunkhwa | Power & Backup problems |
| NED University of Engineering & Technology | npm.neduet.edu.pk | 8/2010 | 8/2010 | Karachi | Sindh | Reliable |
| Allama Iqbal Open University | pinger.aiou.edu.pk | 10/2010 | 11/2010 | Islamabad | Punjab | Sub-netting problems |
| Punjab University, ITC, Department | pinger-itc.pu.edu.pk | 10/2010 | 11/2010 | Lahore | Punjab | Reliable |
| National College of Arts | pinger.nca.edu.pk | 10/2010 | 12/2010 | Lahore | Punjab | Reliable |
| Hazara University at Mansehra | pinger.hu.edu.pk | 10/2010 | | Peshawar | Khyber Pakhtunkhwa | Machine unavailability |
| KUST at Kohat | pinger.kohat.edu.pk | 10/2010 | 12/2010 | Peshawar | Khyber Pakhtunkhwa | Power, Backup & understanding |
| FAST at Peshawar | pinger.pwr.nu.edu.pk | 10/2010 | 12/2010 | Peshawar | Khyber Pakhtunkhwa | Power, understanding |
| University of Science & Technology at Bannu | pinger.ustb.edu.pk | 10/2010 | | Peshawar | Khyber Pakhtunkhwa | Power, understanding |
| FAST at Lahore | pinger.lhr.nu.edu.pk | 11/2010 | 12/2010 | Lahore | Punjab | Power & Backup |
| University of Sindh at jamshoro | pinger.usindh.edu.pk | 12/2010 | 12/2010 | Karachi | Sindh | Reliable |
| Isra University at Hyderabad | pinger.isra.edu.pk | 12/2010 | 12/2010 | Karachi | Sindh | Reliable |
| Lahore College for Women University | pinger.lcwu.edu.pk | 12/2010 | 12/2010 | Lahore | Punjab | Reliable |
| Pakistan Education Research Network PERN, Quetta | pingerqta.pern.edu.pk | 12/2010 | 12/2010 | Quetta | Balochistan | Concerned employee on workshop/office in Islamabad, Will stable soon |

In addition, work is in progress to install monitoring hosts at: BUITMS in Quetta, Balochistan; Sindh Agriculture University Tandojam in Karachi, Sindh; NCEMB University in Lahore, Punjab; Virtual University in Lahore, Punjab; Air University in Islamabad, Federal Territory; Kinnaird College for Women in Lahore, Punjab.

The growth in the number of monitors and host pairs being monitored over 2010 is seen below (spreadsheet):

Map of sites

The locations of the Pakistani monitoring (red) and remote (red and blue) hosts are seen in the PingER maps below.



Pakistani PingER sites with lines showing the minimum RTT in msec. seen from NUST

Details of PingER sites in Islamabad, the lines show the average RTT seen from NUST

Details of PingER sites in Islamabad, the lines show the minimum RTT from PERN

It is interesting to see (right hand map above) the large difference between the minimum RTT from PERN to, say, NCP (at Quaid-i-Azam University in N.E. Islamabad), which is < 10msec (blue line, or more exactly 1.3msec), and that from PERN to NUST in the S.W. corner, which is between 40 and 80msec (red line, or more exactly 44msec). Presumably this is due to the routing of PERN connections in the Islamabad region.

It is also interesting to see (middle map above) that the RTT from NUST to the DSL site (blue line) is much lower than that to the physically closer NUST host. This host (lo-0-gw.dsl.net.pk (203.82.63.254)) is connected via a Micronet broadband connection, whereas most of the other hosts are connected via Nayatel to the PERN backbone. Nayatel in turn owns Micronet. The route from SEECS to the DSL host goes via Nayatel as far as we can trace it. Thus the link from SEECS to PERN, which goes via Nayatel, may go to a Nayatel Point of Presence at the same location as the DSL host and then to the PERN metropolitan ring. This may be the cause of the lower minimum RTT to the DSL host.

Mr. Muhammad Zeeshan contacted PERN for information regarding exact routing paths. Mr. Jawad from PERN-HEC pointed out that PERN currently relies on the PIE network infrastructure for inter-city communication. However, they are laying their own infrastructure (cable, routers) and will be able to use their own dedicated links within 4-7 months. For now PIE usually forwards traffic through Karachi. This means that traffic to Quetta goes via Karachi (as also indicated by our traceroute). It should be noted that PERN is a federated environment connecting all major universities of Pakistan, and that the intra-city routes are mostly dedicated, although all three PingER nodes of NUST-SEECS are connected via Nayatel.

Archive, analysis site

In 2010, a second instance of the SLAC archive-analysis site was set up at NUST. This provides backup for data and access, and improved performance for Pakistani users. The NUST hosts are connected (see the traceroute) to PERN via Nayatel.

Measurements

Using PingER, the monitoring hosts ping each remote host with 10 pings every 30 minutes. From this data we are able to measure minimum and average Round Trip Times (RTT), jitter, loss and unreachability (all 10 pings fail), and to derive throughput and Mean Opinion Score (MOS). The data is gathered from the monitoring sites on a daily basis by the archiving sites at NUST, SLAC and FNAL.
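
As an illustration of how the basic metrics follow from one set of 10 pings, here is a minimal Python sketch. It is not the PingER code: the function names and the sample RTT values are hypothetical, jitter is left out because its exact definition is not given here, and loss is expressed as a fraction.

```python
from statistics import mean

PINGS_PER_SET = 10   # from the text: 10 pings per remote host
SETS_PER_DAY = 48    # from the text: one set every 30 minutes


def summarize_ping_set(rtts_ms, sent=PINGS_PER_SET):
    """Summarize one ping set.

    rtts_ms holds the RTTs (in msec) of the pings that were answered;
    pings without a reply are simply absent from the list.
    """
    received = len(rtts_ms)
    return {
        "min_rtt_ms": min(rtts_ms) if rtts_ms else None,
        "avg_rtt_ms": mean(rtts_ms) if rtts_ms else None,
        "loss": (sent - received) / sent,   # fraction of pings lost
        "unreachable": received == 0,       # all pings failed
    }


def expected_daily_losses(loss_fraction):
    """Expected number of lost pings per day for one host pair."""
    return loss_fraction * PINGS_PER_SET * SETS_PER_DAY


# Hypothetical set in which 2 of the 10 pings went unanswered.
sample = summarize_ping_set([44.0, 45.5, 44.2, 46.1, 44.0, 47.3, 44.9, 45.0])
```

With 10 pings every 30 minutes a host pair sees 480 pings per day, so a loss rate of 6.6% corresponds to roughly 32 lost pings per day.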

Results

Unreachability

A host is considered unreachable if none of the pings sent to it are responded to.  To illustrate this we chose a reliable host at SLAC  (pinger.slac.stanford.edu) and analyzed the unreachability of Pakistani hosts seen from SLAC.

Table of unreachability seen from SLAC to Pakistani hosts in 2010. Higher values (bad) are colored redder. The data is sorted by increasing unreachability in Jan 2011. Spreadsheet

Chart of the unreachability of Pakistani hosts seen from SLAC Dec 2010 and Jan 2011

Smokeping examples of unreachability seen from SLAC for 120 days Oct 2010 - Jan 2011.

It is seen that several hosts exhibit high unreachability. The reasons are usually site specific and include lack of reliable power and of a backup power source, floods, lack of access to the site when problems require physical access, and, at some sites, a lack of understanding of the problem.

Minimum RTT

The minimum of the RTTs measured between Islamabad PingER hosts for each month of 2010 is seen below.


Minimum monthly RTT measured between Islamabad region PingER hosts from December 2009 through January 2011. Outliers are colored yellow. Green indicates a month-to-month change that persisted. Spreadsheet.

It is seen that there were several major changes in minimum RTT between June and December 2010. Most of these resulted in an increase in minimum RTT. Some were caused by moving the connection for a site from one PERN core router to another (e.g. COMSATS), or by a campus or host physically moving from one site to another (e.g. IIU, SEECS).

Patterns shown by min RTT

There was a big improvement in the min RTT to IIU from all sites between Sep and Oct 2010. The reason is IIU's shift to the PERN network. Its IP address is 111.68.97.162 (IP addresses starting with 111 and 121 are on the PERN network), which confirms that it is now on the PERN network. The earlier, higher min RTT was due to IIU using the public/service-provider network. A screenshot of pingtable shows an observation of this change.

The min RTT to NEDUET (Karachi) from PERN (Islamabad) and other sites is < 30ms, which is less than that from PERN to NIIT (Islamabad). This is because NIIT (Islamabad) is not on the PERN network; it uses a public network (Nayatel to be precise). Only IPs that start with 111 and 121 (i.e. of the form 111.nnn.nnn.nnn and 121.nnn.nnn.nnn) are on the PERN network. Further trends are shown in the table below. Spreadsheet here.

Trends:

  • LHR to LHR is good (green).
  • PWR/NWFP to LHR is good (green). PWR and NWFP belong to the PWR (Peshawar) region. This traffic goes via ISB region.
  • ISB shows miscellaneous patterns and this is due to nodes being on different networks and following different routes:
    • Nodes in ISB that are on PERN network show good performance (green). These are: pinger.pern.edu.pk (121.52.147.253), vle.iiu.edu.pk (111.68.97.162), pinger-ncp.ncp.edu.pk (111.68.99.142), www.pieas.edu.pk (111.68.99.199) and pinger.uaar.edu.pk (111.68.99.248).
    • Nodes not on PERN network show higher RTT (orange and red): SEECS maggie1, maggie2, monitor nodes and pinger.comsats.edu.pk (203.124.40.43).
  • We do not fully understand why pinger.comsats.edu.pk (203.124.40.43) has lower min RTT to pinger.nwfpuet.edu.pk (121.52.148.71) and pinger.pwr.nu.edu.pk (121.52.148.110) than vle.iiu.edu.pk (111.68.97.162) or maggie1.niit.edu.pk (115.186.131.81) since COMSATS and NIIT are on public network and NWFPUET, PWR and IIU are on PERN network. However some explanations that come to mind are:
    • COMSATS node experiences higher packet loss for nodes at NIIT (around 6.6%) whereas a low packet loss for NWFPUET and PWR (0.48% and 0.40%). We send 10 packets every 30 minutes so this means it loses about 32 packets every day out of 480 for NIIT and only 2 are lost for IIU, NWFPUET and PWR. This is enough to impact the average minimum RTT for the day and consequently for the entire month. Shown in figure below.
    • NIIT node is on public network and also experiences a lot of traffic. Therefore there are higher delays and thus higher min RTTs.
    • Why COMSATS has higher min RTT to IIU is an anomaly since it has lower min RTT to other PERN connected nodes such as NWFPUET and PWR.

  • NIIT has better access to NWFP than does IIU or PERN. Possible reason could be ping unreachability. Unreachability is higher from IIU (46.25%) and PERN (39.17%) as compared to NIIT (36.39%). Detailed stats here. Throughput is higher from IIU and PERN to NWFP; 5218 kbps and 7048 kbps respectively, whereas throughput from NIIT falls to 2259 kbps. Detailed stats here.
  • PWR-PWR and PWR-NWFP is good (green). PWR and NWFP belong to PWR (Peshawar) region.
  • TAX to LHR is better than TAX to ISB:
    • pinger.uettaxila.edu.pk (121.52.150.164) is on PERN network. Most nodes in Lahore are on PERN network: LSE (111.68.102.40), pinger.lcwu.edu.pk (111.68.103.135), pinger.nca.edu.pk (111.68.102.69), pinger.lhr.nu.edu.pk (111.68.102.125), pinger-itc.pu.edu.pk (111.68.103.29) and pinger.uet.edu.pk (111.68.102.14) but about half of the nodes in Islamabad such as SEECS maggie1, maggie2, monitor nodes and COMSATS are not on PERN network. However this doesn't fully explain why this is so.
    • Another inconsistency shown by stats is in inter-city routes. PERN network (shown above in Map of PERN2) is not fully deployed. In fact most of the inter-city routes aren't yet deployed. This is confirmed by the following two traceroutes of February 2, 2011. Traceroutes show that routes use public network (pie.net.pk) of Rawalpindi and Lahore telecom exchanges.
121.52.150.164 (pinger.uettaxila.edu.pk) to 111.68.102.69 (pinger.nca.edu.pk) on February 2, 2011
 3  rwp44.pie.net.pk (202.125.148.157)  7.978 ms  7.978 ms  7.937 ms
 4  lhr63.pie.net.pk (221.120.254.2)  14.655 ms  15.388 ms  15.373 ms
 5  rwp44.pie.net.pk (221.120.252.226)  14.641 ms  15.355 ms  15.847 ms
 6  lhr63.pie.net.pk (221.120.216.198)  16.697 ms  14.582 ms  17.351 ms
 7  172.31.240.34 (172.31.240.34)  14.802 ms  14.559 ms  15.279 ms
 8  172.31.252.206 (172.31.252.206)  15.270 ms  14.144 ms  98.436 ms
 9  nca.edu.pk (111.68.102.69)  98.421 ms  14.677 ms  14.650 ms
121.52.150.164 (pinger.uettaxila.edu.pk) to 111.68.102.125 (pinger.lhr.nu.edu.pk) on February 2, 2011
 3  rwp44.pie.net.pk (202.125.148.157)  12.758 ms  7.817 ms  8.080 ms
 4  lhr63.pie.net.pk (221.120.254.2)  14.264 ms  15.546 ms  15.538 ms
 5  rwp44.pie.net.pk (221.120.252.226)  14.488 ms  14.991 ms  13.693 ms
 6  lhr63.pie.net.pk (221.120.216.198)  15.735 ms  20.109 ms  49.523 ms
 7  172.31.240.38 (172.31.240.38)  50.542 ms  50.788 ms  53.875 ms
 8  172.31.252.230 (172.31.252.230)  51.799 ms  46.953 ms  42.042 ms
 9  lhr-nu.edu.pk (111.68.102.125)  14.395 ms  14.163 ms  14.391 ms
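
The check made by eye on the traceroutes above can be sketched in Python as follows. This is only an illustration: the function names are hypothetical, the "first octet is 111 or 121" test is the rule of thumb quoted in this section rather than an authoritative list of PERN address blocks, and hops are attributed to PIE purely by their hostname suffix.

```python
import re

# One traceroute hop: hop number, host name, (IP address), then RTT samples.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\((\d{1,3}(?:\.\d{1,3}){3})\)\s+(.*)$")
RTT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*ms")


def parse_traceroute(text):
    """Return a list of hops; lines that do not look like hops are skipped."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if match is None:
            continue
        hop, host, ip, rest = match.groups()
        hops.append({
            "hop": int(hop),
            "host": host,
            "ip": ip,
            "rtts_ms": [float(x) for x in RTT_RE.findall(rest)],
        })
    return hops


def on_pern(ip):
    """Rule of thumb from the text: PERN addresses start with 111 or 121."""
    return ip.split(".")[0] in {"111", "121"}


def via_pie(hop):
    """True if the hop's host name is in the pie.net.pk domain."""
    return hop["host"].endswith(".pie.net.pk")


# First traceroute above (pinger.uettaxila.edu.pk to pinger.nca.edu.pk).
TRACE = """
 3  rwp44.pie.net.pk (202.125.148.157)  7.978 ms  7.978 ms  7.937 ms
 4  lhr63.pie.net.pk (221.120.254.2)  14.655 ms  15.388 ms  15.373 ms
 5  rwp44.pie.net.pk (221.120.252.226)  14.641 ms  15.355 ms  15.847 ms
 6  lhr63.pie.net.pk (221.120.216.198)  16.697 ms  14.582 ms  17.351 ms
 7  172.31.240.34 (172.31.240.34)  14.802 ms  14.559 ms  15.279 ms
 8  172.31.252.206 (172.31.252.206)  15.270 ms  14.144 ms  98.436 ms
 9  nca.edu.pk (111.68.102.69)  98.421 ms  14.677 ms  14.650 ms
"""

hops = parse_traceroute(TRACE)
pie_hops = [h["hop"] for h in hops if via_pie(h)]
```

For this trace `pie_hops` lists hops 3 to 6, i.e. the inter-city part of the path runs over PIE routers even though both end points have PERN addresses.
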
RTT and Losses for 2010



 

The average of the minimum RTT measured between regions of Pakistan between Dec-2009 and November 2010. Spreadsheet

Various percentiles for the Inter Packet Delay Variation (IPDV or jitter) between Pakistani monitoring hosts and remote host pairs. The line shows the number of pairs with measurements contributing to the results. Spreadsheet

The blue dots are the median losses seen between all pairs of monitoring and  remote hosts for each month. The error bars show the extent of the 25 and 75 percentiles. The red dots are the number of pairs contributing to the packet loss measurements. Spreadsheet

 

The minimum RTT to Peshawar, to Islamabad and to Quetta (left hand graph) appears to have reduced dramatically after April 2010. This is partially due to bringing on new hosts that have lower RTT between them: in April there was a factor of 2 increase in the number of host pairs (this is seen in the middle and right hand graphs). Also, most of the nodes shifted to the PERN network in April and May 2010. The PERN network is a major dedicated network connecting all major universities of Pakistan. It experiences lower RTT than private ISP-owned networks since the usual public traffic is not present on its links. As mentioned earlier, PERN is also laying its own infrastructure in order to improve connectivity, especially to more remote areas such as Quetta.

If we select the same host pairs in both, say, Nov 2010 and April 2010, then the improvement in IPDV, (ipdv(Apr)-ipdv(Nov))/ipdv(Apr), is about 47%. Thus IPDV has improved for the selected host pairs; in other words, the improvement is not just due to more recently added hosts having lower IPDVs.
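
The like-for-like comparison can be sketched as below. The host-pair names and IPDV values are made up for illustration, and aggregating with the median is an assumption, since the text does not say which aggregate over the common pairs was used.

```python
from statistics import median


def relative_improvement(before, after):
    """(before - after) / before, e.g. 0.47 means a 47% reduction."""
    return (before - after) / before


def common_pair_improvement(ipdv_before, ipdv_after):
    """Compare IPDV only over host pairs present in both months.

    Both arguments map (monitor, remote) -> IPDV in msec.
    The median over the common pairs is an assumed choice of aggregate.
    """
    common = sorted(set(ipdv_before) & set(ipdv_after))
    if not common:
        raise ValueError("no host pairs in common")
    before = median(ipdv_before[pair] for pair in common)
    after = median(ipdv_after[pair] for pair in common)
    return relative_improvement(before, after)


# Hypothetical data: pair (c, d) only exists in November and is ignored.
ipdv_apr = {("a", "b"): 2.0, ("a", "c"): 4.0, ("b", "c"): 6.0}
ipdv_nov = {("a", "b"): 1.0, ("a", "c"): 2.0, ("c", "d"): 0.5}
improvement = common_pair_improvement(ipdv_apr, ipdv_nov)
```

Restricting the comparison to the common pairs removes the effect of newly added hosts, which is the point made in the paragraph above.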

Throughput (for more information on how throughput is derived please click here).

We derive the throughput from the loss and RTT measurements as throughput = 1460*8[bits]/(RTT[msec]*sqrt(loss)) kbits/s. 
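
A minimal Python sketch of this formula is given below. It assumes that loss is expressed as a fraction (0.01 for 1%) and that a zero loss, for which the formula diverges, is rejected rather than given some conventional value; the function name is hypothetical.

```python
from math import sqrt

MSS_BYTES = 1460  # segment size used in the formula above


def derived_throughput_kbps(rtt_ms, loss):
    """throughput = 1460*8 / (RTT * sqrt(loss)) in kbits/s.

    rtt_ms is the RTT in msec; loss is the loss fraction (assumption).
    Bits divided by msec gives kbits/s directly.
    """
    if rtt_ms <= 0:
        raise ValueError("RTT must be positive")
    if loss <= 0:
        raise ValueError("formula is undefined for zero loss")
    return MSS_BYTES * 8 / (rtt_ms * sqrt(loss))


# Example: 100 msec RTT with 1% loss.
example_kbps = derived_throughput_kbps(100.0, 0.01)
```

For a 100 msec RTT and 1% loss this gives about 1168 kbits/s; halving the RTT doubles the derived throughput, while the loss has to fall by a factor of 4 to have the same effect.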

Median derived throughput (blue line) with the 25th and 75th percentiles seen from SLAC to hosts in Pakistan from Dec 2003 - Dec 2010. The number of hosts being monitored in Pakistan is shown by the brown line. Spreadsheet

Derived throughput between Pakistani regions in 2010. Spreadsheet

The derived throughput seen from SLAC in the graph on the left, has increased by roughly a factor of 2 in 5 years. Within Pakistan (graph on the right) the throughput to Quetta is the poorest, followed by Karachi. Since most monitoring hosts are in the North of Pakistan, in particular in Islamabad, there are mainly long RTTs to Karachi and Quetta and hence low throughput (since throughput goes as 1/RTT).

Mean Opinion Score (MOS)

The telecommunications industry uses the Mean Opinion Score (MOS) as a voice quality metric. The values of the MOS are: 1= bad; 2=poor; 3=fair; 4=good; 5=excellent. A typical range for Voice over IP is 3.5 to 4.2 (see VoIPtroubleshooter.com). In reality, even a perfect connection is impacted by the compression algorithms of the codec, so the highest score most codecs can achieve is in the 4.2 to 4.4 range. Using the RTT, loss and jitter we derive the MOS.
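
The exact formula used for the derivation is not given here. The Python sketch below uses a widely quoted simplification of the ITU-T E-model (an R factor reduced by delay and loss, then mapped to MOS); treating it as the calculation behind the graphs that follow is an assumption, as are the function name and the use of the average RTT as the latency term.

```python
def mos_from_ping(avg_rtt_ms, jitter_ms, loss_percent):
    """Approximate MOS from RTT (msec), jitter (msec) and loss (percent).

    Simplified E-model: delay and loss lower the R factor, which is then
    converted to MOS with the standard R-to-MOS mapping.
    """
    effective_latency = avg_rtt_ms + 2.0 * jitter_ms + 10.0
    if effective_latency < 160.0:
        r = 93.2 - effective_latency / 40.0
    else:
        r = 93.2 - (effective_latency - 120.0) / 10.0
    r -= 2.5 * loss_percent
    r = max(0.0, min(100.0, r))  # keep R inside the range of the mapping
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


good_link = mos_from_ping(avg_rtt_ms=50.0, jitter_ms=5.0, loss_percent=0.0)
lossy_link = mos_from_ping(avg_rtt_ms=50.0, jitter_ms=5.0, loss_percent=5.0)
```

In this approximation the score is capped well below 5 even for a loss-free, low-delay path, in line with the 4.2 to 4.4 ceiling mentioned above, and each percent of loss costs 2.5 points of R.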


Median MOS and Inter Quartile Range (IQR) between Pakistani hosts for 2010. Spreadsheet.

MOS between Pakistani regions

MOS for fixed set of Pakistani hosts by region

It is apparent that the MOS is very variable and appears to be decreasing (getting worse) with time (see left hand and middle graphs). Some of this decrease is due to bringing on new hosts that have poorer MOS performance. If we aggregate the performance only for host pairs that have been monitored for the whole period we get the graph on the right. This set of hosts consists of: PK.NEDUET.EDU.N1, PK.COMSATS.EDU.N2, PK.NCP.EDU.N3, PK.NIIT.EDU.N2, PK.NIIT.EDU.N7, PK.AUP.EDU.N2, PK.PERN.EDU.N1, PK.UET.EDU.N2 and PK.LSE.EDU.N3. In any case the MOS is well above the threshold of 3.5 mentioned above, so VoIP calls within Pakistan between these hosts should be successful.

Alpha

The speed of light in fibre is roughly 0.66*c (where c is the speed of light in vacuum). Using 300,000km/s as c, light in fibre covers about 200km per msec, and since the RTT covers the path in both directions this yields a distance derived from the round trip time, RTD[km]=100[km/msec]*minimum_RTT[msec], as a way to estimate the distance between the two hosts making the minimum RTT measurement. This assumes the minimum RTT is only affected by the transmission of light in the fibre (i.e. no delays due to network devices such as routers) and that the fibre route is direct (a great circle route) between the two hosts. The use of minimum RTT is meant to eliminate most network device delays for reasonably fast circuits (e.g. at 100Mbits/s, assuming no queuing, the router delay is ~ 0.12msec). To accommodate these extra delays and indirect routes one introduces a factor alpha, so that RTD[km]=alpha*100[km/msec]*minimum_RTT[msec], where RTD is now the actual great circle distance between the hosts. Values of alpha close to one indicate a direct path, and small values usually indicate a very indirect path. This assumes no queuing and minimal network device delays. The chart below shows the alpha values between regions in Pakistan. It is based on the minimum RTTs seen between Dec 2009 and Nov 2010.
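
As a sketch of the calculation, alpha can be obtained by dividing the great circle distance between two hosts by 100[km/msec]*minimum_RTT[msec]. In the Python below the function names and the spherical Earth radius of 6371km are assumptions, and RTD is taken to be the one-way great circle distance between the hosts, consistent with light in fibre covering about 200km per msec.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius, spherical Earth (assumption)
KM_PER_MS = 100.0         # km of host separation per msec of RTT, from the text


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great circle distance between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def alpha(distance_km, min_rtt_ms):
    """alpha = distance / (100 km/msec * minimum RTT)."""
    if min_rtt_ms <= 0:
        raise ValueError("minimum RTT must be positive")
    return distance_km / (KM_PER_MS * min_rtt_ms)


# Hosts 1000 km apart: a 10 msec minimum RTT is the direct-fibre limit,
# while a 20 msec minimum RTT gives alpha = 0.5, i.e. a less direct path.
direct = alpha(1000.0, 10.0)
indirect = alpha(1000.0, 20.0)
```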

Average Alpha measured between regions of Pakistan with the standard deviations (as error bars) and the number of host pairs contributing to the measurement. Spreadsheet

It is seen that the links between Karachi and Lahore, Karachi and Islamabad, and Karachi and Peshawar are very direct (values of alpha close to one) and are also very consistent (low values of the standard deviations). Islamabad and Quetta apparently are connected very indirectly (low value of alpha). Looking at the PERN backbone map at the top, this makes sense since the route goes via Karachi in the South and then back northwards to Quetta. The links between Islamabad and Lahore, Islamabad and Peshawar, and Lahore and Peshawar all have lower values of alpha and thus appear to be more indirect and have higher variability. A common element in the links between these three regions is that they all pass through Islamabad (see PERN backbone map at top).

Islamabad's intra-city traffic experiences multiple hops (within a few square kms) from source to destination. Moreover, outbound Islamabad traffic also experiences a slightly indirect route (multiple hops). Traffic passing between Peshawar and Lahore shows a much more direct route. This can be seen by comparing the LHR-ISL, ISL-PSH and LHR-PSH alpha values: among these three LHR-PSH is the highest (indicative of directness) despite the fact that it passes through Islamabad.

UETTaxila case study

UETTaxila has shown high RTT and unreachability ever since the node at UET, Taxila came online. A complete case study on this can be found here.

Conclusions

  • An extensive end-to-end network monitoring infrastructure has been set up for PERN-connected universities in Pakistan. Over the last year it has grown from 30 monitoring-remote node pairs to over 500, covering most of the major universities in the main regions of Pakistan.
  • At some sites, installation and start up of monitoring hosts was delayed by weak local support. 
  • There is a great deal of variability in the reliability (unreachability) of hosts. Much of this is due to loss of power. An effort needs to be made to understand and improve power reliability and the provision of backup for several sites.
  • Given the measured MOS, VoIP tools such as Skype should work well between PERN connected hosts.
  • The poor throughput performance to Quetta is understandable. More work needs to be done to understand why Karachi looks so bad.
  • The low values of alpha lead to the conclusion that there may be a lot of indirect routing in the Islamabad region. Further work with PERN is required to see if this can be remedied.
  • The configuration of the PERN network naturally changes with time. For example, compare the maps http://pern.edu.pk/images/lapopsislamabad.jpg and http://pern.edu.pk/images/home/download/hld.png, where CIIT (COMSATS) is shown as connected to the QAU router in one and to the HEC router in the other. To keep an independent record of the topology we need to measure traceroutes on a regular basis, and PERN needs to provide the addresses and locations of the routers.

Acknowledgements

We acknowledge the patient and persistent efforts by Muhammad Zeeshan and Fahad Gilani of NUST to spearhead the installations of PingER at the various sites. Anjum Naveed and Adnan Kiani led the efforts at NUST. The PingER data was collected and analyzed by Les Cottrell of SLAC and Amber Zeb and Sadia Rehman of NUST. A few PingER tools and procedures were enhanced at NUST by Zafar Gilani. The PingER map tool was developed by Faisal Zahid while at SLAC and turned out to be extremely effective in drilling down and understanding the connections. The Smokeping tool was developed by Fahad Satti while at SLAC. Umar Kalim of Virginia Tech provided support for this year and has been spearheading the effort for the last several years. We also acknowledge the continued encouragement and support from Arshad Ali of NUST.
