Case Study NIIT Micronet DSL Link

We observed that the path from the TULIP landmark monitor.niit.edu.pk to nasa.nexlinx.net.pk (alias lg.nexlinx.net.pk) was sometimes seeing very large RTTs (500-1000ms) for extended periods, compared to those (~50ms) from other landmarks at COMSATS and NCP in Pakistan. We ran top on monitor.niit.edu.pk, and the extended RTTs did not seem to be related to what was running on the host. Then we ran traceroute in the two states (~50ms RTT and 500-1000ms RTT), see below. The routes do not change, but the RTTs to the second hop and beyond are badly elongated. The fact that the first hop is not elongated means, I think, that the problem is not caused by the monitor host itself. Thus the congestion appears to be occurring in the network. In fact, from the traceroutes it appears to be most noticeable at the second hop, which is labeled as DSL.

jerrod@www:~$ traceroute lg.nexlinx.net.pk
traceroute to nasa.nexlinx.net.pk (202.59.80.52), 30 hops max, 38 byte packets
 1  203.99.50.201 (203.99.50.201)  0.894 ms  0.940 ms  0.985 ms
 2  lo-0-bras1.dsl.net.pk (203.82.63.253)  51.178 ms  71.427 ms  97.958 ms
 3  g9-10-iba.nayatel.pk (203.82.48.165)  47.867 ms  56.102 ms  47.368 ms
 4  58-65-175-217.nayatel.pk (58.65.175.217)  88.079 ms  49.389 ms  48.004 ms
 5  f0-gw3.nayatel.pk (58.65.166.250)  48.094 ms f1-gw3.nayatel.pk (58.65.166.254)  48.716 ms  49.531 ms
 6  rwp44.pie.net.pk (202.125.155.93)  50.012 ms  53.399 ms  52.011 ms
 7  rwp44.pie.net.pk (202.125.148.163)  50.035 ms  49.537 ms  50.021 ms
 8  pos1-1.lhe63gsrc1.pie.net.pk (202.125.159.25)  59.919 ms  63.638 ms  63.938 ms
 9  lhr63.pie.net.pk (202.125.138.169)  53.932 ms  55.435 ms  56.104 ms
10  lhr63.pie.net.pk (202.125.147.6)  56.008 ms  81.429 ms  55.995 ms
11  * * *
12  nasa.nexlinx.net.pk (202.59.80.52)  58.410 ms  69.452 ms  57.957 ms

jerrod@www:~$ traceroute lg.nexlinx.net.pk
traceroute to nasa.nexlinx.net.pk (202.59.80.52), 30 hops max, 38 byte packets
 1  203.99.50.201 (203.99.50.201)  0.775 ms  0.720 ms  0.837 ms
 2  lo-0-bras1.dsl.net.pk (203.82.63.253)  470.776 ms  538.449 ms  528.146 ms
 3  g9-10-iba.nayatel.pk (203.82.48.165)  556.081 ms  504.211 ms  601.765 ms
 4  58-65-175-217.nayatel.pk (58.65.175.217)  618.138 ms 58-65-175-221.nayatel.pk (58.65.175.221)  729.738 ms  693.730 ms
 5  f1-gw3.nayatel.pk (58.65.166.254)  766.243 ms  826.162 ms  755.736 ms
 6  rwp44.pie.net.pk (202.125.155.93)  746.828 ms  761.153 ms  800.322 ms
 7  rwp44.pie.net.pk (202.125.148.163)  826.194 ms  899.759 ms  854.293 ms
 8  pos1-1.lhe63gsrc1.pie.net.pk (202.125.159.25)  1080.374 ms  983.870 ms  1010.345 ms
 9  lhr63.pie.net.pk (202.125.138.169)  896.189 ms  905.808 ms  845.805 ms
10  lhr63.pie.net.pk (202.125.147.6)  874.788 ms  987.723 ms  940.307 ms
11  * * *
12  nasa.nexlinx.net.pk (202.59.80.52)  863.136 ms  956.115 ms  893.514 ms
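
The comparison of the two traceroutes can also be done mechanically. The following is a minimal sketch, not the method used for this study: it takes the median RTT per hop in each state and reports the first hop whose median grew by more than a chosen factor. The function names and the factor of 3 are assumptions made for illustration; only the first three hops of each trace are included as sample data.

```python
import re
from statistics import median

# Hop lines start with a hop number; RTT samples look like "51.178 ms".
HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
RTT_RE = re.compile(r"(\d+\.\d+) ms\b")

GOOD = """\
 1  203.99.50.201 (203.99.50.201)  0.894 ms  0.940 ms  0.985 ms
 2  lo-0-bras1.dsl.net.pk (203.82.63.253)  51.178 ms  71.427 ms  97.958 ms
 3  g9-10-iba.nayatel.pk (203.82.48.165)  47.867 ms  56.102 ms  47.368 ms
"""

BAD = """\
 1  203.99.50.201 (203.99.50.201)  0.775 ms  0.720 ms  0.837 ms
 2  lo-0-bras1.dsl.net.pk (203.82.63.253)  470.776 ms  538.449 ms  528.146 ms
 3  g9-10-iba.nayatel.pk (203.82.48.165)  556.081 ms  504.211 ms  601.765 ms
"""


def hop_medians(trace_text):
    """Return {hop number: median RTT in ms} for hops that answered."""
    medians = {}
    for line in trace_text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header line or wrapped output
        rtts = [float(value) for value in RTT_RE.findall(match.group(2))]
        if rtts:  # hops that print "* * *" have no samples
            medians[int(match.group(1))] = median(rtts)
    return medians


def first_inflated_hop(good, bad, factor=3.0):
    """First hop whose median RTT grew by more than `factor`, else None."""
    for hop in sorted(set(good) & set(bad)):
        if bad[hop] > factor * good[hop]:
            return hop
    return None


print(first_inflated_hop(hop_medians(GOOD), hop_medians(BAD)))
```

On the sample lines this points at hop 2, the DSL hop, in agreement with the reading of the traceroutes above.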

Next we ran 60,000 100-byte pings, with a separation of 1 second between pings, from monitor.niit.edu.pk to nasa.nexlinx.net.pk, starting at about 10:30pm on February 16, Pakistan time. The summary appears as:

--- nasa.nexlinx.net.pk ping statistics ---
60000 packets transmitted, 59741 received, +33 errors, 0% packet loss, time 60077854ms
rtt min/avg/max/mdev = 55.763/92.539/2099.976/136.408 ms, pipe 3

Within the roughly 16.7 hours of pings there were two instances of "Destination Host Unreachable". Each of these lasted about 30 seconds. I suspect that these were caused by DSL resyncs. The events are shown as negative losses in the time series below.
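
The "0% packet loss" in the summary is worth unpacking, since ping appears to report loss as a whole percentage. A small sketch that recomputes the loss and duration from the summary line above (the regular expression and variable names are illustrative, not part of any tool used here):

```python
import re

# Summary line copied from the ping output above.
SUMMARY = ("60000 packets transmitted, 59741 received, +33 errors, "
           "0% packet loss, time 60077854ms")


def parse_summary(line):
    """Extract (sent, received, elapsed_ms) from a ping summary line."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) received.*time (\d+)ms", line)
    if match is None:
        raise ValueError("not a ping summary line")
    sent, received, elapsed_ms = (int(group) for group in match.groups())
    return sent, received, elapsed_ms


sent, received, elapsed_ms = parse_summary(SUMMARY)
lost = sent - received
loss_pct = 100.0 * lost / sent
hours = elapsed_ms / 3_600_000  # milliseconds per hour

print(f"lost {lost} of {sent} ({loss_pct:.2f}%), duration {hours:.2f} h")
# prints: lost 259 of 60000 (0.43%), duration 16.69 h
```

So 259 pings went unanswered. At one ping per second, two outages of about 30 seconds each would account for roughly 60 of them.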

We imported the ping data into Excel and looked at the time series:

There are many extended periods with large RTTs which are NOT associated with packet loss. A frequency histogram of the RTTs is seen below:

Note the log-log scale. Due to the long-range behavior of RTTs, the distribution is expected to be heavy-tailed, following a straight line to the right of the peak. It clearly does not show this type of distribution, being fairly flat from 200 to 1000ms. This may be partly caused by the DSL router buffers, which are typically configured to be very large (presumably to reduce packet loss). We can estimate the buffer size by taking the queuing time as maximum RTT minus minimum RTT and the link speed from NIIT to off-site as 2 Mbits/s, which gives about 256KBytes of buffering.
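
The buffer estimate is simply queuing delay times link speed, converted to bytes. The sketch below works it through for two choices of queuing delay, because the text does not state which delay was plugged in: the quoted 256KBytes matches about one second of queuing at 2 Mbits/s (taking 1 Mbit as 2^20 bits, which is an assumption), whereas the full maximum minus minimum span from the ping summary (about 2.04 s) would give roughly twice that. The function name is illustrative.

```python
def buffer_bytes(queuing_delay_s, link_bits_per_s):
    """Bytes that drain in `queuing_delay_s` at the given link speed."""
    return queuing_delay_s * link_bits_per_s / 8


# Values taken from the ping summary above.
RTT_MIN_MS = 55.763
RTT_MAX_MS = 2099.976

# Case 1: about one second of queuing (the upper end of the flat region
# of the histogram), with 2 Mbits/s taken as 2 * 2**20 bits/s (assumption).
one_second = buffer_bytes(1.0, 2 * 2**20)

# Case 2: the full max - min span, with 2 Mbits/s taken as 2,000,000 bits/s.
full_span_s = (RTT_MAX_MS - RTT_MIN_MS) / 1000
full_span = buffer_bytes(full_span_s, 2_000_000)

print(f"1 s of queuing:    {one_second / 1024:.0f} KBytes")
print(f"max - min ({full_span_s:.2f} s): {full_span / 1024:.0f} KBytes")
```

Either way the buffering is of the order of hundreds of KBytes, which is very large for a 2 Mbits/s link and consistent with the long, loss-free periods of elevated RTT.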

Follow Up

We were fortunate to receive assistance from an operations manager at Nayatel and Micronet. Hop 2 in the traceroute is a Micronet Broadband Remote Access Server (BRAS) on which the DSL customer's [NUST/NIIT] PPPoE session is terminated. It is a Cisco 7206VXR.

Pings and a traceroute to the nasa.nexlinx.net.pk site from a Nayatel core router confirm that the long delays are not in the core network.

NYT-IBA-RTR7609#trace 202.59.80.52

Type escape sequence to abort.
Tracing the route to nasa.nexlinx.net.pk (202.59.80.52)

  1 58-65-175-221.nayatel.pk (58.65.175.221) 0 msec
    58-65-175-217.nayatel.pk (58.65.175.217) 4 msec
    58-65-175-221.nayatel.pk (58.65.175.221) 0 msec
  2 f0-gw3.nayatel.pk (58.65.166.250) 0 msec 0 msec
    f1-gw3.nayatel.pk (58.65.166.254) 0 msec
  3 rwp44.pie.net.pk (202.125.155.93) 0 msec 0 msec 0 msec
  4 rwp44.pie.net.pk (202.125.148.164) 4 msec 0 msec 4 msec
  5 pos1-3.lhe63gsrc2.pie.net.pk (202.125.159.39) [MPLS: Label 453 Exp 0] 4 msec 8 msec 4 msec
  6 lhr63.pie.net.pk (202.125.138.148) 8 msec 4 msec 8 msec
  7 lhr63.pie.net.pk (202.125.147.6) 8 msec 4 msec 8 msec
  8 nasa.nexlinx.net.pk (202.59.80.52) 8 msec 8 msec 8 msec

NYT-IBA-RTR7609#
NYT-IBA-RTR7609#ping
Protocol [ip]:
Target IP address: 202.59.80.52
Repeat count [5]: 100
Datagram size [100]:
Timeout in seconds [2]:
Extended commands [n]:
Sweep range of sizes [n]:
Type escape sequence to abort.
Sending 100, 100-byte ICMP Echos to 202.59.80.52, timeout is 2 seconds:
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Success rate is 100 percent (100/100), round-trip min/avg/max = 4/9/32 ms
NYT-IBA-RTR7609#
