...

Because the measurements from one MA to a target are not time-synchronized with those from another MA to the same target, the two MAs sample the network at different times. We therefore decided not to use the residuals in the RTTs between one pair and another. Typically the difference in measurement times between, say, pinger.slac.stanford.edu to sitka.triumf.ca and pinger-raspberry.slac.stanford.edu to sitka.triumf.ca averages about 8 minutes (see spreadsheet).

To find the probability that the distributions overlap, we can use the nomogram of mean differences versus error ratios given for normal distributions in "Overlapping Normal Distributions" by John M. Linacre. However, this does not cover the range we are interested in.

We therefore used the Z-test to compare two samples; see for example “Comparing distributions: Z test” available at http://homework.uoregon.edu/pub/class/es202/ztest.html
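
As a rough illustration of the comparison described above, the following minimal sketch computes a two-sample Z statistic and a two-sided p-value under a normal approximation. It assumes the usual form, in which the difference of the sample means is divided by the combined standard error of the two means; the function name `two_sample_z` is hypothetical and is not taken from the spreadsheets.

```python
import math
import statistics


def two_sample_z(a, b):
    """Return (z, two-sided p-value) for the difference of the means of a and b."""
    mean_a = statistics.fmean(a)
    mean_b = statistics.fmean(b)
    # Squared standard error of each mean: sample variance / sample size.
    se2_a = statistics.variance(a) / len(a)
    se2_b = statistics.variance(b) / len(b)
    z = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Two-sided tail probability of a standard normal beyond |z|.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p
```

A small |z| (large p) means the test sees no significant difference in the means; a large |z| (small p) means the means differ by much more than their combined uncertainty.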

Spreadsheet of min/avg/max loss between and from pinger & pi to & from sitka, and to cern, plus probabilities, IPDV.

Spreadsheet of probabilities

However, the ping distributions are decidedly non-normal (see for example the figure below): they have wide outliers and are heavy-tailed on the upper side (see https://www.slac.stanford.edu/comp/net/wan-mon/ping-hi-stat.html). This leads to large standard deviations in the RTT values (one to two orders of magnitude greater than the IQR). As can be seen from the table, this results in low values of the Z-test statistic and hence a misleading indication that there is no statistically significant difference. Using the IQRs of the frequency distributions instead generally leads to much higher values of the Z-test statistic, and hence a higher probability that the distributions of RTTs between two pairs of hosts are significantly different. Comparing the frequency distributions shows that there is indeed a marked offset in the RTT values of the peak frequencies and a resultant difference in the cumulative RTT distributions. The non-parametric Kolmogorov-Smirnov test (KS test) also indicated significant differences in the distributions.
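
The effect of heavy tails on the Z statistic can be sketched as follows. This is an illustration only: the RTT values are made up, and substituting the IQR directly for the standard deviation in the standard-error term is an assumption about how the IQR-based variant might be formed, not a description of the spreadsheet calculation. The names `iqr` and `z_statistic` are hypothetical.

```python
import math
import statistics


def iqr(sample):
    """Interquartile range: third quartile minus first quartile."""
    q1, _, q3 = statistics.quantiles(sample, n=4)
    return q3 - q1


def z_statistic(a, b, spread):
    """Z for the difference of means, using `spread` as each sample's width."""
    se2_a = spread(a) ** 2 / len(a)
    se2_b = spread(b) ** 2 / len(b)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(se2_a + se2_b)


# Made-up RTTs in ms: a tight cluster plus one large outlier (assumption).
host_a = [23.0, 23.1, 23.2, 23.3, 23.4, 23.5, 23.6, 23.7, 500.0]
# Second host: the same shape, offset by 0.4 ms.
host_b = [x + 0.4 for x in host_a]

z_std = z_statistic(host_a, host_b, statistics.stdev)  # outlier inflates the spread
z_iqr = z_statistic(host_a, host_b, iqr)               # outlier barely affects the IQR
```

With data of this shape, the single outlier dominates the standard deviation, so `z_std` stays close to zero, while `z_iqr` is much larger in magnitude because the IQR reflects only the width of the central cluster.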

It is interesting to note that measurements made from pinger.SLAC and pinger-raspberry.SLAC to both TRIUMF and CERN show that the average and median RTTs differ between the two MAs by about 0.4 ms, even though the RTTs to TRIUMF are about 23 ms compared with about 151 ms to CERN. Traceroutes made with Matt's traceroute, which measures the RTT to each hop, indicate that this difference starts at the first hop and persists for later hops. We therefore made ping measurements from each SLAC MA to its own network services via its loopback network interface.

Powerpoint of figures.

Kolmogorov-Smirnov Test

...

	Raw data - 100 Packets	Distribution - 100 Packets	Raw data - 1000 Packets	Distribution - 1000 Packets
D-stat	0.194674	0.039323	0.205525	0.194379
P-value	4.57E-14	0.551089	2.07E-14	7.32E-14
D-crit	0.0667	0.067051	0.0667	0.067051
Size of Raspberry	816	816	816	816
Size of Pinger	822	822	822	822
Alpha	0.0505i

If D-stat is greater than D-crit, the samples are considered not to come from the same distribution, at a confidence level of (1 - Alpha). Recall that D-stat is the maximum difference between the two cumulative frequency curves.
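
The decision rule above can be sketched in a few lines. `ks_statistic` computes D-stat as the largest gap between the two empirical cumulative distributions. `ks_critical` uses the common large-sample approximation for the two-sample critical value, which is an assumption here: the tool used to produce the table may compute D-crit differently. Both function names are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Maximum absolute gap between the empirical CDFs of samples a and b."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both CDFs past every observation equal to x.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical(n, m, alpha):
    """Large-sample approximation: c(alpha) * sqrt((n + m) / (n * m))."""
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    return c * math.sqrt((n + m) / (n * m))
```

Assuming Alpha = 0.05 and the sample sizes in the table (816 and 822), `ks_critical(816, 822, 0.05)` gives roughly 0.067, which is close to the D-crit values listed.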

...
