
Motivation

The PingER data explorer is used primarily to visualize trends in Internet end-to-end performance statistics measured for over 200 countries from more than 90 active PingER monitoring hosts. With data gathered since 1998, the explorer graphs let users study trends, step changes, and significant improvements or degradations.

The data can be viewed in four different ways: Motion Bubble Chart, Motion Map, Motion Histogram and Line Chart.

The different sets of charts (by region) enable users to compare the progress made by a country with that of its neighbours and of the world in general.

Below are some examples of the Motion bubble chart:

Default view

Highlighted countries with bubble size showing internet users

Usage

By default, the PingER data explorer shows a bubble chart of throughput against average_rtt, both on a log scale. Each bubble represents a country: its size shows the country's population and its color shows the country's region. Motion represents time, at a granularity of one year. Click the play button to start the motion.

  • On the left, there are two panels: PingER Visual Landscape and Compare by Country.
  • PingER Visual Landscape shows the metrics that are measured by PingER and that can be compared with other metrics or indices of the world.
  • Click a metric name and a sub-heading of the metric appears. Click the sub-heading to open a menu that lets you display the metric on the X axis or Y axis, color the bubbles by the metric, or size the bubbles by the metric.
  • Other metrics that are available include minimum_rtt, ipdv, packet_loss, internet users, population, nthroughput (normalized throughput), MOS, Ping unreachability, Corruption Perception Index (CPI), Digital Opportunity Index (DOI), Human Development Index (HDI), ICT Development Index (IDI) and internet penetration index.
  • Open the chart options (top right) to choose a log scale or a linear scale for the X and Y axes. The drop-down menu that appears also lets you select default colors, set the same size for all bubbles, and show the trails.
  • The lower left panel allows you to compare countries within a selected region or with all the countries of the world.
  • In the lower left panel, below Compare by Country and next to the regions, there is a drop-down menu of colors. It lets you choose whether countries are compared within a region or across all regions, and lets you color the bubbles by region. By default, all the countries of the world are compared.
  • Move the cursor over a bubble to see the values for that bubble.
  • Move the cursor over a colored legend symbol to make the bubbles of that color blink.
  • To follow the changes in detail, move the slider bar with the mouse.
  • Click a bubble to select it and give it a label that persists through the motion. The trails check box leaves a trail that follows the motion of the selected bubbles.
  • Click the link at the top right of the page to select the vantage point, the observed region and the granularity of the measurements.
  • Only one axis (or metric) can be chosen for the Motion Map, Motion Histogram and Line Chart. The Line Chart has no motion feature, so countries of interest must be selected in order to plot the metric against years.

Implementation Details

HTML Output

To create a motion chart, the data must be populated in a data structure (as specified by the Google Public Data Explorer and copied below; the complete dataset can be found here):

The data must comply with the formatting requirements of the Google Public Data Explorer, listed below:

  • The column headings in the first line of the data file must exactly match the concept id and the property id of the concept with which the data is associated (though order may vary).
  • Each row must have exactly the same number of elements as the number of properties on the concept (even if the value is empty).
  • Each value for the concept's id field (here, the country code) must be unique and non-empty (an empty field is one with zero or only whitespace characters).
  • Values for properties that reference other concepts must either be empty or be a valid value of the referenced concept.
  • Values that contain the comma character must be represented without comma; for example 23,400 must be represented as 23400.
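The rules above can be checked before uploading. The following is a minimal Python sketch, not part of the PingER scripts (which are written in Perl) and not a Google tool. The function name validate_concept_csv, its parameters and the example column ids are assumptions made for illustration only.

```python
import csv
import io


def validate_concept_csv(text, expected_columns, id_column, references=None):
    """Return a list of human-readable problems found in a concept CSV.

    expected_columns: column ids the header must contain (any order).
    id_column: the concept's id column (for example, a country code).
    references: optional {column: set of valid values} for columns that
        point at another concept.
    """
    references = references or {}
    problems = []
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    header = rows[0]
    # Rule 1: header ids must match exactly, though order may vary.
    if sorted(header) != sorted(expected_columns):
        problems.append(f"header {header} does not match {list(expected_columns)}")
        return problems

    seen_ids = set()
    for line_no, row in enumerate(rows[1:], start=2):
        # Rule 2: every row has one element per property, even if empty.
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, got {len(row)}"
            )
            continue
        record = dict(zip(header, row))

        # Rule 3: the id must be non-empty (not just whitespace) and unique.
        key = record[id_column].strip()
        if not key:
            problems.append(f"line {line_no}: empty {id_column}")
        elif key in seen_ids:
            problems.append(f"line {line_no}: duplicate {id_column} {key!r}")
        seen_ids.add(key)

        for column, value in record.items():
            # Rule 4: a reference is either empty or a known value.
            valid = references.get(column)
            if valid is not None and value.strip() and value not in valid:
                problems.append(f"line {line_no}: unknown {column} {value!r}")
            # Rule 5: no commas inside values (23,400 must be 23400).
            if "," in value:
                problems.append(
                    f"line {line_no}: comma in {column} value {value!r}"
                )
    return problems
```

An empty list means that none of the five rules was violated. The check covers only the rules listed above and does not replace the error check that runs when the dataset is uploaded.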

The complete folder that was uploaded to the Google Public Data Explorer is available here.

Relevant Files

The scripts and files are placed at /afs/slac.stanford.edu/package/pinger/explorer. Two scripts must be run to generate the data file in the format required by the Google Public Data Explorer. The first script is generate-metric-files-for-explorer.pl. It takes metric values from the prmout folder (http://www-iepm.slac.stanford.edu/pinger/prmout/, AKA /afs/slac/g/www/www-iepm/pinger//prmout) and transposes the data so that years increase vertically (down the rows) rather than horizontally (across the columns). To find the latest relevant files, you can use ls -lt /afs/slac/g/www/www-iepm/pinger/prmout/ | grep SLAC | grep allyear | more, as shown below.

ls -lt /afs/slac/g/www/www-iepm/pinger//prmout/ | grep SLAC | grep allyear | more
-rw-rw-rw- 1 cottrell sf 17108 Jan 4 09:26 unreachability-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 cottrell sf 18886 Jan 4 09:23 ipdv-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 cottrell sf 18524 Jan 4 09:20 packet_loss-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 cottrell sf 21477 Jan 4 09:16 minimum_rtt-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 cottrell sf 18099 Jan 4 09:13 alpha-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 cottrell sf 15112 Jan 4 09:06 MOS-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 cottrell sf 30900 Jan 4 08:48 nthroughput-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 cottrell sf 21426 Jan 4 08:30 throughput-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 cottrell sf 21561 Jan 4 08:27 average_rtt-EDU.SLAC.STANFORD.N3-country-allyearly.csv
-rw-rw-rw- 1 pinger sf 3359 Jan 4 08:08 nthroughput-EDU.SLAC.STANFORD.N3-continent-allyearly.csv
-rw-rw-rw- 1 pinger sf 2190 Jan 4 08:02 minimum_rtt-EDU.SLAC.STANFORD.N3-continent-allyearly.csv
-rw-rw-rw- 1 pinger sf 1837 Jan 4 07:58 ipdv-EDU.SLAC.STANFORD.N3-continent-allyearly.csv
-rw-rw-rw- 1 pinger sf 2196 Jan 4 07:53 average_rtt-EDU.SLAC.STANFORD.N3-continent-allyearly.csv
-rw-rw-rw- 1 pinger sf 1253 Jan 4 07:49 MOS-EDU.SLAC.STANFORD.N3-continent-allyearly.csv
-rw-rw-rw- 1 pinger sf 1754 Jan 4 07:43 unreachability-EDU.SLAC.STANFORD.N3-continent-allyearly.csv
-rw-rw-rw- 1 pinger sf 2225 Jan 4 07:39 throughput-EDU.SLAC.STANFORD.N3-continent-allyearly.csv
-rw-rw-rw- 1 pinger sf 1824 Jan 4 07:34 packet_loss-EDU.SLAC.STANFORD.N3-continent-allyearly.csv
-rw-rw-rw- 1 pinger sf 1485 Jan 4 2014 alpha-EDU.SLAC.STANFORD.N3-continent-allyearly.csv

The above files are updated using the command:

 /afs/slac/package/pinger/analysis/wrap-analyze-allyears.pl --basedir /nfs/slac/g/net/pinger --usemetric --dataset hep  --set_metric 4

This command is run quarterly from a trscrontab file, so you should not need to do anything.
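The transposition performed on each prmout file, turning years that run across the columns into years that run down the rows, can be sketched as follows. This is an illustrative Python sketch, not the Perl code of generate-metric-files-for-explorer.pl. It assumes that the first column of the input holds the country and every other column holds one year; the real prmout layout may differ. The function name transpose_years and the output column names are hypothetical.

```python
import csv
import io


def transpose_years(wide_text, metric):
    """Turn rows of (country, one value per year) into rows of (country, year, value)."""
    reader = csv.reader(io.StringIO(wide_text))
    header = next(reader)
    years = header[1:]  # assumed: first column is the country, the rest are years
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["country", "year", metric])
    for row in reader:
        if not row:
            continue  # skip blank lines
        country, values = row[0], row[1:]
        for year, value in zip(years, values):
            # Empty values are kept so that every year still has a row.
            writer.writerow([country, year, value.strip()])
    return out.getvalue()
```

With this layout, each country contributes one output row per year, which is the shape a time-based motion chart needs.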

The files generated by generate-metric-files-for-explorer.pl  are:

cottrell@rhel6-64i:~$ ls -lt /afs/slac.stanford.edu/package/pinger/explorer
total 639
-rw-rw-r-- 1 amberzeb sg 36293 Jan 4 09:37 MOS.csv
-rw-rw-r-- 1 amberzeb sg 42913 Jan 4 09:37 average_rtt.csv
-rw-rw-r-- 1 amberzeb sg 40238 Jan 4 09:37 ipdv.csv
-rw-rw-r-- 1 amberzeb sg 42829 Jan 4 09:37 minimum_rtt.csv
-rw-rw-r-- 1 amberzeb sg 51327 Jan 4 09:37 nthroughput.csv
-rw-rw-r-- 1 amberzeb sg 39876 Jan 4 09:37 packet_loss.csv
-rw-rw-r-- 1 amberzeb sg 40887 Jan 4 09:37 throughput.csv
-rw-rw-r-- 1 amberzeb sg 35982 Jan 4 09:37 unreachability.csv

These files are then given as input to the script generate-alldata-for-pinger-data-explorer.pl, which writes the data for all the metrics to a single file named file.csv. This file is in the format required by the Google Public Data Explorer.

The metric files placed in prmout, for example average_rtt-EDU.SLAC.STANFORD.N3-country-allyearly.csv, have the data in the format shown below:

The script generate-metric-files-for-explorer.pl converts the metric files into the format below:

With all the metric files in the above format, a file has to be generated that contains the data for all metrics together, as below:

This is the format required by the Google Public Data Explorer. The file in the above format is generated by running the script generate-alldata-file-for-pinger-data-explorer.pl, which takes the files generated by generate-metric-files-for-explorer.pl as input and outputs a file named 'file.csv' containing all the metric data.
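The merge of the per-metric files into a single file can be sketched as follows. This is an illustrative Python sketch rather than the Perl script itself. It assumes that each per-metric file has the columns country, year and the metric name; those column names and the function name merge_metrics are hypothetical.

```python
import csv
import io


def merge_metrics(metric_texts):
    """Merge {metric: CSV text} into one CSV keyed by country and year."""
    merged = {}  # (country, year) -> {metric: value}
    for metric, text in metric_texts.items():
        for record in csv.DictReader(io.StringIO(text)):
            key = (record["country"], record["year"])
            merged.setdefault(key, {})[metric] = record[metric]

    metrics = list(metric_texts)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["country", "year"] + metrics)
    for (country, year), values in sorted(merged.items()):
        # Missing metrics stay empty so every row has the same number of fields.
        writer.writerow([country, year] + [values.get(m, "") for m in metrics])
    return out.getvalue()
```

Leaving a missing metric empty, rather than dropping the row, keeps the number of fields in every row equal to the number of columns in the header.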

It is important to note that the CPI, IDI, HDI, DOI and Internet Penetration data have been taken from the Motion chart data files (demographics.csv) placed at /afs/slac.stanford.edu/package/pinger/motion-chart.

Implementation

The implementation algorithm is as follows:

Updating Data Files

The files have to be generated and uploaded manually; no crontab runs these scripts to generate the files and the folder to be uploaded. The files need to be updated each year. Once the dataset is uploaded, the Data Explorer checks it for errors. If there are none, you can preview the data and, once it is satisfactory, publish the dataset.

Miscellaneous Details

  • Tool: The interactive graph was generated using the Google Public Data Explorer.
  • Data: The data presented here was collected by the PingER project and processed by prmout. The same data is available in tabular format. The population and internet user statistics were acquired from the World Bank; the country to region mapping was obtained from the geographical database maintained by the PingER project; the Corruption Perception Index (CPI) comes from Transparency International and was extracted from Wikipedia; the Digital Opportunity Index (DOI) was obtained from ITU's ICT Statistics; the Human Development Index (HDI) was obtained from the UNDP Human Development Reports; and the ICT Development Index (IDI) was obtained from ITU Reports.
  • Please note that while all the statistics were acquired over several years (i.e. since 1998), the Internet usage statistics were documented in Nov. 2007.
  • Loading: The time to load and render the web page is largely determined by the speed of the link, the file length (about a MByte, determined by the number of metrics and the frequency of data points) and the speed of the client rendering.
  • Metrics: Average RTT (ms), Normalized Throughput (Kbps), Throughput (Kbps), Internet Users (#), Population (#), Minimum RTT (ms), Packet Loss (%), Unreachability (%), IPDV (ms), MOS, IDI, HDI, DOI, CPI and Internet penetration Index.
  • Account: The data for the PingER Data Explorer is now held under pinger.slac@gmail.com (as of May 2016).
  • Authors: Faisal Zahid & Amber Zeb 29/8/2011. Idea champion: Faisal Zahid.

Problems

For some countries a star or asterisk appears instead of a bubble. This happens when no data is available for the metric that sets the bubble size. For example, if there is no population data for Chile but there is data for its other metrics, the bubble is replaced by an asterisk and the metric values are still shown in motion as normal.
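Rows that would be drawn as an asterisk can be listed before uploading. The following is a minimal Python sketch; it assumes a merged CSV with country and year columns plus one column per metric. The function name rows_missing_size and the column names are hypothetical.

```python
import csv
import io


def rows_missing_size(text, size_column, metric_columns):
    """Return (country, year) pairs that have metric data but no bubble-size value."""
    missing = []
    for record in csv.DictReader(io.StringIO(text)):
        has_metric = any(record[c].strip() for c in metric_columns)
        if has_metric and not record[size_column].strip():
            missing.append((record["country"], record["year"]))
    return missing
```

Rows with neither a size value nor any metric value are ignored, since they would not be plotted at all.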
