This page describes how the data collected by the SLAC PingER site ends up as information used by pingtable, motion charts, intensity maps and several other applications. Various scripts are run in turn to generate the data. The scripts are driven by a trscrontab that executes on pinger@pinger.slac.stanford.edu.
Stores raw data on NFS at:
/nfs/slac/g/net/pinger/pinger_mon_data/ping-<YYYY>-<MM>.txt
At SLAC, stores raw data on NFS at:
/nfs/slac/g/net/pinger/pinger2/data/ping-<YYYY>-<MM>.txt
e.g. /nfs/slac/g/net/pinger/pinger2/data/ping-2011-02.txt
Since 2004, old files are stored gzipped at:
/nfs/slac/g/net/pinger/pingerdata/hep/data/<host>/ping-<YYYY>-<MM>-<DD>.txt.gz
e.g.
/nfs/slac/g/net/pinger/pingerdata/hep/data/pinger.slac.stanford.edu/ping-2011-03-22.txt.gz
/nfs/slac/g/net/pinger/pingerdata/hep/data/pcgiga.cern.ch/ping-2006-09-28.txt.gz
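As an illustration only, the archive layout above can be turned into a small Python helper. The names archive_path and read_archive are hypothetical and are not part of the PingER package. The sketch assumes only the directory pattern shown above and that each file is gzip-compressed text.

```python
import gzip
from datetime import date
from pathlib import Path

# Base directory taken from the path pattern above.
ARCHIVE_ROOT = Path("/nfs/slac/g/net/pinger/pingerdata/hep/data")


def archive_path(host: str, day: date, root: Path = ARCHIVE_ROOT) -> Path:
    """Return <root>/<host>/ping-<YYYY>-<MM>-<DD>.txt.gz for one monitor and day."""
    return root / host / f"ping-{day:%Y-%m-%d}.txt.gz"


def read_archive(path: Path) -> list:
    """Read one gzipped daily file and return its non-empty lines."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n") for line in handle if line.strip()]
```

For example, archive_path("pcgiga.cern.ch", date(2006, 9, 28)) yields the second example path shown above.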
Data from 1997-2003 can be found in /nfs/slac/g/net/pinger/pingerdata/. The files are monthly tar archives of gzipped files and contain the gathered data for 100 and 1000 byte pings for each day for all monitors.
157cottrell@pinger:~$ ls -l /nfs/slac/g/net/pinger/pingerdata/
total 11
drwxrwsr-x 2 6995 iepm 512 Dec 2 2011 1997/
drwxrwsr-x 2 6995 iepm 1024 Jan 25 2005 1998/
drwxrwsr-x 2 6995 iepm 1024 Jan 25 2005 1999/
drwxrwsr-x 2 6995 iepm 1024 Aug 3 2007 2000/
drwxrwsr-x 2 6995 iepm 512 Jan 25 2005 2001/
drwxrwsr-x 2 6995 iepm 512 Jan 25 2005 2002/
drwxrwsr-x 38 6995 iepm 1536 Feb 8 2007 2003/
drwxrwsr-x 6 iepm iepm 512 Jul 20 15:19 hep/
drwxrwsr-x 3 pinger iepm 512 Mar 8 2012 new/
drwxrwsr-x 4 6995 iepm 512 Jan 25 2005 oldftp/
158cottrell@pinger:~$ ls -l /nfs/slac/g/net/pinger/pingerdata/2002/
total 3844288
-rw-r--r-- 1 6995 iepm 349112320 Jan 25 2005 data-2002-01.tar
-rw-r--r-- 1 6995 iepm 316467200 Jan 25 2005 data-2002-02.tar
-rw-r--r-- 1 6995 iepm 332656640 Jan 25 2005 data-2002-03.tar
-rw-r--r-- 1 6995 iepm 326103040 Jan 25 2005 data-2002-04.tar
-rw-r--r-- 1 6995 iepm 347064320 Jan 25 2005 data-2002-05.tar
-rw-r--r-- 1 6995 iepm 324648960 Jan 25 2005 data-2002-06.tar
-rw-r--r-- 1 6995 iepm 319150080 Jan 25 2005 data-2002-07.tar
-rw-r--r-- 1 6995 iepm 320245760 Jan 25 2005 data-2002-08.tar
-rw-r--r-- 1 6995 iepm 336117760 Jan 25 2005 data-2002-09.tar
-rw-r--r-- 1 6995 iepm 335669760 Jan 25 2005 data-2002-10.tar
-rw-r--r-- 1 6995 iepm 303858176 Jan 25 2005 data-2002-11.tar
-rw-r--r-- 1 6995 iepm 323323904 Jan 25 2005 data-2002-12.tar
168cottrell@pinger:~$ cp /nfs/slac/g/net/pinger/pingerdata/2002/data-2002-01.tar /nfs/slac/g/net/pinger/pingerdata/hep/data.unite/
$ cd /nfs/slac/g/net/pinger/pingerdata/hep/data.unite/
$ tar -xvf /nfs/slac/g/net/pinger/pingerdata/hep/data.unite/data-2002-01.tar
cache01.ansp.br/ping-2002-01-01.txt.gz
cache01.ansp.br/ping-2002-01-02.txt.gz
...
yumj2.kek.jp/ping-2002-01-30.txt.gz
yumj2.kek.jp/ping-2002-01-31.txt.gz
182cottrell@pinger:/nfs/slac/g/net/pinger/pingerdata/hep/data.unite$ ls
172.23.52.7/ monitor.seecs.edu.pk/ pinger.cdacmumbai.in/ pingerlhr-pu.pern.edu.pk/
aup.seecs.edu.pk/ moore.ece.rice.edu/ pinger.cemb.edu.pk/ pingerpwr.pern.edu.pk/
...
177cottrell@pinger:/nfs/slac/g/net/pinger/pingerdata/hep/data.unite$ cp yumj2.kek.jp/ping-2002-01-31.txt.gz /tmp/
178cottrell@pinger:/nfs/slac/g/net/pinger/pingerdata/hep/data.unite$ gunzip /tmp/ping-2002-01-31.txt.gz
180cottrell@pinger:/nfs/slac/g/net/pinger/pingerdata/hep/data.unite$ tail /tmp/ping-2002-01-31.txt
yumj2.kek.jp 130.87.34.37 ultra.edu.uy 164.73.128.70 100 1012520610 10 10 379.815 381.366 384.206 0 1 2 3 4 5 6 7 8 9 380.758 380.938 382.205 380.480 380.878 384.206 379.815 381.087 382.547 380.743
yumj2.kek.jp 130.87.34.37 frcu.eun.eg 193.227.1.1 100 1012520610 10 10 364.870 399.626 446.021 0 1 2 3 4 5 6 7 8 9 398.877 395.008 388.069 396.147 416.314 392.862 398.349 399.744 364.870 446.021
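The raw records printed by tail above can be read with a short Python parser. This is a sketch under an assumed field layout, inferred from the sample records rather than stated on this page: monitor name and address, remote name and address, packet size in bytes, Unix timestamp, packets sent, packets received, minimum, average and maximum RTT, then one sequence number and one RTT per received packet. The names PingRecord and parse_ping_record are hypothetical. Records with no returned packets are not handled because their layout is not shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# First record from the tail output above.
SAMPLE = (
    "yumj2.kek.jp 130.87.34.37 ultra.edu.uy 164.73.128.70 100 1012520610 10 10 "
    "379.815 381.366 384.206 0 1 2 3 4 5 6 7 8 9 "
    "380.758 380.938 382.205 380.480 380.878 384.206 379.815 381.087 382.547 380.743"
)


@dataclass
class PingRecord:
    monitor: str
    monitor_ip: str
    remote: str
    remote_ip: str
    size_bytes: int
    timestamp: datetime
    sent: int
    received: int
    min_rtt: float
    avg_rtt: float
    max_rtt: float
    sequence: list
    rtts: list


def parse_ping_record(line: str) -> PingRecord:
    """Parse one raw record, assuming the field layout described in the lead-in."""
    tokens = line.split()
    if len(tokens) < 8:
        raise ValueError("record has fewer than 8 fields")
    size_bytes, epoch, sent, received = (int(t) for t in tokens[4:8])
    rest = tokens[8:]
    # Three summary values, then a sequence number and an RTT per received packet.
    if received < 1 or len(rest) != 3 + 2 * received:
        raise ValueError("unexpected record layout")
    min_rtt, avg_rtt, max_rtt = (float(t) for t in rest[:3])
    return PingRecord(
        monitor=tokens[0],
        monitor_ip=tokens[1],
        remote=tokens[2],
        remote_ip=tokens[3],
        size_bytes=size_bytes,
        timestamp=datetime.fromtimestamp(epoch, tz=timezone.utc),
        sent=sent,
        received=received,
        min_rtt=min_rtt,
        avg_rtt=avg_rtt,
        max_rtt=max_rtt,
        sequence=[int(t) for t in rest[3:3 + received]],
        rtts=[float(t) for t in rest[3 + received:]],
    )
```

The summary values in the sample agree with the individual RTTs (their minimum, mean and maximum), which is the basis for the assumed layout.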
Note that after the above, the raw input data to wrap-analyze-hourly.pl (see below) for 1998..2003 comes from /nfs/slac/g/net/pinger/pingerdata/hep/data.unite/ rather than /nfs/slac/g/net/pinger/pingerdata/hep/data/.
There is a script, /afs/slac/package/pinger/pre2004.pl, that takes the data in /nfs/slac/g/net/pinger/pingerdata/<1998..2003>, copies it, and unzips and untars it into /nfs/slac/g/net/pinger/pingerdata/hep/data.unite/.
A second script, /afs/slac/package/pinger/pre2004-hourly.pl, uses /afs/slac/package/pinger/analysis/analyze-all.pl to read the raw data from /nfs/slac/g/net/pinger/pingerdata/hep/data.unite/, analyze it, and store the hourly data for the selected years.
The first script to be executed is wrap-analyze-hourly.pl, which is run by calling the wrapper analyze-all.pl --date 1days from the trscrontab job. It takes as input the output of getdata.pl, aggregates it by hour for each day, and writes the latest results to the /nfs/slac/g/net/pinger/pingerreports/hep/<metric>/ directory with the file name <metric>-<size>-<by>-<yyyy>-<mm>-<dd>.txt.gz. The analyze-hourly.pl script is run daily from the trscrontab on pinger and by default analyzes the data gathered for yesterday.
Example output filename for the minimum_rtt metric:
/nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2006-09-28.txt.gz
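To make the naming convention concrete, the following Python sketch splits a daily report file name such as the example above into its parts. The regular expression and the name parse_report_name are illustrative assumptions based on the example file names on this page. They are not taken from the analysis scripts.

```python
import re
from datetime import date

# Pattern inferred from names such as minimum_rtt-100-by-node-2006-09-28.txt.gz
REPORT_NAME = re.compile(
    r"^(?P<metric>[A-Za-z_]+)-(?P<size>\d+)-by-(?P<by>site|node)"
    r"-(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})\.txt\.gz$"
)


def parse_report_name(filename: str) -> dict:
    """Split a daily report file name into metric, ping size, grouping and date."""
    match = REPORT_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a daily report file name: {filename!r}")
    return {
        "metric": match["metric"],
        "size": int(match["size"]),
        "by": "by-" + match["by"],
        "date": date(int(match["year"]), int(match["month"]), int(match["day"])),
    }
```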
By default the above file is created once, so the directory appears as:
57cottrell@pinger:~> ls -l /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05*
-rw-rw-r-- 1 pinger iepm 492144 May 2 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-01.txt.gz
-rw-rw-r-- 1 pinger iepm 545968 May 3 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-02.txt.gz
-rw-rw-r-- 1 pinger iepm 561661 May 4 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-03.txt.gz
-rw-rw-r-- 1 pinger iepm 566550 May 5 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-04.txt.gz
-rw-rw-r-- 1 pinger iepm 537127 May 6 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-05.txt.gz
-rw-rw-r-- 1 pinger iepm 538830 May 7 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-06.txt.gz
-rw-rw-r-- 1 pinger iepm 488360 May 8 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-07.txt.gz
-rw-rw-r-- 1 pinger iepm 499020 May 9 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-08.txt.gz
-rw-rw-r-- 1 pinger iepm 563840 May 10 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-09.txt.gz
-rw-rw-r-- 1 pinger iepm 583454 May 11 02:17 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-10.txt.gz
-rw-rw-r-- 1 cottrell iepm 577949 May 12 22:08 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-11.txt.gz
-rw-rw-r-- 1 cottrell iepm 102 May 12 17:25 /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-100-by-node-2011-05-12.txt.gz
Example output format: after the first line of the file there is one line like the following per day per host pair. Between the initial and final src_name and tgt_name tokens there are 24 tokens, one for each hour of the day. Missing data is identified by a dot followed by a space (. ), e.g.:
icfamon.dl.ac.uk lns62.lns.cornell.edu 108.871 . . . . . . . . 108.892 . . . . . . . . . . . . . 109.620 icfamon.dl.ac.uk lns62.lns.cornell.edu
The first line in the file contains a label for each of the time slots (e.g. hours):
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
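The hourly file format described above is simple to read programmatically. The Python sketch below parses one data line into the host pair and 24 hourly values, mapping the missing-data dot to None. The function name parse_hourly_line is hypothetical. The sketch relies only on the layout described above: the pair, 24 slot tokens, then the pair repeated.

```python
# Data line copied from the example above.
SAMPLE_LINE = (
    "icfamon.dl.ac.uk lns62.lns.cornell.edu 108.871 . . . . . . . . 108.892 "
    ". . . . . . . . . . . . . 109.620 icfamon.dl.ac.uk lns62.lns.cornell.edu"
)


def parse_hourly_line(line: str, slots: int = 24):
    """Return (src_name, tgt_name, values) where values holds one entry per slot.

    Missing slots, written as "." in the file, are returned as None.
    """
    tokens = line.split()
    if len(tokens) != slots + 4:
        raise ValueError(f"expected {slots + 4} tokens, found {len(tokens)}")
    src, tgt = tokens[0], tokens[1]
    # The host pair is repeated at the end of the line, which gives a cheap sanity check.
    if tokens[-2:] != [src, tgt]:
        raise ValueError("trailing host pair does not match leading host pair")
    values = [None if tok == "." else float(tok) for tok in tokens[2:-2]]
    return src, tgt, values
```

With the example line, the values for hours 0, 9 and 23 are present and the other 21 slots are None.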
The remaining analyze scripts (wrap-analyze-daily.pl, wrap-analyze-monthly.pl, wrap-analyze-allmonths.pl, and wrap-analyze-allyears.pl) take as input the data from wrap-analyze-hourly.pl, wrap-analyze-daily.pl, and wrap-analyze-allmonths.pl and create files of the form:
/nfs/slac/g/net/pinger/pingerreports/hep/<metric>-<size>-by-<site|node>(-<YYYY>?)(-<mm>?)(-<dd>?).txt.gz
/nfs/slac/g/net/pinger/pingerreports/hep/<metric>-<size>-by-<site|node>-<60|120|365>days.txt.gz
/nfs/slac/g/net/pinger/pingerreports/hep/<metric>-<size>-by-<site|node>-<allmonths|allyears>.txt.gz
There are ~ 16 metrics:
MOS: Mean Opinion Score
alpha: Directivity
average_rtt: Average Round Trip Time (selected by default)
conditional_loss_probability: Conditional Loss Probability
duplicate_packets: Duplicate Packets
ipdv: Inter-Packet Delay Variation
iqr: Inter-Quartile Range
maximum_rtt: Maximum Round Trip Time
minimum_packet_loss: Minimum Packet Loss
minimum_rtt: Minimum Round Trip Time
out_of_order_packets: Out of Order Packets
packet_loss: Packet Loss
throughput: TCP Throughput (kbits/s)
unpredictability: Ping Unpredictability
unreachability: Ping Unreachability
zero_packet_loss_frequency: Zero Packet Loss Frequency
Information on these can be found at http://www.slac.stanford.edu/comp/net/wan-mon/tutorial.html.
The script requires a configuration file, which contains entries for all the reports prm should create. These entries are of the form:
<metric name> <monitoring site> <country||continent> <tick> (<filter>?)
/nfs/slac/g/net/pinger/pinger_mon_data/ping-<YYYY>-<MM>.txt contains data from 2005-2010.
/nfs/slac/g/net/pinger/pinger2/data/ping-<YYYY>-<MM>.txt contains data from 2009-2012.
/nfs/slac/g/net/pinger/pingerdata/ contains data from 1997-2003.
/nfs/slac/g/net/pinger/pingerdata/hep/data/ contains data from 1997-2007.
See here
The total PingER data volume is about 550 GBytes.
We estimate that there are about 60 GBytes of uncompressed hourly data for 100 byte pings by node, as of September 2014. We estimate that this would roughly quadruple if the 1000 byte pings and the by-site data were added. See Volume of PingER data Sep 2014.
See Archiving PingER data by tar for retrieval by anonymous ftp
We have spotted anomalies between the values reported by:
They are discussed and explained in the Anomalies report.
I get the error message below:
Your "cron" job
/afs/slac/package/pinger/analysis/wrap-analyze-daily.pl --basedir /nfs/slac/g/net/pinger --usemetric --dataset hep --by by-site --size 1000
produced the following output:
Thu Dec 3 05:00:02 2015 Warning /nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt/minimum_rtt-1000-by-site-2015-12-01.txt.gz does not exist
The trscrontab job analyze-all.pl --date 1days only reads and analyzes the most recent day's raw data. If the job fails to run, data for that day will be missing on the following days and you will get the above message. To recover the missing daily data, run analyze-all.pl --date 2015-12-01 for the missing date (in this example 2015-12-01).
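When several days may be affected, it can help to list the missing daily files before rerunning the analysis. The Python sketch below is a hypothetical helper, not part of the PingER package. It assumes the daily file naming shown in the warning above and only prints the suggested analyze-all.pl commands without running them.

```python
from datetime import date, timedelta
from pathlib import Path


def missing_days(report_dir, metric, size, by, first, last):
    """Return the dates in [first, last] with no <metric>-<size>-<by>-<date>.txt.gz file."""
    missing = []
    day = first
    while day <= last:
        name = f"{metric}-{size}-{by}-{day:%Y-%m-%d}.txt.gz"
        if not (Path(report_dir) / name).exists():
            missing.append(day)
        day += timedelta(days=1)
    return missing


if __name__ == "__main__":
    # Directory and parameters taken from the warning message above.
    directory = "/nfs/slac/g/net/pinger/pingerreports/hep/minimum_rtt"
    gaps = missing_days(directory, "minimum_rtt", 1000, "by-site",
                        date(2015, 11, 25), date(2015, 12, 2))
    for gap in gaps:
        # Print the recovery command described above; it is not executed here.
        print(f"analyze-all.pl --date {gap:%Y-%m-%d}")
```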
On 3/9/2012, we asked unix-admin@slac.stanford.edu to back up /nfs/slac/g/net/pinger/ on a regular basis. Andrew May added it to the nightly backup.