Format of data etc

See PingER data flow at SLAC

Saving Analyzed data

The most likely data to be of use to others is the analyzed/aggregated data. This is kept in

/nfs/slac/g/net/pinger/pingerreports/hep/

There is a lot of data, so you will need a lot of space. I suggest you use the /tmp directory as intermediate storage and do one metric at a time (e.g. average_rtt), one size (e.g. 100) and one grouping (e.g. by-node). If this does not work, send an email to unix-admin requesting temporary space in /afs/slac/public/users/cottrell (this is accessible from anonymous FTP). Note you will need space for the copied directory and for the tar and zipped files. I requested 100 GBytes.

For 1 metric (average_rtt)

$mkdir /afs/slac/public/users/cottrell/average_rtt-100-by-node

$cp -v /nfs/slac/g/net/pinger/pingerreports/hep/average_rtt/average_rtt-100-by-node* /afs/slac/public/users/cottrell/average_rtt-100-by-node
#There are about 6500 files per metric. Copy takes about 20 mins per metric. 

$tar -cvzf /afs/slac/public/users/cottrell/archive-average_rtt-100-by-node.tar /afs/slac/public/users/cottrell/average_rtt-100-by-node
#A metric takes about 6 minutes to tar and compress, and each tar file occupies ~1.5 GBytes.

For all metrics with 100-byte pings, by node

 $mkdir /afs/slac/public/users/cottrell/metrics-100-by-node

$cp -v /nfs/slac/g/net/pinger/pingerreports/hep/*/*-100-by-node* /afs/slac/public/users/cottrell/metrics-100-by-node

$tar -cvzf /afs/slac/public/users/cottrell/archive-metrics-100-by-node.tar /afs/slac/public/users/cottrell/metrics-100-by-node

However:

 $ls /nfs/slac/g/net/pinger/pingerreports/hep/*/*-100-by-node*

/bin/ls: Argument list too long.
Exit 1
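The error arises because the shell expands the wildcard into more arguments than the system allows for a single command. A minimal sketch of one way round it is to let find do the pattern matching and run cp once per file. This is not what pinger-tar.pl does (its internals are not shown on this page); the helper name copy_matching is hypothetical.

```shell
#!/bin/sh
# copy_matching SRC_ROOT PATTERN DEST  (hypothetical helper, not part of PingER)
# Copies every regular file under SRC_ROOT whose name matches PATTERN into DEST.
# find does the matching itself, so the shell never builds a huge argument
# list and "Argument list too long" cannot occur.
# DEST should not lie inside SRC_ROOT, or find may revisit the copies.
copy_matching() {
    src_root=$1
    pattern=$2
    dest=$3
    mkdir -p "$dest" || return 1
    # One cp per file: slow but portable. The pattern is quoted so that it
    # reaches find unexpanded.
    find "$src_root" -type f -name "$pattern" -exec cp {} "$dest" \;
}

# Example, using the paths from this page:
# copy_matching /nfs/slac/g/net/pinger/pingerreports/hep '*-100-by-node*' \
#     /afs/slac/public/users/cottrell/metrics-100-by-node
```

Running one cp per file trades speed for portability; with about 6500 files per metric that overhead is likely small next to the copy time itself.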

To provide maximum flexibility we decided to write a script (pinger-tar.pl) to copy and tar the data. 

The resulting tar files in the output directory look as follows:

[cottrell@pinger ~]$ ls -l /afs/slac.stanford.edu/public/users/cottrell/*.tar
-rw-rw-r-- 1 cottrell sf  512297529 Sep 11 16:33 /afs/slac.stanford.edu/public/users/cottrell/MOS-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  755387730 Sep 11 17:00 /afs/slac.stanford.edu/public/users/cottrell/alpha-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1640854797 Sep 11 13:58 /afs/slac.stanford.edu/public/users/cottrell/average_rtt-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  293583749 Sep 11 18:59 /afs/slac.stanford.edu/public/users/cottrell/conditional_loss_probability-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  259801856 Sep 11 19:40 /afs/slac.stanford.edu/public/users/cottrell/duplicate_packets-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1018523005 Sep 11 14:27 /afs/slac.stanford.edu/public/users/cottrell/ipdv-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  751314261 Sep 11 16:16 /afs/slac.stanford.edu/public/users/cottrell/iqr-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1896982390 Sep 11 18:03 /afs/slac.stanford.edu/public/users/cottrell/maximum_rtt-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  297380470 Sep 11 19:53 /afs/slac.stanford.edu/public/users/cottrell/minimum_packet_loss-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1464424282 Sep 11 13:16 /afs/slac.stanford.edu/public/users/cottrell/minimum_rtt-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  240217457 Sep 11 19:24 /afs/slac.stanford.edu/public/users/cottrell/out_of_order_packets-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  425757104 Sep 11 15:54 /afs/slac.stanford.edu/public/users/cottrell/packet_loss-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1995419214 Sep 11 15:33 /afs/slac.stanford.edu/public/users/cottrell/throughput-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  587342674 Sep 11 18:39 /afs/slac.stanford.edu/public/users/cottrell/unpredictability-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  273680921 Sep 11 18:16 /afs/slac.stanford.edu/public/users/cottrell/unreachability-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  269081359 Sep 11 19:15 /afs/slac.stanford.edu/public/users/cottrell/zero_packet_loss_frequency-100-by-node.tar

The compression ratio is about 3.2:1. 

To increase the number of files we could add the 1000-byte ping size to the 100-byte one (e.g. average_rtt-1000-by-node* as well as average_rtt-100-by-node*) and add by-site to by-node, i.e. two sizes times two groupings, a factor of 4.

As of 9/13/2014 there are about 20GBytes of data and 100,000 files.

N.B. the following files do not gunzip:

[root@sc2u0n0 afs]# find . -name "*.gz"

./slac/public/users/cottrell/conditional_loss_probability-100-by-node/conditional_loss_probability-100-by-node-1998-02-20.txt.gz

./slac/public/users/cottrell/minimum_packet_loss-100-by-node/minimum_packet_loss-100-by-node-2004-07-18.txt.gz

I get:

268cottrell@pinger:/afs/slac/public/users/cottrell/conditional_loss_probability-100-by-node$gunzip /tmp/conditional_loss_probability-100-by-node-1998-02-20.txt.gz

gzip: /tmp/conditional_loss_probability-100-by-node-1998-02-20.txt.gz: invalid compressed data--crc error

gzip: /tmp/conditional_loss_probability-100-by-node-1998-02-20.txt.gz: invalid compressed data--length error
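To check a copied tree for other files with the same problem, a minimal sketch is to run gzip's built-in integrity test (gzip -t) over every .gz file and print the ones that fail. The helper name list_bad_gzip is hypothetical and not part of the PingER tools.

```shell
#!/bin/sh
# list_bad_gzip DIR  (hypothetical helper)
# Prints the path of every *.gz file under DIR that fails gzip's integrity
# test. gzip -t checks the compressed data (including CRC and length)
# without writing any output file.
list_bad_gzip() {
    find "$1" -type f -name '*.gz' -exec sh -c '
        for f in "$@"; do
            gzip -t "$f" 2>/dev/null || printf "%s\n" "$f"
        done
    ' sh {} +
}

# Example:
# list_bad_gzip /afs/slac/public/users/cottrell
```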

Retrieving

The data is available via anonymous FTP at ftp://ftp.slac.stanford.edu/users/cottrell. For more information on retrieving the data see: http://www-iepm.slac.stanford.edu/pinger/tools/retrievedata.html
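As a sketch of fetching one archive from the command line, the helper below builds the URL for a metric's tar file. It assumes the tar files sit directly under the FTP directory with the same names as in the listing of the AFS directory; that layout is an assumption, so check the FTP listing first. The helper name metric_tar_url is hypothetical.

```shell
#!/bin/sh
# metric_tar_url METRIC [SIZE] [BY]  (hypothetical helper)
# Prints the anonymous-FTP URL where the tar file for one metric is ASSUMED
# to live: directly under the FTP directory, named METRIC-SIZE-BY.tar.
# SIZE defaults to 100 and BY defaults to by-node.
metric_tar_url() {
    metric=$1
    size=${2:-100}
    by=${3:-by-node}
    printf 'ftp://ftp.slac.stanford.edu/users/cottrell/%s-%s-%s.tar\n' \
        "$metric" "$size" "$by"
}

# Example: download one archive with curl (-O keeps the remote file name).
# curl -O "$(metric_tar_url average_rtt)"
```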

Volume of data, May 2015. A script to assist with getting the data volumes is at ~cottrell/bin/sumdir-regexp.pl.

Saving Raw Data

Gathered/raw data for 1998-2003 is saved in:

/nfs/slac/g/net/pinger/pingerdata/hep/data.unite/<host>/ping-<YYYY>-<MM>-<DD>.txt.gz

Gathered data from 2004 onwards is saved in:

/nfs/slac/g/net/pinger/pingerdata/hep/data/<host>/ping-<YYYY>-<MM>-<DD>.txt.gz
e.g.
/nfs/slac/g/net/pinger/pingerdata/hep/data/pinger.slac.stanford.edu/ping-2011-03-22.txt.gz
/nfs/slac/g/net/pinger/pingerdata/hep/data/pcgiga.cern.ch/ping-2006-09-28.txt.gz
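The two layouts above can be captured in a small helper that prints the expected path for a given host and date. This is only a sketch: the function name raw_ping_path is hypothetical, and years before 1998 are rejected because this page does not say where such data would be.

```shell
#!/bin/sh
# raw_ping_path HOST YYYY-MM-DD  (hypothetical helper)
# Prints the expected location of one day's gathered data for HOST:
# data.unite for 1998-2003, data for 2004 onwards, as described on this page.
raw_ping_path() {
    host=$1
    date=$2
    year=${date%%-*}    # text before the first "-"
    base=/nfs/slac/g/net/pinger/pingerdata/hep
    # Reject a missing or non-numeric year.
    case $year in
        ''|*[!0-9]*) echo "raw_ping_path: bad date '$date'" >&2; return 1 ;;
    esac
    if [ "$year" -lt 1998 ]; then
        echo "raw_ping_path: no location documented before 1998" >&2
        return 1
    elif [ "$year" -le 2003 ]; then
        subdir=data.unite
    else
        subdir=data
    fi
    printf '%s/%s/%s/ping-%s.txt.gz\n' "$base" "$subdir" "$host" "$date"
}

# Example:
# raw_ping_path pcgiga.cern.ch 2006-09-28
```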


To save the raw data, use the script /afs/slac/package/pinger/pinger-tar.pl. You can get help on this script by going to http://www-iepm.slac.stanford.edu/pinger/scripttable.html and choosing pinger-tar.pl from the pull-down list.

 

226cottrell@pinger:~$bin/pinger-tar.pl -p true -g true | tee pinger-tar
Copying data for host(0/1/210)=drwxrwsr-x  2 cottrell iepm    11264 Jul 14 16:40 111.68.102.40 debug=-1, production=true
gzip: /afs/slac/public/users/cottrell/111.68.102.40/ping-2010-04.txt: unknown suffix -- ignored
...
159559781 bytes in /afs/slac/public/users/cottrell/111.68.102.40/111.68.102.40.tar(1/1/210)
Copying data for host(1/2/210)=drwxrwsr-x  2 pinger   iepm     1536 Jul 14 18:19 140.105.28.27 debug=-1, production=true
5041746 bytes in /afs/slac/public/users/cottrell/140.105.28.27/140.105.28.27.tar(2/2/210)

This saves about 200 tar files in ftp://ftp.slac.stanford.edu/users/cottrell. They appear as directories, with each directory named after the PingER monitoring agent (MA) host. For example, the output above shows the MA 111.68.102.40 saved as 111.68.102.40/111.68.102.40.tar.

The files are compressed tar files. Together they occupy about 70 GBytes compressed or about 300 GBytes uncompressed.

Information on each MA can be found in http://www-iepm.slac.stanford.edu/pinger/pingerworld/all-nodes.cf, which can be require'd in a Perl script. For currently working MAs only, see http://www-iepm.slac.stanford.edu/pinger/pingerworld/slaconly-nodes.cf. The information is also available in the XML file http://www-iepm.slac.stanford.edu/pinger/pingerworld/rss.xml .
