Format of data etc

See PingER data flow at SLAC

Saving

The most likely data to be of use to others is the analyzed/aggregated data. This is kept in

/nfs/slac/g/net/pinger/pingerreports/hep/

There is a lot of data, so you will need a lot of space. I suggest you use the /tmp directory as intermediate storage and do one metric at a time (e.g. average_rtt), one size (e.g. 100) and one "by" (e.g. by-node). If this does not work, send an email to unix-admin requesting temporary space in /afs/slac/public/users/cottrell (this is accessible from anonymous FTP). Note that you will need space for the copied directory as well as for the tar and zipped files. I requested 100GBytes.

For 1 metric (average_rtt)

$mkdir /afs/slac/public/users/cottrell/average_rtt-100-by-node

$cp -v /nfs/slac/g/net/pinger/pingerreports/hep/average_rtt/average_rtt-100-by-node* /afs/slac/public/users/cottrell/average_rtt-100-by-node
#There are about 6500 files per metric. Copy takes about 20 mins per metric. 

$tar -cvzf /afs/slac/public/users/cottrell/archive-average_rtt-100-by-node.tar /afs/slac/public/users/cottrell/average_rtt-100-by-node
#A metric takes about 6 minutes to tar and compress, and each tar file occupies ~1.5GBytes.

For all metrics with 100Byte pings by node

$mkdir /afs/slac/public/users/cottrell/metrics-100-by-node

$cp -v /nfs/slac/g/net/pinger/pingerreports/hep/*/*-100-by-node* /afs/slac/public/users/cottrell/metrics-100-by-node

$tar -cvzf /afs/slac/public/users/cottrell/archive-metrics-100-by-node.tar /afs/slac/public/users/cottrell/metrics-100-by-node

However, the wildcard expands to more file names than can be passed to a single command:

$ls /nfs/slac/g/net/pinger/pingerreports/hep/*/*-100-by-node*

/bin/ls: Argument list too long.
Exit 1
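One common way around the argument-list limit is to let find hand the files to cp in batches instead of expanding the wildcard in the shell. The following is a minimal sketch of that idea, not a command that was used here; the function name copy_matching is hypothetical, and unlike the wildcard above it matches files at any depth below the source directory.

```shell
# copy_matching SRC_ROOT PATTERN DEST   (hypothetical helper)
# Copies every regular file under SRC_ROOT whose name matches PATTERN into DEST.
# find passes the files to cp in batches ("{} +"), so no single command line
# ever has to hold the full list of matches.
copy_matching() {
    src_root=$1
    pattern=$2
    dest=$3
    mkdir -p "$dest" || return 1
    find "$src_root" -type f -name "$pattern" \
        -exec sh -c 'dest=$1; shift; cp "$@" "$dest"/' sh "$dest" {} +
}

# Example call (paths taken from the commands above):
# copy_matching /nfs/slac/g/net/pinger/pingerreports/hep '*-100-by-node*' \
#     /afs/slac/public/users/cottrell/metrics-100-by-node
```

The destination should not sit inside the source tree, otherwise find may pick up files it has just copied.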

To provide maximum flexibility we decided to write a script (pinger-tar.pl) to copy and tar the data. 
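The per-metric approach can be sketched in shell as follows. This is not pinger-tar.pl, whose contents are not shown here; it is only an illustration under stated assumptions: the function name tar_each_metric is hypothetical, tar is assumed to accept a file list on standard input via -T - (as GNU tar does), and the archives are named .tar.gz rather than .tar to make the compression explicit.

```shell
# tar_each_metric REPORT_ROOT SUFFIX OUT_DIR   (hypothetical helper, not pinger-tar.pl)
# Writes one gzip-compressed archive per metric directory found under REPORT_ROOT,
# containing only the files named <metric>-<SUFFIX>*.
tar_each_metric() {
    report_root=$1
    suffix=$2
    # Resolve OUT_DIR to an absolute path because we cd into each metric directory.
    out_dir=$(mkdir -p "$3" && cd "$3" && pwd) || return 1
    for dir in "$report_root"/*/; do
        [ -d "$dir" ] || continue
        metric=$(basename "$dir")
        # Stream the file list to tar on stdin (-T -), so no long argument list is built.
        (cd "$dir" && find . -type f -name "$metric-$suffix*" -print |
            tar -czf "$out_dir/$metric-$suffix.tar.gz" -T -) || return 1
    done
}

# Example call (paths taken from the commands above):
# tar_each_metric /nfs/slac/g/net/pinger/pingerreports/hep 100-by-node \
#     /afs/slac/public/users/cottrell
```

Archiving straight from the report directories skips the intermediate copy, at the cost of reading the source file system while tar runs.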

The resulting directory of tar files looks as follows:

[cottrell@pinger ~]$ ls -l /afs/slac.stanford.edu/public/users/cottrell/*.tar
-rw-rw-r-- 1 cottrell sf  512297529 Sep 11 16:33 /afs/slac.stanford.edu/public/users/cottrell/MOS-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  755387730 Sep 11 17:00 /afs/slac.stanford.edu/public/users/cottrell/alpha-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1640854797 Sep 11 13:58 /afs/slac.stanford.edu/public/users/cottrell/average_rtt-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  293583749 Sep 11 18:59 /afs/slac.stanford.edu/public/users/cottrell/conditional_loss_probability-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  259801856 Sep 11 19:40 /afs/slac.stanford.edu/public/users/cottrell/duplicate_packets-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1018523005 Sep 11 14:27 /afs/slac.stanford.edu/public/users/cottrell/ipdv-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  751314261 Sep 11 16:16 /afs/slac.stanford.edu/public/users/cottrell/iqr-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1896982390 Sep 11 18:03 /afs/slac.stanford.edu/public/users/cottrell/maximum_rtt-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  297380470 Sep 11 19:53 /afs/slac.stanford.edu/public/users/cottrell/minimum_packet_loss-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1464424282 Sep 11 13:16 /afs/slac.stanford.edu/public/users/cottrell/minimum_rtt-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  240217457 Sep 11 19:24 /afs/slac.stanford.edu/public/users/cottrell/out_of_order_packets-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  425757104 Sep 11 15:54 /afs/slac.stanford.edu/public/users/cottrell/packet_loss-100-by-node.tar
-rw-rw-r-- 1 cottrell sf 1995419214 Sep 11 15:33 /afs/slac.stanford.edu/public/users/cottrell/throughput-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  587342674 Sep 11 18:39 /afs/slac.stanford.edu/public/users/cottrell/unpredictability-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  273680921 Sep 11 18:16 /afs/slac.stanford.edu/public/users/cottrell/unreachability-100-by-node.tar
-rw-rw-r-- 1 cottrell sf  269081359 Sep 11 19:15 /afs/slac.stanford.edu/public/users/cottrell/zero_packet_loss_frequency-100-by-node.tar

The compression ratio is about 3.2:1. 

To increase the number of files we could also include the 1000 ping size in addition to 100 (e.g. average_rtt-1000-by-node*) and by-site in addition to by-node, i.e. a factor of 4.

As of 9/13/2014 there are about 20GBytes of data and 100,000 files.

N.B. the following files fail to gunzip:

[root@sc2u0n0 afs]# find . -name "*.gz"

./slac/public/users/cottrell/conditional_loss_probability-100-by-node/conditional_loss_probability-100-by-node-1998-02-20.txt.gz

./slac/public/users/cottrell/minimum_packet_loss-100-by-node/minimum_packet_loss-100-by-node-2004-07-18.txt.gz

I get:

268cottrell@pinger:/afs/slac/public/users/cottrell/conditional_loss_probability-100-by-node$gunzip /tmp/conditional_loss_probability-100-by-node-1998-02-20.txt.gz

gzip: /tmp/conditional_loss_probability-100-by-node-1998-02-20.txt.gz: invalid compressed data--crc error

gzip: /tmp/conditional_loss_probability-100-by-node-1998-02-20.txt.gz: invalid compressed data--length error
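Damaged files like these can be found in bulk with gzip's built-in integrity test (gzip -t), which checks a file without writing the uncompressed output. The sketch below is one possible way to scan a copied directory; the function name list_corrupt_gz is hypothetical.

```shell
# list_corrupt_gz DIR   (hypothetical helper)
# Prints the path of every .gz file under DIR that fails gzip's integrity test.
list_corrupt_gz() {
    find "$1" -type f -name '*.gz' -exec sh -c '
        for f in "$@"; do
            # gzip -t exits non-zero on CRC, length or format errors.
            gzip -t "$f" 2>/dev/null || printf "%s\n" "$f"
        done' sh {} +
}

# Example call:
# list_corrupt_gz /afs/slac/public/users/cottrell
```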

Retrieving

The data is available via anonymous FTP at ftp://ftp.slac.stanford.edu/users/cottrell. For more on retrieving the data see: http://www-iepm.slac.stanford.edu/pinger/tools/retrievedata.html
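As an illustration, one archive could be fetched and unpacked from the command line roughly as follows. This is a sketch under assumptions: that the tar files sit directly under the FTP URL above with the names shown in the directory listing, that curl is available, and that tar detects the gzip compression on its own when extracting (as GNU tar does). The function names archive_url and fetch_metric are hypothetical.

```shell
# Assumption: archives sit directly under this URL, named <metric>-<size>-<by>.tar.
PINGER_FTP_BASE="ftp://ftp.slac.stanford.edu/users/cottrell"

# archive_url METRIC SUFFIX: print the assumed URL of one archive.
archive_url() {
    printf '%s/%s-%s.tar\n' "$PINGER_FTP_BASE" "$1" "$2"
}

# fetch_metric METRIC SUFFIX: download one archive into the current
# directory and unpack it there.
fetch_metric() {
    url=$(archive_url "$1" "$2")
    curl -sS -O "$url" && tar -xf "$1-$2.tar"
}

# Example call:
# fetch_metric average_rtt 100-by-node
```

The largest archives are close to 2GBytes each, so check the available disk space before fetching several metrics.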
