Data outline

The PingER measurements are recorded by the Measurement Agent (MA) in the directory /usr/local/share/pinger/data (or, in the unique case of SLAC, in /nfs/slac/g/net/pinger/pinger2/data/). The files are named by month, i.e. ping-<YYYY>-<MM>.txt (e.g. /usr/local/share/pinger/data/ping-2011-02.txt). See PingER data flow at SLAC for more details.
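As a minimal sketch of the naming convention above, the following Perl helper builds the monthly file name from a directory, a year, and a month. The function name monthly_file is hypothetical and is used only for illustration.

```perl
use strict;
use warnings;

# Build the path of the monthly PingER data file, ping-<YYYY>-<MM>.txt.
# monthly_file is a hypothetical helper name, not part of PingER itself.
sub monthly_file {
    my ($data_dir, $year, $month) = @_;
    return sprintf("%s/ping-%04d-%02d.txt", $data_dir, $year, $month);
}

print monthly_file("/usr/local/share/pinger/data", 2011, 2), "\n";
# prints /usr/local/share/pinger/data/ping-2011-02.txt
```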

...

We might be able to get away without compression, which would simplify the analysis. We will need to experiment to see how much data is created. Each year's worth of compressed data for pinger.slac.stanford.edu takes ~800 MBytes, and I am guessing about 4 GBytes uncompressed (assuming a compression ratio of ~5; this needs verifying). For pinger.cern.ch (a more typical MA) we see ~50 MBytes/year compressed, or ~200 MBytes uncompressed. There are <~100 MAs, so say 20 GBytes for those. So in total we may need ~25 GBytes for a year of data.
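The estimate above can be reproduced with a few lines of Perl. This is only a sketch of the arithmetic: it assumes the SLAC MA is counted separately from the ~100 typical MAs, and the compression ratio of 5 is the unverified guess from the text. The function name yearly_estimate_mb is hypothetical.

```perl
use strict;
use warnings;

# Rough yearly storage estimate in MBytes (uncompressed).
# All inputs are the guesses quoted in the text, not measured values.
sub yearly_estimate_mb {
    my ($slac_compressed_mb, $ratio, $typical_uncompressed_mb, $num_mas) = @_;
    my $slac_uncompressed_mb = $slac_compressed_mb * $ratio;       # ~4 GBytes
    my $others_mb            = $typical_uncompressed_mb * $num_mas; # ~20 GBytes
    return $slac_uncompressed_mb + $others_mb;
}

my $total_mb = yearly_estimate_mb(800, 5, 200, 100);
printf "Estimated uncompressed total: ~%.0f GBytes\n", $total_mb / 1000;
# prints Estimated uncompressed total: ~24 GBytes
```

The result of ~24 GBytes agrees with the rounded figure of ~25 GBytes.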

Task

The task is to take the MA data for each MA for each day of a month and aggregate it into a single file per month. Initially don't bother with compression.

Hints

Write it in Perl. Start from ~cottrell/bin/template.pl or ~cottrell/sumdir-regexp.pl. Use Perl's opendir function to get the directory listing.
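A minimal sketch of the aggregation using opendir might look like the following. It is not taken from template.pl or sumdir-regexp.pl. The names list_matching and aggregate_month are hypothetical, and the daily file name pattern in the usage comment is an assumption made for illustration; substitute the real per-day naming for each MA.

```perl
use strict;
use warnings;

# Return the plain files in $dir whose names match $pattern, sorted by name.
sub list_matching {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @names = sort grep { $_ =~ $pattern && -f "$dir/$_" } readdir($dh);
    closedir($dh);
    return @names;
}

# Concatenate the matching files, in name order, into $outfile.
# Returns the number of files that were aggregated.
sub aggregate_month {
    my ($dir, $pattern, $outfile) = @_;
    my @files = list_matching($dir, $pattern);
    open(my $out, '>', $outfile) or die "Cannot write $outfile: $!";
    for my $name (@files) {
        open(my $in, '<', "$dir/$name") or die "Cannot read $dir/$name: $!";
        while (my $line = <$in>) {
            print {$out} $line;
        }
        close($in);
    }
    close($out) or die "Cannot close $outfile: $!";
    return scalar @files;
}

# Example call (the directory and the daily file pattern are assumptions):
# aggregate_month("/path/to/daily", qr/^ping-2011-02-\d\d\.txt$/,
#                 "/path/to/monthly/ping-2011-02.txt");
```

Sorting by name keeps the days in order only if the daily names sort chronologically, for example with zero-padded day numbers. The output is written uncompressed, in line with the task description.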