Background

This page describes the proceeduring in converting over 10 years of PingER data into a format which is more suitable for analysis and presentation using the perfSONAR PingER MA formats.

PingER Data File Format

All of the old pinger data is stored under:

pinger:/nfs/slac/g/net/pinger/pingerdata/

Each file contains the same data output format; each line represents one probe (of pings) with data stored in the following format:

www.nbi.dk 130.225.212.55 lns62.lns.cornell.edu 128.84.47.62 100 1041466552 10 10 143 144 148 0 1 2 3 4 5 6 7 8 9 143 144 143 143 143 143 148 145 144 143

Field

Data (above)

Description

0

www.nbi.dk

dns name of source of ping

1

130.225.212.55

ip addresss of source of ping

2

lns62.lns.cornell.edu

destination dns name of ping

3

128.84.47.62

destination ip address of ping

4

100

ICMP packet size of ping

5

1041466552

Epoch time stamp of test (start or end?)

6

10

Number of pings sent

7

10

Number of pings recieved

8

143

Minimum Rtt of all pings

9

144

Mean Rtt of all pings

10

148

Maximum Rtt of all pings

11-20

0 1 2 3 4 5 6 7 8 9

Sequence number in the recieving order of pings

21-30

143 144 143 143 143 143 148 145 144 143

Individual ping singleton values

The complexity in the data format lies in the sequence number and ping singleton values which must be derived from the number of pings recieved.

As some of the PingER files were scattered, they were unified into the following location (on pinger.slac.stanford.edu):

/nfs/slac/g/net/pinger/pingerdata/hep/data/<YYYY>/<source>/ping-<YYYY>-<MM>-<DD>.txt.gz

SQL Storage Format

Three tables are required for the storage of the data into SQL tables.

Table holds information about a single ping node.

CREATE TABLE host (
 ip_name varchar(52) NOT NULL, 
 ip_number varchar(64) NOT NULL,
 comments text, 
 PRIMARY KEY  (ip_name, ip_number)); 

The metadata table holds constants of the test settings, such as the source and destination

CREATE TABLE  metaData  (
 metaID BIGINT NOT NULL AUTO_INCREMENT,
 ip_name_src varchar(52),
 ip_name_dst varchar(52),
 packetSize smallint   NOT NULL,
 count smallint   NOT NULL,
 protocol varchar(10)  NOT NULL,
 period smallint,
 deadline smallint,
 ttl smallint,
 INDEX (ip_name_src, ip_name_dst, packetSize, count),
 FOREIGN KEY (ip_name_src) references ipaddr_intervals (ip_name),
 FOREIGN KEY (ip_name_dst) references ipaddr_intervals (ip_name),
 PRIMARY KEY  (metaID));
#   pinger data table, some fields have names differnt from XML schema since there where
#   inherited from the current pinger data table
#   its named data_yyyyMM to separate from old format - pairs_yyyyMM
CREATE TABLE  data_200707  (
 metaID   BIGINT   NOT NULL,
 minRtt float,
 meanRtt float,
 medianRtt float,
 maxRtt float,
 timestamp bigint(12) NOT NULL,
 minIpd float,
 meanIpd float,
 maxIpd float,
 duplicates tinyint(1),
 outOfOrder  tinyint(1),
 clp float,
 iqrIpd float,
 lossPercent  float,
 INDEX (meanRtt, medianRtt, lossPercent, meanIpd, clp),
 FOREIGN KEY (metaID) references metaData (metaID),
 PRIMARY KEY  (metaID, timestamp));

Creation

A database was set up on a host for testing and initial loading purposed. all relevant permmisisons were granted to allow the pinger.slac.stanford.edu host write access to this database.

mysql> create database pinger;
Query OK, 1 row affected (0.00 sec)
mysql -D pinger -p pinger < /u/sf/ytl/Work/perfSONAR/perfSONAR-PS/trunk/perfSONAR-PS/MA/PingER/create_pingerMA_MySQL.sql

Script

A script, pinger-ma_conversion.pl was created (in PingER svn bin) which reads in the raw output from these log files and inserts them into a database.

usage
$ perl pinger-ma_conversion.pl --help
Parses PingER flat file data into the specified databaseUsage: cat <PingER data file> | pinger-ma_conversion.pl [--db
Options:
  --host=s                hostname of database location
  --port=s                port number for database
  --db=s                  database name
  --user=s                username for database
  --password=s            password for database
  --verbose               provide statistical data of inserts
  --help                  this help message
example
zcat /nfs/slac/g/net/pinger/pingerdata/1997/ping-1997-07.txt.gz | perl pinger-ma_conversion.pl --host=<hostname> --port=<port> --user=<dbuser> --password=<dbpassword> --db=<dbname>

Conversion

The log configuratoin file was changed to the following for the conversion process:

bin/logger.conf
log4perl.logger = INFO, A1
log4perl.appender.A1=Log::Dispatch::File
log4perl.appender.A1.filename=/tmp/pinger-ma_conversion.log
log4perl.appender.A1.mode=append
log4perl.appender.A1.layout=Log::Log4perl::Layout::PatternLayout
log4perl.appender.A1.layout.ConversionPattern=%d %m%n

A file containing a list of all of the txt.gz files was created and a simple bash command was created to run through the list:

for f in `cat processing_list`; do echo "Analysing $f"; echo $f >> processed_data_files; zcat $f | perl pinger-ma_conversion.pl --host=<host> --port=<port> --user=<user> --password=<password> --db=<dbname>; done
  • No labels