Author 

Cristiane Ceia UFRJ, cristianeceia@gmail.com

Abstract

This page quantifies the inflation in size of PingER data as it is prepared for Linked Open Data (LOD) access. The PingER hourly data for 2005-Sep 2014, archived via FTP in text form, amounts to ~3.12 GB. Using 5 triples for each measurement, this corresponds to 15.66*10^9 (15.66 billion) triples. Stored as Turtle without compression, these triples take about 685 GB, an inflation factor of ~200.

Method

To estimate the number of PingER triples, I counted the measurement values in the PingER hourly data from 1998 to September 2014 (packet size: 100 bytes).

The table below lists the number of measurement values per year.

 

Year     #Measurements
1998         6,740,974
1999         8,617,718
2000        11,617,057
2001        13,137,702
2002         7,247,257
2003        14,690,615
2004        36,060,787
2005        32,745,602
2006        38,461,602
2007        89,549,322
2008       115,999,447
2009       150,312,565
2010       203,265,500
2011       441,150,811
2012       697,272,874
2013       733,745,502
2014       531,572,876

Total    3,132,188,211

These measurement values generate 15,660,941,055 triples.
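The total and the triple count can be rechecked with a few lines of Python. This is only a sketch of the arithmetic: the per-year counts are copied from the table, and the constant name `TRIPLES_PER_MEASUREMENT` is my own label for the 5 triples per measurement stated in the abstract.

```python
# Per-year measurement counts, copied from the table above.
MEASUREMENTS_PER_YEAR = {
    1998: 6_740_974,
    1999: 8_617_718,
    2000: 11_617_057,
    2001: 13_137_702,
    2002: 7_247_257,
    2003: 14_690_615,
    2004: 36_060_787,
    2005: 32_745_602,
    2006: 38_461_602,
    2007: 89_549_322,
    2008: 115_999_447,
    2009: 150_312_565,
    2010: 203_265_500,
    2011: 441_150_811,
    2012: 697_272_874,
    2013: 733_745_502,
    2014: 531_572_876,  # January to September only
}

# Each measurement is described by 5 triples (see the abstract).
TRIPLES_PER_MEASUREMENT = 5

total_measurements = sum(MEASUREMENTS_PER_YEAR.values())
total_triples = total_measurements * TRIPLES_PER_MEASUREMENT

print(f"{total_measurements:,} measurements")  # 3,132,188,211 measurements
print(f"{total_triples:,} triples")            # 15,660,941,055 triples
```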

I assume a basic description of a measurement following Renan's PingER LOD ontology, in which a measurement is minimally defined by 5 triples. Here is an example:

@prefix : <http://www-iepm.slac.stanford.edu/pinger/lod/resource#> .
@prefix o: <http://www-iepm.slac.stanford.edu/pinger/lod/ontologY/PingEROntology.owl#> .

:EDU.SLAC.STANFORD.N3-BR.UFRJ.PINGER-AverageRTT-15Feb03H23 a o:Measurement ;
    o:measuresMetric :AverageRTT ;
    o:hasSourceDestinationNodes :EDU.SLAC.STANFORD.N3-BR.UFRJ.PINGER ;
    o:hasDateTime :Time15Feb03H23 ;
    o:hasValue 233.926 .
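To see how the size of one measurement can be estimated, the following Python sketch renders the 5-triple statement from its parts and counts its UTF-8 bytes. The function `measurement_to_turtle` and its parameter names are hypothetical and not part of any PingER tool. The byte count depends on whitespace and line endings, so it should be close to, but not necessarily equal to, the figure used in the estimate.

```python
def measurement_to_turtle(pair: str, metric: str, time_id: str, value: float) -> str:
    """Render one measurement as a 5-triple Turtle statement (hypothetical helper).

    The @prefix lines are left out because they appear once per file,
    not once per measurement.
    """
    subject = f":{pair}-{metric}-{time_id}"
    lines = [
        f"{subject} a o:Measurement ;",
        f"    o:measuresMetric :{metric} ;",
        f"    o:hasSourceDestinationNodes :{pair} ;",
        f"    o:hasDateTime :Time{time_id} ;",
        f"    o:hasValue {value} .",
    ]
    return "\n".join(lines) + "\n"


example = measurement_to_turtle(
    pair="EDU.SLAC.STANFORD.N3-BR.UFRJ.PINGER",
    metric="AverageRTT",
    time_id="15Feb03H23",
    value=233.926,
)

# Size of the statement as plain, uncompressed text.
size_in_bytes = len(example.encode("utf-8"))

print(example)
print(size_in_bytes, "bytes")
```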

Stored as plain text in RDF Turtle format, without any compression or indexing (techniques that commonly reduce the size of the data and depend on the triple store we are going to use), one measurement takes 235 bytes. Hence, the estimate for the total triplified data volume is 235 * #Measurements = 235 * 3,132,188,211 = 736,064,229,585 bytes (about 685.5 GB, taking 1 GB = 2^30 bytes).
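The volume estimate can be reproduced as below. This is a sketch under two assumptions of mine: that "GB" means 2^30 bytes, which is the unit that reproduces the 685.5 figure, and that the 3.12 GB source size quoted in the abstract is expressed in the same unit. The ratio it prints is of the same order as the ~200 quoted in the abstract.

```python
BYTES_PER_MEASUREMENT = 235         # size of one measurement in Turtle, from the text
TOTAL_MEASUREMENTS = 3_132_188_211  # total from the table
SOURCE_SIZE_GB = 3.12               # size of the FTP text archive, from the abstract

total_bytes = BYTES_PER_MEASUREMENT * TOTAL_MEASUREMENTS

# Assumption: 1 GB = 2**30 bytes (binary gigabyte).
total_gb = total_bytes / 2**30

# Assumption: the source size uses the same GB unit.
inflation_factor = total_gb / SOURCE_SIZE_GB

print(f"{total_bytes:,} bytes")  # 736,064,229,585 bytes
print(f"{total_gb:.1f} GB")      # 685.5 GB
print(f"inflation factor: about {inflation_factor:.0f}")
```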
