Overview
In this project we investigate network anomaly detection algorithms \[1\] \[2\] \[3\] for Internet paths. We also develop a _Decision Theoretic Approach_ (DTA) based on our observations regarding the characteristics of the performance measurement statistics obtained from the [IEPM-BW] project.
To study and compare the algorithms we use the data sets collected by IEPM-BW, spanning approximately 3 years (2005 - 2008). The Internet paths observed were the links between the Stanford Linear Accelerator Center (SLAC) and the following sites:
- San Diego Supercomputer Center (SDSC), USA
- Oak Ridge National Laboratory (ORNL), USA
- European Organization for Nuclear Research (CERN), Geneva, Switzerland
- Forschungszentrum Karlsruhe (FZK), Germany
- Deutsches Elektronen-Synchrotron (DESY), Germany
- University of Toronto (UTORONTO), Canada
Data Sets
The data sets used in the study may be downloaded from the links listed below. The latest performance statistics may be accessed from here.
Site | Raw data | Labeled data |
---|---|---|
SDSC | [csv](http://www.slac.stanford.edu/~kalim/event-detection/published-data/SDSC-pathchirp.csv), [xls](http://www.slac.stanford.edu/~kalim/event-detection/published-data/SDSC-pathchirp.xls) | [txt](http://www.slac.stanford.edu/~kalim/event-detection/published-data/UTORONTO-pathchirp-labeled-events.txt) |
ORNL | [csv], [xls] | [txt] |
CERN | [csv](http://www.slac.stanford.edu/~kalim/event-detection/published-data/CERN-pathchirp.csv), [xls](http://www.slac.stanford.edu/~kalim/event-detection/published-data/CERN-pathchirp.xls) | [txt](http://www.slac.stanford.edu/~kalim/event-detection/published-data/CERN-pathchirp-labeled-events.txt) |
FZK | [csv](http://www.slac.stanford.edu/~kalim/event-detection/published-data/FZK-pathchirp.csv), [xls](http://www.slac.stanford.edu/~kalim/event-detection/published-data/FZK-pathchirp.xls) | [txt](http://www.slac.stanford.edu/~kalim/event-detection/published-data/FZK-pathchirp-labeled-events.txt) |
DESY | [csv](http://www.slac.stanford.edu/~kalim/event-detection/published-data/DESY-pathchirp.csv), [xls](http://www.slac.stanford.edu/~kalim/event-detection/published-data/DESY-pathchirp.xls) | [txt](http://www.slac.stanford.edu/~kalim/event-detection/published-data/DESY-pathchirp-labeled-events.txt) |
UTORONTO | [csv](http://www.slac.stanford.edu/~kalim/event-detection/published-data/UTORONTO-pathchirp.csv), [xls](http://www.slac.stanford.edu/~kalim/event-detection/published-data/UTORONTO-pathchirp.xls) | [txt](http://www.slac.stanford.edu/~kalim/event-detection/published-data/UTORONTO-pathchirp-labeled-events.txt) |
Labeling Algorithm
The labeling algorithm is as follows:
Implementations and Parameter Tuning
The source code of the implementations and the tuning of parameters are discussed below.
References
...
- USA.
- Switch, Switzerland.
- University of Florida, USA.
- National Laboratory for Particle and Nuclear Physics, Canada.
- Oak Ridge National Laboratory, USA.
- Budker Institute of Nuclear Physics, Russia.
- Daresbury Laboratory, United Kingdom.
- California Institute of Technology - CACR, USA.
- Istituto Nazionale di Fisica Nucleare, Italy.
- Czech NREN Operator, Czech Republic.
- Brookhaven National Laboratory, USA.
- Argonne National Laboratory, USA.
- California Institute of Technology - Ultralight, USA.
The topology of the monitoring framework is shown in figure 1.
Fig. 1: Topology of IEPM as of 07/2008
Fig. 2: Deployment of Selected Sites
The table below gives the number of measurements made from SLAC to each site, per measurement tool:
Site | pathchirp | iperf | thrulay |
---|---|---|---|
cern.ch | 48647 | 24586 | 39510 |
desy.de | 32247 | 4522 | 28689 |
fzk.de | 65536 | 4874 | 42708 |
nslabs.ufl.edu | 41206 | 1549 | 28613 |
switch.edu | 19668 | 4638 | 28744 |
sdsc.edu | 21176 | 4416 | 22456 |
triumf.ca | 26425 | 4669 | 27021 |
utoronto.ca | 40614 | 5003 | 21646 |
ornl.gov | 35339 | 5182 | 18375 |
anl.gov | 17968 | 1 | 27559 |
bnl.org | 23580 | 20708 | 16000 |
cacr.caltech.edu | 61871 | 25525 | 37293 |
dl.ac.uk | 27806 | 6096 | 28058 |
nsk.su | 20117 | 1 | 26845 |
cesnet.cz | 23618 | 3062 | 28426 |
infn.it | 30372 | 4343 | 28573 |
ultralight.caltech | 3739 | 88 | 1534 |
SubTotal | 539929 | 119263 | 452050 |
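
The SubTotal row can be cross-checked by summing each column. The short sketch below copies the per-site counts from the table into a dictionary (the variable and function names are illustrative, not part of IEPM) and recomputes the three column totals.

```python
# Measurement counts per site, copied from the table above.
# Tuple order: (pathchirp, iperf, thrulay)
counts = {
    "cern.ch": (48647, 24586, 39510),
    "desy.de": (32247, 4522, 28689),
    "fzk.de": (65536, 4874, 42708),
    "nslabs.ufl.edu": (41206, 1549, 28613),
    "switch.edu": (19668, 4638, 28744),
    "sdsc.edu": (21176, 4416, 22456),
    "triumf.ca": (26425, 4669, 27021),
    "utoronto.ca": (40614, 5003, 21646),
    "ornl.gov": (35339, 5182, 18375),
    "anl.gov": (17968, 1, 27559),
    "bnl.org": (23580, 20708, 16000),
    "cacr.caltech.edu": (61871, 25525, 37293),
    "dl.ac.uk": (27806, 6096, 28058),
    "nsk.su": (20117, 1, 26845),
    "cesnet.cz": (23618, 3062, 28426),
    "infn.it": (30372, 4343, 28573),
    "ultralight.caltech": (3739, 88, 1534),
}


def column_totals(table):
    """Sum each tool's column across all sites."""
    return tuple(sum(column) for column in zip(*table.values()))


totals = column_totals(counts)
print(dict(zip(("pathchirp", "iperf", "thrulay"), totals)))
# {'pathchirp': 539929, 'iperf': 119263, 'thrulay': 452050}
```

The recomputed totals agree with the SubTotal row. The same dictionary also makes it easy to spot sparse series, such as the single iperf measurement recorded for anl.gov and nsk.su.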
Data Sets
The data sets used in the study may be downloaded from the links listed below. These data sets were collected by the IEPM-BW project.
Table 1: Performance measurement statistics compiled by IEPM, as seen from SLAC.
| Data Sets with Events | Data Sets with no Events |
---|---|---|
IEPM |
All files named "filename_raw_dataset.pathchirp" contain the raw data, i.e. the available bandwidth measurements along with their timestamps, which are used by all algorithms.
All files named "filename_event_file.txt" contain the list of identified events.
Technical Report - Labeling and Comparative Analysis
The technical report titled "A performance evaluation of anomaly detection algorithms for Internet Paths" will be available soon.
Input/Tuning parameters
Plateau Algorithm (PL)
History Buffer Length (H) | Trigger Buffer Length (T) | Threshold (th) | Sensitivity (s) |
---|---|---|---|
240 | 6 - 45 | 0.10 - 0.70 | 1.0 - 2.8 |
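
To show how the four parameters above interact, here is a simplified sketch of a plateau-style detector. It is not the implementation used in the study: the function name, the warm-up rule, the reset of the trigger buffer on a normal sample, and the re-seeding of the history buffer after an event are all assumptions made for illustration. A sample is treated as suspicious when it falls more than `s` standard deviations below the history mean, and an event is reported once `T` consecutive suspicious samples show a relative drop larger than `th`.

```python
from collections import deque
from statistics import mean, pstdev


def plateau_detect(samples, H=240, T=6, th=0.3, s=2.0):
    """Return the indices at which a sustained drop is reported.

    H  : history buffer length
    T  : trigger buffer length
    th : minimum relative drop (0..1) between history and trigger means
    s  : sensitivity, in standard deviations of the history buffer
    """
    history = deque(maxlen=H)   # recent "normal" samples
    trigger = []                # consecutive suspicious samples
    events = []
    for i, x in enumerate(samples):
        if len(history) < 2:    # warm-up: too little data for statistics
            history.append(x)
            continue
        m, sd = mean(history), pstdev(history)
        if x < m - s * sd:
            trigger.append(x)
            if len(trigger) >= T:
                # Trigger buffer is full: compare the two buffer means.
                if m > 0 and (m - mean(trigger)) / m > th:
                    events.append(i)
                # Assumption: the trigger samples become the new baseline.
                history.clear()
                history.extend(trigger)
                trigger.clear()
        else:
            history.append(x)
            trigger.clear()     # assumption: a normal sample resets the trigger buffer
    return events


print(plateau_detect([100.0] * 50 + [50.0] * 30))  # [55]
```

With these settings a step from 100 to 50 is reported at the sixth low sample (index 55), while a dip shorter than `T` samples is ignored. Larger `T` and `th` values therefore trade detection delay for fewer false alarms.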
Kalman Filters Method (KF)
Sensitivity (K) | Time Window (h) |
---|---|
0.001 - 11.0 | 6 - 20 |
Holt-Winters Method (HW)
α - alpha | β - beta | γ - gamma | σ - sigma |
---|---|---|---|
0.1 | 0.1 - 0.3 | 0.1 - 0.5 | 2.0 |
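
The sketch below illustrates one common way these four parameters are used: additive Holt-Winters smoothing with a deviation band, in the style of Brutlag's aberrant behaviour detection. It is an assumption that the study uses this variant, and that sigma scales the width of the band. The `period` argument (season length in samples) and the way the deviation is seeded are also assumptions and do not appear in the table.

```python
def holt_winters_flags(y, period, alpha=0.1, beta=0.1, gamma=0.1, sigma=2.0):
    """Return indices of samples that fall outside the forecast band.

    alpha, beta, gamma smooth the level, trend and seasonal terms;
    sigma scales the smoothed absolute deviation into a band width.
    """
    if len(y) <= period:
        raise ValueError("need more than one full period of data")

    # Initialise from the first period (assumption: zero initial trend).
    level = sum(y[:period]) / period
    trend = 0.0
    season = [v - level for v in y[:period]]
    # Assumption: seed every deviation slot with the mean absolute
    # seasonal swing so the band does not start with zero width.
    seed = sum(abs(c) for c in season) / period
    dev = [seed] * period

    flags = []
    for t in range(period, len(y)):
        i = t % period
        forecast = level + trend + season[i]
        err = y[t] - forecast
        if abs(err) > sigma * dev[i]:
            flags.append(t)
        # Standard additive Holt-Winters updates.
        prev_level = level
        level = alpha * (y[t] - season[i]) + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[i] = gamma * (y[t] - level) + (1 - gamma) * season[i]
        dev[i] = gamma * abs(err) + (1 - gamma) * dev[i]
    return flags


series = [10.0, 20.0, 30.0, 20.0] * 10
series[25] = 100.0                      # inject one outlier
print(holt_winters_flags(series, period=4))
```

On this synthetic series the first flagged index is the injected outlier at 25. Samples immediately after a large outlier may be flagged as well, because the level absorbs part of the spike before the band has widened.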
Adaptive Fault Detector (AFD)
Window Size (N) | α - alpha | β - beta | No. of Training Data (Hn) |
---|---|---|---|
20 | 0.95 | 0.0015 - 0.1 | 100 |
Decision Theoretic Approach (DTA)
History Buffer Length (N) | α - alpha | β - beta | Median filter length (n) |
---|---|---|---|
30 - 90 | 0.01 - 0.34 | 0.99 | 100 |
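
One way to explore the ranges listed in the tables above is to enumerate a grid of parameter combinations per algorithm and run the detector once per combination. The sketch below does only the enumeration. The range end points come from the tables; the step sizes, the sample values chosen for the KF sensitivity and the AFD beta, and all names (`GRIDS`, `frange`, `configurations`) are assumptions.

```python
from itertools import product


def frange(lo, hi, step):
    """Inclusive list of floats from lo to hi, rounded to limit drift."""
    n = int(round((hi - lo) / step))
    return [round(lo + i * step, 10) for i in range(n + 1)]


# End points are taken from the tables; step sizes are assumed.
GRIDS = {
    "PL": {"H": [240], "T": range(6, 46),
           "th": frange(0.10, 0.70, 0.10), "s": frange(1.0, 2.8, 0.2)},
    "KF": {"K": [0.001, 0.01, 0.1, 1.0, 11.0], "h": range(6, 21)},
    "HW": {"alpha": [0.1], "beta": frange(0.1, 0.3, 0.1),
           "gamma": frange(0.1, 0.5, 0.1), "sigma": [2.0]},
    "AFD": {"N": [20], "alpha": [0.95],
            "beta": [0.0015, 0.01, 0.05, 0.1], "Hn": [100]},
    "DTA": {"N": range(30, 91, 10), "alpha": frange(0.01, 0.34, 0.03),
            "beta": [0.99], "n": [100]},
}


def configurations(name):
    """Yield one dict per parameter combination for the named algorithm."""
    grid = GRIDS[name]
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


for name in GRIDS:
    print(name, sum(1 for _ in configurations(name)))
```

Each yielded dictionary can be passed as keyword arguments to a detector, so that every combination contributes one operating point when the algorithms are compared.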
ROC Results
Datasets with Gaussian Distributions
[ROC plots: CERN, FZK, SDSC, TRIUMF, UTORONTO]
Datasets with Weibull Distributions
[ROC plots: DESY, NSLABS, SWITCH]
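
Each point on an ROC curve pairs a false positive rate with a true positive rate for one parameter setting. The sketch below computes a single such point from two equal-length boolean sequences, one marking the labeled events and one marking the detector's output. Matching events sample by sample is an assumption; the study may match detections to labeled events differently, for example within a time tolerance. The function name is illustrative.

```python
def roc_point(labels, detections):
    """Return (false positive rate, true positive rate) for one detector run.

    labels     : booleans, True where a labeled event is present
    detections : booleans, True where the detector raised an alarm
    """
    if len(labels) != len(detections):
        raise ValueError("labels and detections must have the same length")
    tp = fp = fn = tn = 0
    for truth, flagged in zip(labels, detections):
        if truth and flagged:
            tp += 1
        elif truth:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return fpr, tpr


labels = [False, False, True, True, False, True]
detections = [False, True, True, False, False, True]
fpr, tpr = roc_point(labels, detections)
print(f"FPR={fpr:.2f} TPR={tpr:.2f}")  # FPR=0.33 TPR=0.67
```

Repeating this for every parameter setting of an algorithm and plotting the resulting (FPR, TPR) pairs gives that algorithm's ROC curve for a data set.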
...