Page History

Overview

Wiki MarkupIn this project we study and investigate network anomaly detection algorithms \[1\], \[2\] and \[3\] for Internet Paths. We also develop a _Decision Theoretic Approach_ (DTA) based on our observations regarding the characteristics of the performance-measurement statistics obtained from the [IEPM-BW] project.

To study and compare the algorithms we use the data sets collected by IEPM-BW spanning approximately 2 3 years (i.e. 2006 2005 - 2008). The Internet paths observed were the links between Stanford Linear Accelerator Center (SLAC) and the following sites:

University of Toronto, Canada.
Deutsches Elektronen-Synchrotron, Germany.
Forschungszentrum Karlsruhe, Germany.
San Diego Supercomputing Center (SDSC) USA,
Oak Ridge National Laboratory (ORNL) USA,
European Organization for Nuclear Research (CERN) , Geneva, Switzerland,
Forschungszentrum Karlsruhe (FZK) Germany,
Deutsches Elektronen-Synchrotron (DESY) Germany,
.
San Diego Supercomputing Center, USA.
Switch, Switzerland.
University of Florida, USA.University of Toronto (UTORONTO) Canada and
National Laboratory for Particle and Nuclear Physics (TRIUMF) , Canada.

The topology of the monitoring framework is shown in figure 1.

Fig. 1: Topology of IEPM as of 07/2008
Image Removed

Data Sets

The data sets used in the study may be downloaded from the links listed below. These data sets were collected by the IEPM-BW project and the latest performance statistics may be accessed from here.

Table 1: Performance measurement statistics compiled by IEPM, as seen from SLAC.

	Raw data	Labeled data
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="628112f7-0aaa-4980-864e-5853ee6f9fa0"><ac:plain-text-body><![CDATA[	SDSC	[[csv	http://www.slac.stanford.edu/~kalim/event-detection/published-data/SDSC-pathchirp.csv]], [[xls	http://www.slac.stanford.edu/~kalim/event-detection/published-data/SDSC-pathchirp.xls]]	[[txt	http://www.slac.stanford.edu/~kalim/event-detection/published-data/UTORONTO-pathchirp-labeled-events.txt]]	]]></ac:plain-text-body></ac:structured-macro>
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="933d91f1-a944-4dac-bd77-d19b53cacb4b"><ac:plain-text-body><![CDATA[	CERN	[[csv	http://www.slac.stanford.edu/~kalim/event-detection/published-data/CERN-pathchirp.csv]], [[xls	http://www.slac.stanford.edu/~kalim/event-detection/published-data/CERN-pathchirp.xls]]	[[txt	http://www.slac.stanford.edu/~kalim/event-detection/published-data/CERN-pathchirp-labeled-events.txt]]	]]></ac:plain-text-body></ac:structured-macro>
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="cbb9d23d-4116-4c0f-b501-62b82e815914"><ac:plain-text-body><![CDATA[	FZK	[[csv	http://www.slac.stanford.edu/~kalim/event-detection/published-data/FZK-pathchirp.csv]], [[xls	http://www.slac.stanford.edu/~kalim/event-detection/published-data/FZK-pathchirp.xls]]	[[txt	http://www.slac.stanford.edu/~kalim/event-detection/published-data/FZK-pathchirp-labeled-events.txt]]	]]></ac:plain-text-body></ac:structured-macro>
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="87896d2e-45af-4d47-bc27-cdfcbe65a5eb"><ac:plain-text-body><![CDATA[	DESY	[[csv	http://www.slac.stanford.edu/~kalim/event-detection/published-data/DESY-pathchirp.csv]], [[xls	http://www.slac.stanford.edu/~kalim/event-detection/published-data/DESY-pathchirp.xls]]	[[txt	http://www.slac.stanford.edu/~kalim/event-detection/published-data/DESY-pathchirp-labeled-events.txt]]	]]></ac:plain-text-body></ac:structured-macro>
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="ea096257-3f6b-44f9-b32e-b1ffcbc631b5"><ac:plain-text-body><![CDATA[	UTORONTO	[[csv	http://www.slac.stanford.edu/~kalim/event-detection/published-data/UTORONTO-pathchirp.csv]], [[xls	http://www.slac.stanford.edu/~kalim/event-detection/published-data/UTORONTO-pathchirp.xls]]	[[txt	http://www.slac.stanford.edu/~kalim/event-detection/published-data/UTORONTO-pathchirp-labeled-events.txt]]	]]></ac:plain-text-body></ac:structured-macro>
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="252dacd4-f5ef-4fb1-874f-d404a0cf43ae"><ac:plain-text-body><![CDATA[	TRIUMF	[[csv	http://www.slac.stanford.edu/~kalim/event-detection/published-data/TRIUMF-pathchirp.csv]], [[xls	http://www.slac.stanford.edu/~kalim/event-detection/published-data/TRIUMF-pathchirp.xls]]	[[txt	http://www.slac.stanford.edu/~kalim/event-detection/published-data/TRIUMF-pathchirp-labeled-events.txt]]	]]></ac:plain-text-body></ac:structured-macro>
<ac:structured-macro ac:name="unmigrated-wiki-markup" ac:schema-version="1" ac:macro-id="d3b43d4b-9582-4942-bdf2-d0156d16a0fa"><ac:plain-text-body><![CDATA[	ORNL	[[csv	http://www.slac.stanford.edu/~kalim/event-detection/published-data/ORNL-pathchirp.csv]], [[xls	http://www.slac.stanford.edu/~kalim/event-detection/published-data/ORNL-pathchirp.xls]]	[txt]	]]></ac:plain-text-body></ac:structured-macro>

Wiki Markup

Download the complete data archive \[[zip|http://www.slac.stanford.edu/~kalim/event-detection/published-data/published-data.zip] 10 MB\] or \[[7z|http://www.slac.stanford.edu/~kalim/event-detection/published-data/published-data.7z] 8 MB\]

Labeling and Detection Algorithms

Wiki Markup

To perform a fair comparison between \[1\], \[2\], \[3\] and the proposed DTA we devised a labeling algorithm to identify true anomalies in the data sets. This labeled data was then used to determine the accuracy (true-positive rate), corresponding false-positive rate and the detection delay. The labeling algorithm and the decision theoretic approach for real-time anomaly detection are discussed at length in the research paper. (_The paper will be posted soon_.)
\\

Implementations and Parameter Tuning

The source code of the implementations and the tuning of parameters is discussed below. Please consider two important factors while we discuss the parameters tuning: a) We define an anomalous event as one which significantly varies from the normal behaviour of the Internet Path and the perturbation persists for at least three hours. b) The measurements made by IEPM-BW are at least 45 mins apart.

Fig. 2: ROC curves for Plateau (PL), Adaptive Fault Detection (AFD), Kalman Filters (KF) and the proposed Decision Theoretic Approach (DTA)
Image Removed

a) Plateau Algorithm

Wiki Markup

The [Plateau Algorithm|http://www.acm.org/sigs/sigcomm/sigcomm2004/workshop_papers/nts26-logg1.pdf] \[1\] takes four input parameters - history buffer length, trigger buffer length, sensitivity and threshold. We define the trigger buffer length to hold observations spanning at least 3 hours (since we define an anomalous event as one that persists for at least 3 hours). Since the measurements made by IEPM are 45 mins apart, we get about 6 observations. We then empirically determine the optimal length of the history buffer to span at least 5 days ~ 240 observations.

We initially define the sensitivity as one (1) and determine the optimal threshold value for each link (i.e. the value with the optimal ratio of true-positives to false positives). As suggested by the authors, this threshold falls between 30%-40%. Finally for the history buffer length, the trigger buffer length and the optimal threshold, we vary the sensitivity to determine the spectrum of the ratio of true-positive rate to false positive rate. Based on the observations we plot the ROC curves as shown in figure 2. One set of data points is shown in table 2.

Table 2: Results compiled by Plateau algorithm when analyzing the Internet path between SLAC and DESY.

H	TB	TH	S	TP	FP	#P	N~days	TPR	FPR
240	45	30%	2.3	0	25	31	672	0	0.037
240	40	30%	2.2	1	28	31	672	0.032	0.042
240	32	30%	2.1	1	32	31	672	0.032	0.048
240	28	30%	2.0	2	35	31	672	0.065	0.052
240	24	30%	1.9	2	41	31	672	0.065	0.061
240	19	30%	1.8	4	50	31	672	0.129	0.074
240	16	30%	1.7	6	59	31	672	0.194	0.088
240	13	30%	1.6	7	66	31	672	0.226	0.098
240	10	30%	1.5	11	74	31	672	0.355	0.11
240	9	30%	1.4	13	81	31	672	0.419	0.121
240	8	30%	1.3	14	97	31	672	0.452	0.144
240	7	30%	1.2	16	103	31	672	0.516	0.153
240	6	30%	1.1	20	114	31	672	0.645	0.17
240	6	30%	1.0	21	122	31	672	0.677	0.182

Legend:
H - History buffer length, TB - Trigger buffer length, TH - threshold, S - sensitivity, TP - true positives obtained, FP - false positives obtained, #P - true positives, N~days - days observed, TPR - true positive rate, FPR - false positive rate.

Wiki Markup

The source code of the implementation is available in [C#|http://www.slac.stanford.edu/~kalim/event-detection/published-src/c-plateau-impl.html] \[[cs|http://www.slac.stanford.edu/~kalim/event-detection/published-src/plateau.cs]\] and [perl|http://www.slac.stanford.edu/~kalim/event-detection/published-src/plateau.pl.html] \[[pl|http://www.slac.stanford.edu/~kalim/event-detection/published-src/plateau.pl]\]

b) Adaptive Fault Detection

Wiki Markup

The Adaptive Fault Detection \[3\] method employs four parameters - the desidred detection rate, the desired false positive rate, the history buffer length and the observed window length. We use the values as suggested by the author and for observed window length of 20, a history buffer length of 100 and the desired detection rate of 0.95 and we vary the desired false positive rate to obtain the spectrum of the ratio of true-positive rate to false positive rate. Based on the observations we plot the ROC curves as shown in figure 2. One set of data points is shown in table 3.

Table 3: Results compiled by Adaptive Fault Detection (ADF) method when analyzing the Internet path between SLAC and DESY.

N	HN	B	a	TP	FP	#P	N~days	TPR	FPR
20	100	0.95	0.0300	0	31	31	672	0	0.046
20	100	0.95	0.0200	2	90	31	672	0.065	0.134
20	100	0.95	0.0090	6	619	31	672	0.194	0.921
20	100	0.95	0.0080	9	796	31	672	0.29	1.185
20	100	0.95	0.0070	16	994	31	672	0.516	1.479
20	100	0.95	0.0050	23	1672	31	672	0.742	2.488
20	100	0.95	0.0040	23	2179	31	672	0.742	3.243
20	100	0.95	0.0030	29	2837	31	672	0.935	4.222
20	100	0.95	0.0023	30	3528	31	672	0.968	5.25
20	100	0.95	0.0022	31	3611	31	672	1	5.374

Legend:
HN - History buffer length, a - desired false positive rate, B - desired detection rate, N - observed window length, TP - true positives obtained, FP - false positives obtained, #P - true positives, N~days - days observed, TPR - true positive rate, FPR - false positive rate.

Wiki Markup

The source code of the implementation is available in [C#|http://www.slac.stanford.edu/~kalim/event-detection/published-src/c-afd-impl.html] \[[cs|http://www.slac.stanford.edu/~kalim/event-detection/published-src/afd.cs]\].

c) Kalman Filters

Wiki Markup

The Kalman Filter \[2\] method employs two input parameters - the observed window length and the Kalman gain. We use the values as suggested by the author and for the observed window length and vary the kalman gain to obtain the spectrum of the ratio of true-positive rate to false positive rate. Based on the observations we plot the ROC curves as shown in figure 2. One set of data points is shown in table 4.

Table 4: Results compiled by Kalman Filters (KF) method when analyzing the Internet path between SLAC and DESY.

W	K	TP	FP	#P	N~days	TPR	FPR
3	0.800	1	309	31	672	0.032	0.46
6	0.500	2	333	31	672	0.065	0.496
6	0.100	3	366	31	672	0.097	0.545
6	0.005	3	371	31	672	0.097	0.552
6	0.001	3	371	31	672	0.097	0.552

Legend:
W- observed window length, K - kalman gain, TP - true positives obtained, FP - false positives obtained, #P - true positives, N~days - days observed, TPR - true positive rate, FPR - false positive rate.

Wiki Markup

The source code of the implementation is available in [C#|http://www.slac.stanford.edu/~kalim/event-detection/published-src/c-kf-impl.html] \[[cs|http://www.slac.stanford.edu/~kalim/event-detection/published-src/kf.cs]\]. (not up to date, will be revised soon)

d) Decision Theoretic Approach

Decision Theoretic Approach employs four parameters - the desired false positive rate (alpha), the desired detection rate (beta), the history buffer length and the low-pass median filter length. We apply the median filter of length 7 to remove the high frequency component. We then determine the optimum history buffer length by empirical observation. This turns out to be 90. Opting for the maximum detection rate of 0.99, we vary the desired false positive rate to obtain the entire spectrum of the ratio of true-positive rate to false positive rate. Based on the observations we plot the ROC curves as shown in figure 2. One set of data points is shown in table 5.

This algorithm use three parameters one is threshold which is calculated in terms of Alpha and Beta, and other two are buffer lengths (History buffer and n-Tap Median filter). Same scheme is used for this algorithm, such that minimum value of n = 7 is used for n-Tap Median filter with some appropriate value of History buffer. By keeping these values same and changing different values of Alpha and Beta (Threshold) for each dataset ROCs are generated.

Table 5: Results compiled by the proposed DTA when analyzing the Internet path between SLAC and DESY.

H	a	B	n-tap	TP	FP	#P	N~days	TPR	FPR
90	0.01100	0.99	7	0	9	31	672	0	0.013
90	0.01150	0.99	7	4	9	31	672	0.129	0.013
90	0.01200	0.99	7	4	12	31	672	0.129	0.018
90	0.01400	0.99	7	13	26	31	672	0.419	0.039
90	0.01600	0.99	7	17	31	31	672	0.548	0.046
90	0.01800	0.99	7	18	45	31	672	0.581	0.067
90	0.02000	0.99	7	22	69	31	672	0.71	0.103
90	0.02100	0.99	7	25	84	31	672	0.806	0.125
90	0.02160	0.99	7	29	88	31	672	0.935	0.131
90	0.02163	0.99	7	29	91	31	672	0.935	0.135
90	0.02164	0.99	7	30	91	31	672	0.968	0.135

Legend:
H - History buffer length, a - desired false positive rate, B - desired detection rate, n-tap - median filter length, TP - true positives obtained, FP - false positives obtained, #P - true positives, N~days - days observed, TPR - true positive rate, FPR - false positive rate.

Wiki Markup

The source code of the implementation is available in [C#|http://www.slac.stanford.edu/~kalim/event-detection/published-src/c-dta-impl.html] \[[cs|http://www.slac.stanford.edu/~kalim/event-detection/published-src/dta.cs]\] and [perl|http://www.slac.stanford.edu/~kalim/event-detection/published-src/dta.pl.html] \[[pl|http://www.slac.stanford.edu/~kalim/event-detection/published-src/dta.pl]\] ([utilityFunctions|http://www.slac.stanford.edu/~kalim/event-detection/published-src/utilityFunctions.pm.html] \[[pm|http://www.slac.stanford.edu/~kalim/event-detection/published-src/utilityFunctions.pm]\], [conf|http://www.slac.stanford.edu/~kalim/event-detection/published-src/dta.conf]).

References

...

Oak Ridge National Laboratory, USA.
Budker Institute of Nuclear Physics, Russia.
Daresbury Laboratory, United Kingdom.
California Institute of Technology - CACR, USA.
Istituto Nazionale di Fisica Nucleare, Italy.
Czech NREN Operator, Czech Republic.
Brookhaven National Laboratory, USA.
Argonne National Laboratory, USA.
California Institute of Technology - Ultralight, USA.

The topology of the monitoring framework is shown in figure 1.

Fig. 1: Topology of IEPM as of 07/2008	Fig. 2: Deployment of Selected Sites
Image Added	Image Added

The number of measurements made to the following sites from SLAC:

Site	pathchirp	iperf	thrulay
cern.ch	48647	24586	39510
desy.de	32247	4522	28689
fzk.de	65536	4874	42708
nslabs.ufl.edu	41206	1549	28613
switch.edu	19668	4638	28744
sdsc.edu	21176	4416	22456
triumf.ca	26425	4669	27021
utoronto.ca	40614	5003	21646
ornl.gov	35339	5182	18375
anl.gov	17968	1	27559
bnl.org	23580	20708	16000
cacr.caltech.edu	61871	25525	37293
dl.ac.uk	27806	6096	28058
nsk.su	20117	1	26845
cesnet.cz	23618	3062	28426
infn.it	30372	4343	28573
ultralight.caltech	3739	88	1534
SubTotal	539929	119263	452050

Data Sets

The data sets used in the study may be downloaded from the links listed below. These data sets were collected by the IEPM-BW project

Table 1: Performance measurement statistics compiled by IEPM, as seen from SLAC.

	Data Sets with Events	Data Sets with no Events
IEPM	[rar] 3.4 MB, [zip] 3.6 MB	[rar] 3.3 MB, [zip] 3.5 MB

All files with name "filename_raw_dataset.pathchirp" contain the raw data i.e the available bandwidth measurements along with the timestamps which are used in all algorithms.
All files with name "filename_event_file.txt" contain the list of events identified.

Technical Report - Labeling and Comparative Analysis

The technical report titled "A performance evaluation of anomaly detection algorithms for Internet Paths" will be available soon.

Input/Tuning parameters

Plateau Algorithm (PL)

History Buffer Length (H)	Trigger Buffer Length (T)	Threshold (th)	Sensitivity (s)
240	6 - 45	0.10 - 0.70	1.0 - 2.8

Kalman Filters Method (KF)

Sensitivity (K)	Time Window (h)
0.001 - 11.0	6 - 20

Holt Winter's Method (HW)

? - alpha	? - beta	? - gamma	? - sigma
0.1	0.1 - 0.3	0.1 - 0.5	2.0

Adaptive Fault Detector (AFD)

Window Size (N)	? - alpha	? - beta	No. of Training Data (Hn)
20	0.95	0.0015 - 0.1	100

Decision Theoretic Approach (DTA)

History Buffer Length (N)	? - alpha	? - beta	Median filter length ( n)
30 - 90	0.01 - 0.34	0.99	100

ROC Results

Datasets with Gaussian Distributions

CERN	FZK	SDSC
Image Added	Image Added	Image Added

TRIMUF	UTORONTO
Image Added	Image Added

Datasets with Weibull Distributions

DESY	NSLABS	SWITCH
Image Added	Image Added	Image Added

Beginning November 4, 2024, login to Confluence and Jira will change. Read more.

Child pages

Versions Compared

Old Version 18

New Version Current

Key

Overview

Data Sets

Labeling and Detection Algorithms

Implementations and Parameter Tuning

a) Plateau Algorithm

b) Adaptive Fault Detection

c) Kalman Filters

d) Decision Theoretic Approach

References

Data Sets

Technical Report - Labeling and Comparative Analysis

Input/Tuning parameters

Plateau Algorithm (PL)

Kalman Filters Method (KF)

Holt Winter's Method (HW)

Adaptive Fault Detector (AFD)

Decision Theoretic Approach (DTA)

ROC Results

Datasets with Gaussian Distributions

Datasets with Weibull Distributions