Walter Scott, Jr. College of Engineering

Graduate Exam Abstract

Vidarshana W. Bandara
Ph.D. Preliminary
Mar 23, 2012, 09:00am
Engineering B3
Characterizing and Detecting Baselines and Anomalies of Network Data
Abstract: A framework for network data decomposition,
modeling, analysis and synthesis is proposed. The
framework views network data as composed of three
components: baseline, anomalies and a residual.
The baseline component represents the behavior
trends under nominal operational conditions,
anomalies are deviations of interest, and the
residual component accommodates traffic
variations, such as noise, that are captured by
neither the baseline nor the anomalies. Each of
these components represents a key aspect of
network behavior, and such a decomposition is of
interest to a wide range of network applications,
including analysis, design, control and forensics,
as well as to many applications beyond the network
domain. However, the lack of a formal, universally
agreed definition for any of the components is a
major challenge.
Nevertheless, a large body of research addresses
extracting, analyzing and regenerating network
traffic components. These works employ a wide
range of mathematical tools that provide
decompositions and alternative representations of
data, but a decomposition that readily separates a
baseline, anomalies and a residual is lacking.
A single decomposition tool extracting all the
components would reveal their individual features
as well as relationships among these features.
The goal of this work is to develop and refine
mathematical tools for network data
decomposition and to demonstrate their utility in
applications such as traffic modeling, compressive
representation of network traffic, anomaly
detection, and network monitoring. The proposed
analysis framework builds a comprehensive
characterization of each component. Despite the
lack of a formal definition for the data
components, features of baselines and anomalies
can be captured by mathematical properties. For
example, the common features of a dataset
containing multiple traffic traces from different
network links form a low-rank component of the
dataset. Such characterizations facilitate, among
other things, re-synthesis of realistic traffic.
Based on the
nature of these characteristics, representations
and models for traffic components may be
developed. The obtained characterizations further
enable procedures for extraction and efficient
storage of traffic traces. They also facilitate
extensions such as novel monitoring systems. By
observing relationships among all traffic elements,
a complete analysis and synthesis framework is
realized. Such a complete framework will provide
traffic characterizations demanded by
applications, along with relationships between
components, beyond what is currently feasible.
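One way to write the decomposition described above
is sketched below. The notation is assumed here
purely for illustration: the symbols X, B, A, R, g
and p do not appear in the original abstract, and
the rank-one case is only the simplest instance of
a low-rank baseline.

```latex
% Assumed notation: X is an m-by-n data matrix (m links, n time samples).
X = B + A + R, \qquad \operatorname{rank}(B) = r \ll \min(m, n)

% Simplest case: every link i carries a scaled copy g_i of one shared
% traffic profile p(t), so the baseline is a rank-one outer product.
B_{i,t} = g_i \, p(t)
\;\Longrightarrow\;
B = g \, p^{\mathsf{T}}, \qquad \operatorname{rank}(B) = 1
```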
A host of mathematical tools is available for
decomposing data based on mathematically
identifiable features. For example, Fourier
analysis transforms data into its frequency
spectrum, wavelet decomposition represents data as
a set of scaled and shifted versions of a mother
wavelet, Principal Component Analysis (PCA)
separates data into orthogonal components of
decreasing variance, and Robust PCA (RPCA)
decomposes data into a low-rank component and a
sparse component. Though these decompositions do
not necessarily yield the baseline, anomalies and
residual directly, such mathematical tools can be
tailored into suitable extractors once the
components are properly characterized. Results
presented show how
different characteristics are related to and
extracted by different mathematical tools.
Frequency components with large amplitudes and the
low-rank component of RPCA capture prominent
common behaviors in a dataset, which are typically
the characteristics of a baseline. Scores based on
principal components, the thresholded sparse
component, and large deviations in certain
frequency bands capture sudden significant changes
in data, which are useful in detecting anomalies.
Sparsity-inducing norms permit imposing
constraints based on patterns. A set of tools
based on sparsity-inducing norms is proposed for
datasets with known anomaly structures.
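As a minimal sketch of the first idea above
(large-amplitude frequency components as a
baseline), the following plain-Python example
rebuilds a trace from its strongest frequency bins
and flags samples whose residual is unusually
large. It is not the method of this work: the
function names, the synthetic trace, the choice of
two retained bins, and the median-absolute-
deviation threshold are all assumptions made for
illustration.

```python
import cmath
import math
import statistics


def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for short traces)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def fourier_baseline(x, keep=2):
    """Rebuild x from its `keep` largest-amplitude frequency bins."""
    n = len(x)
    spectrum = dft(x)
    # Rank the non-redundant half of the spectrum by amplitude.
    top = sorted(range(n // 2 + 1), key=lambda k: abs(spectrum[k]),
                 reverse=True)[:keep]
    kept = [0j] * n
    for k in top:
        kept[k] = spectrum[k]
        mirror = (n - k) % n  # conjugate bin, kept so the output stays real
        kept[mirror] = spectrum[mirror]
    return [value.real for value in idft(kept)]


def flag_anomalies(x, baseline, n_sigmas=3.0):
    """Indices whose residual is far from the median residual (MAD rule)."""
    residual = [xi - bi for xi, bi in zip(x, baseline)]
    center = statistics.median(residual)
    mad = statistics.median(abs(r - center) for r in residual)
    scale = 1.4826 * mad  # MAD rescaled to a standard-deviation equivalent
    return [t for t, r in enumerate(residual)
            if abs(r - center) > n_sigmas * scale]


# Synthetic trace (assumed): mean level plus a 24-sample cycle, one surge.
n = 96
x = [10.0 + 5.0 * math.sin(2 * math.pi * t / 24) for t in range(n)]
x[40] += 20.0

baseline = fourier_baseline(x, keep=2)
anomalies = flag_anomalies(x, baseline)
print(anomalies)
```

On this synthetic trace the retained bins are the
mean level and the 24-sample cycle, so the surge
at sample 40 is left in the residual and is the
only index flagged. Real traffic would need the
number of retained bins and the threshold tuned to
the data.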
Real datasets and measurements from sources such
as Internet2, PlanetLab, DARPA, CAIDA, and
Lobster-sensors are used to evaluate the developed
techniques. For example, baseline and anomaly
behaviors of Internet2 are modeled and
characterized. Methods based on PCA and RPCA are
used to filter baseline components. A subspace
representation of the baseline within the vector
space representation of the data is under
investigation. Here, the baseline component is
expected to form a concentration ellipsoid in a
lower-dimensional subspace. Projection methods for
baseline extraction and distance- and angle-based
techniques for anomaly detection are proposed. A
modularized
approach for anomaly behavioral modeling is also
proposed. Here, Fourier analysis is used to
extract anomalies, and wavelet coefficients are
used in summarizing network activities. Based on
summaries such as correlations and wavelet
coefficients, a new breed of monitoring systems is
developed that is informed of network-wide anomaly
activity rather than limited to local
measurements.
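The projection and distance-based ideas mentioned
above can be sketched as follows. This is an
illustration under stated assumptions, not the
procedure of this work: the baseline subspace is
taken to be one-dimensional, it is estimated by
power iteration on nominal samples, and the link
gains, the small jitter term, the injected surge,
and the fixed threshold are all hypothetical. For
brevity the scored window reuses the nominal
samples with one surge added. Only the distance
test is shown; an angle-based test is not.

```python
import math


def center(rows):
    """Subtract the per-link mean; return centered rows and the mean vector."""
    d = len(rows[0])
    mean = [sum(row[j] for row in rows) / len(rows) for j in range(d)]
    return [[row[j] - mean[j] for j in range(d)] for row in rows], mean


def leading_direction(rows, iterations=50):
    """First principal direction of centered rows, by power iteration."""
    d = len(rows[0])
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        # One multiplication by X^T X, done as X^T (X v).
        scores = [sum(a * b for a, b in zip(row, v)) for row in rows]
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def distance_from_subspace(sample, mean, v):
    """Euclidean distance from a sample to the line mean + span(v)."""
    centered = [s - m for s, m in zip(sample, mean)]
    score = sum(c * vi for c, vi in zip(centered, v))
    return math.sqrt(sum((c - score * vi) ** 2 for c, vi in zip(centered, v)))


# Nominal history (assumed): three links share one daily profile with
# different gains, plus a small deterministic jitter per link.
gains = [1.0, 2.0, 0.5]
history = [[g * (10.0 + 5.0 * math.sin(2 * math.pi * t / 24))
            + 0.05 * math.sin(1.7 * t + j)
            for j, g in enumerate(gains)]
           for t in range(48)]

centered_history, mean = center(history)
v = leading_direction(centered_history)

# Scored window: the same samples with a surge injected on link 0 at t = 30.
observed = [row[:] for row in history]
observed[30][0] += 8.0

THRESHOLD = 1.0  # hypothetical fixed distance threshold
distances = [distance_from_subspace(s, mean, v) for s in observed]
flagged = [t for t, dist in enumerate(distances) if dist > THRESHOLD]
print(flagged)
```

Samples that follow the shared profile stay close
to the fitted line, while the surge moves sample
30 off it, so its distance exceeds the threshold.
A higher-dimensional baseline subspace would need
more than one principal direction.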
Component-wise data analysis provides insight into
various aspects of data, and provides a formal
path for modeling, compression and storage. A
complete suite of methods, tools and
characterizations is brought together in this
work to effectively investigate, analyze,
characterize and model network traffic data. Each
constituent component is comprehensively described
and related to the other components. This work
makes it possible to decompose data into
components of interest and to re-synthesize
realistic traces with specific properties.
Adviser: Prof. Anura P. Jayasumana
Co-Adviser: Prof. Ali Pezeshki
Non-ECE Member: Prof. Indrajit Ray, Computer Science
Member 3: Prof. J. Rockey Luo, Electrical and Computer Engineering
Additional Members: N/A
Vidarshana Bandara, Ali Pezeshki, and Anura P. Jayasumana, "Modeling Spatial and Temporal Behavior of Internet Traffic Anomalies," IEEE 35th Conference on Local Computer Networks (LCN), pp. 384-391, 2010.
V. Bandara and A. P. Jayasumana, "Extracting Baseline Patterns in Internet Traffic Using Robust Principal Components," Proc. 36th Annual IEEE Conference on Local Computer Networks (LCN), Bonn, Germany, Oct. 2011.
V. Bandara, A. P. Jayasumana, A. Pezeshki, T. H. Illangasekare, and K. Barnhardt, "Subsurface Plume Tracking Using Sparse Wireless Sensor Networks," Electronic Journal of Structural Engineering (EJSE) - Special Issue: Wireless Sensor Networks and Practical Applications, Dec. 2010.
Thoshitha Gamage, Jayantha Herath, Arjun Roy, and Vidarshana Bandara, "Performance Comparison of Recent Random Number Generators," Journal of Global Information Technology, vol. 3, nos. 1 and 2, pp. 1-14, 2009.
V. W. Bandara, A. C. Vidanapathirana, and S. G. Abeyratne, "Contouring with DC Motors - a Practical Experience," Proc. First International Conference on Industrial and Information Systems, Aug. 2006, pp. 474-479.
Asiri Nanayakkara and Vidarshana Bandara, "Asymptotic Behavior of the Eigenenergies of Anharmonic Oscillators V(x) = x^{2N} + bx^2," Canadian Journal of Physics/Revue Canadienne de Physique, vol. 80, issue 9, Sep. 2002, p. 959.
Characterizing spatial - temporal features of Internet traffic anomalies (in preparation)
Generalized Bounds for Compressive Sensing Based Recovery (in preparation)
Program of Study: