Statistical detection of intruders within computer networks using scan statistics

Joshua Neil; Curtis Storlie; Curtis Hash; Alex Brugh

doi:10.1142/9781783263752_0003

Quantitative Health Sciences

Research output: Chapter in Book/Report/Conference proceeding › Chapter

8 Scopus citations

Abstract

We introduce a computationally scalable method for detecting small anomalous subgraphs in large, time-dependent graphs. This work is motivated by, and validated against, the challenge of identifying intruders operating inside enterprise-sized computer networks with 500 million communication events per day. Every observed edge (time series of communications between each pair of computers on the network) is modeled using observed and hidden Markov models to establish baselines of behavior for purposes of anomaly detection. These models capture the bursty, often human-caused, behavior that dominates a large subset of the edges. Individual edge anomalies are common, but the network intrusions we seek to identify always involve coincident anomalies on multiple adjacent edges. We show empirically that adjacent edges are primarily independent and that the likelihood of a subgraph of multiple coincident edges can be evaluated using only models of individual edges. We define a new scan statistic in which subgraphs of specific sizes and shapes (out-stars and 3-paths) are tested. We show that identifying these building-block shapes is sufficient to correctly identify anomalies of various shapes with acceptable false discovery rates in both simulated and real-world examples.
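The independence claim in the abstract implies that a subgraph's anomaly score can be built by summing per-edge scores. The following is a minimal sketch of that idea, not the authors' implementation: it swaps in a simple Poisson count model per edge in place of the observed and hidden Markov models, scores each edge with a one-sided log-likelihood ratio, and scans all out-stars and directed 3-paths in a single time window. The function names, the treatment of an out-star as a node with all of its outgoing edges, and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def poisson_logpmf(k, lam):
    """Log-probability of count k under a Poisson(lam) model, lam > 0."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def edge_score(count, baseline_rate):
    """One-sided log-likelihood ratio for one edge in one time window.

    Compares the baseline rate with the best-fitting rate that is at least
    as large, so only unusually HIGH activity scores above zero.
    (Assumption: Poisson stand-in for the chapter's Markov edge models.)
    """
    fitted = max(count, baseline_rate)
    return poisson_logpmf(count, fitted) - poisson_logpmf(count, baseline_rate)


def scan(counts, baselines):
    """Return (score, shape, nodes) for the highest-scoring out-star or 3-path.

    Edge scores are summed, which relies on treating edges as independent.
    """
    scores = {e: edge_score(counts.get(e, 0), lam) for e, lam in baselines.items()}
    out = defaultdict(list)
    for u, v in scores:
        out[u].append(v)

    best = (0.0, None, None)
    # Out-stars: a source node together with all of its outgoing edges.
    for u, targets in out.items():
        s = sum(scores[(u, v)] for v in targets)
        if s > best[0]:
            best = (s, "out-star", (u, *sorted(targets)))
    # 3-paths: directed paths a -> b -> c -> d over four distinct nodes.
    for a, b in scores:
        if a == b:
            continue
        for c in out.get(b, ()):
            if c in (a, b):
                continue
            for d in out.get(c, ()):
                if d in (a, b, c):
                    continue
                s = scores[(a, b)] + scores[(b, c)] + scores[(c, d)]
                if s > best[0]:
                    best = (s, "3-path", (a, b, c, d))
    return best


# Hypothetical toy window: baseline rate of 1 event per edge, with a burst
# of activity along a -> b -> c -> d.
baselines = {("a", "b"): 1.0, ("b", "c"): 1.0, ("c", "d"): 1.0,
             ("a", "x"): 1.0, ("x", "y"): 1.0}
counts = {("a", "b"): 10, ("b", "c"): 10, ("c", "d"): 10,
          ("a", "x"): 1, ("x", "y"): 1}
print(scan(counts, baselines))
```

In practice the maximum score would be compared against a threshold calibrated on historical windows to control the false discovery rate; that calibration step is omitted here.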

Original language	English (US)
Title of host publication	Data Analysis for Network Cyber-Security
Publisher	Imperial College Press
Pages	71-104
Number of pages	34
ISBN (Electronic)	9781783263752
ISBN (Print)	9781783263745
DOIs	https://doi.org/10.1142/9781783263752_0003
State	Published - Jan 1 2014

ASJC Scopus subject areas

General Computer Science
General Mathematics

Cite this

@inbook{30f6ed1b6fac452683aed19e673cc722,
  title = "Statistical detection of intruders within computer networks using scan statistics",
  abstract = "We introduce a computationally scalable method for detecting small anomalous subgraphs in large, time-dependent graphs. This work is motivated by, and validated against, the challenge of identifying intruders operating inside enterprise-sized computer networks with 500 million communication events per day. Every observed edge (time series of communications between each pair of computers on the network) is modeled using observed and hidden Markov models to establish baselines of behavior for purposes of anomaly detection. These models capture the bursty, often human-caused, behavior that dominates a large subset of the edges. Individual edge anomalies are common, but the network intrusions we seek to identify always involve coincident anomalies on multiple adjacent edges. We show empirically that adjacent edges are primarily independent and that the likelihood of a subgraph of multiple coincident edges can be evaluated using only models of individual edges. We define a new scan statistic in which subgraphs of specific sizes and shapes (out-stars and 3-paths) are tested. We show that identifying these building-block shapes is sufficient to correctly identify anomalies of various shapes with acceptable false discovery rates in both simulated and real-world examples.",
  author = "Joshua Neil and Curtis Storlie and Curtis Hash and Alex Brugh",
  note = "Publisher Copyright: {\textcopyright} 2014 by Imperial College Press.",
  year = "2014",
  month = jan,
  day = "1",
  doi = "10.1142/9781783263752_0003",
  language = "English (US)",
  isbn = "9781783263745",
  pages = "71--104",
  booktitle = "Data Analysis for Network Cyber-Security",
  publisher = "Imperial College Press",
  address = "United Kingdom",
}

TY  - CHAP
T1  - Statistical detection of intruders within computer networks using scan statistics
AU  - Neil, Joshua
AU  - Storlie, Curtis
AU  - Hash, Curtis
AU  - Brugh, Alex
PY  - 2014/1/1
Y1  - 2014/1/1
N2  - We introduce a computationally scalable method for detecting small anomalous subgraphs in large, time-dependent graphs. This work is motivated by, and validated against, the challenge of identifying intruders operating inside enterprise-sized computer networks with 500 million communication events per day. Every observed edge (time series of communications between each pair of computers on the network) is modeled using observed and hidden Markov models to establish baselines of behavior for purposes of anomaly detection. These models capture the bursty, often human-caused, behavior that dominates a large subset of the edges. Individual edge anomalies are common, but the network intrusions we seek to identify always involve coincident anomalies on multiple adjacent edges. We show empirically that adjacent edges are primarily independent and that the likelihood of a subgraph of multiple coincident edges can be evaluated using only models of individual edges. We define a new scan statistic in which subgraphs of specific sizes and shapes (out-stars and 3-paths) are tested. We show that identifying these building-block shapes is sufficient to correctly identify anomalies of various shapes with acceptable false discovery rates in both simulated and real-world examples.
AB  - We introduce a computationally scalable method for detecting small anomalous subgraphs in large, time-dependent graphs. This work is motivated by, and validated against, the challenge of identifying intruders operating inside enterprise-sized computer networks with 500 million communication events per day. Every observed edge (time series of communications between each pair of computers on the network) is modeled using observed and hidden Markov models to establish baselines of behavior for purposes of anomaly detection. These models capture the bursty, often human-caused, behavior that dominates a large subset of the edges. Individual edge anomalies are common, but the network intrusions we seek to identify always involve coincident anomalies on multiple adjacent edges. We show empirically that adjacent edges are primarily independent and that the likelihood of a subgraph of multiple coincident edges can be evaluated using only models of individual edges. We define a new scan statistic in which subgraphs of specific sizes and shapes (out-stars and 3-paths) are tested. We show that identifying these building-block shapes is sufficient to correctly identify anomalies of various shapes with acceptable false discovery rates in both simulated and real-world examples.
UR  - http://www.scopus.com/inward/record.url?scp=84920261372&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84920261372&partnerID=8YFLogxK
U2  - 10.1142/9781783263752_0003
DO  - 10.1142/9781783263752_0003
M3  - Chapter
AN  - SCOPUS:84920261372
SN  - 9781783263745
SP  - 71
EP  - 104
BT  - Data Analysis for Network Cyber-Security
PB  - Imperial College Press
ER  -
