Statistical detection of intruders within computer networks using scan statistics

Joshua Neil, Curtis Storlie, Curtis Hash, Alex Brugh

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

8 Citations (Scopus)

Abstract

We introduce a computationally scalable method for detecting small anomalous subgraphs in large, time-dependent graphs. This work is motivated by, and validated against, the challenge of identifying intruders operating inside enterprise-sized computer networks with 500 million communication events per day. Every observed edge (time series of communications between each pair of computers on the network) is modeled using observed and hidden Markov models to establish baselines of behavior for purposes of anomaly detection. These models capture the bursty, often human-caused, behavior that dominates a large subset of the edges. Individual edge anomalies are common, but the network intrusions we seek to identify always involve coincident anomalies on multiple adjacent edges. We show empirically that adjacent edges are primarily independent and that the likelihood of a subgraph of multiple coincident edges can be evaluated using only models of individual edges. We define a new scan statistic in which subgraphs of specific sizes and shapes (out-stars and 3-paths) are tested. We show that identifying these building-block shapes is sufficient to correctly identify anomalies of various shapes with acceptable false discovery rates in both simulated and real-world examples.
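The shape-based scan described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `scan_shapes`, the `edge_scores` input, and the choice to take an out-star as a node together with all of its outgoing edges are assumptions made for illustration. Each edge is assumed to already carry an anomaly score for one time window (for example, a negative log-likelihood under that edge's own model). Because adjacent edges are treated as independent, the score of a subgraph is simply the sum of its edge scores.

```python
from collections import defaultdict


def scan_shapes(edge_scores):
    """Score out-stars and 3-paths in one time window.

    edge_scores: dict mapping a directed edge (src, dst) to a per-edge
    anomaly score (hypothetical input; higher means more anomalous).
    Returns a list of (shape, edges, score), most anomalous first.
    """
    # Adjacency list of outgoing neighbours for each source node.
    out = defaultdict(list)
    for (u, v) in edge_scores:
        out[u].append(v)

    results = []

    # Out-stars: a centre node together with all of its outgoing edges.
    for u, nbrs in out.items():
        edges = tuple((u, v) for v in nbrs)
        score = sum(edge_scores[e] for e in edges)
        results.append(("out-star", edges, score))

    # 3-paths: a -> b -> c -> d over four distinct nodes.
    for (a, b) in edge_scores:
        for c in out.get(b, []):
            for d in out.get(c, []):
                if len({a, b, c, d}) < 4:
                    continue
                edges = ((a, b), (b, c), (c, d))
                # Independence assumption: subgraph score = sum of edge scores.
                score = sum(edge_scores[e] for e in edges)
                results.append(("3-path", edges, score))

    return sorted(results, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    # Toy window with made-up scores, for illustration only.
    window = {("A", "B"): 1.0, ("B", "C"): 2.0, ("C", "D"): 3.0, ("A", "E"): 0.5}
    for shape, edges, score in scan_shapes(window):
        print(shape, edges, score)
```

In practice the highest-scoring shapes would be compared against a threshold calibrated from historical windows; that calibration step is outside the scope of this sketch.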

Original language: English (US)
Title of host publication: Data Analysis for Network Cyber-Security
Publisher: Imperial College Press
Pages: 71-104
Number of pages: 34
ISBN (Electronic): 9781783263752
ISBN (Print): 9781783263745
DOI: https://doi.org/10.1142/9781783263752_0003
State: Published - Jan 1 2014
Externally published: Yes

ASJC Scopus subject areas

  • Computer Science(all)
  • Mathematics(all)

Cite this

Neil, J., Storlie, C., Hash, C., & Brugh, A. (2014). Statistical detection of intruders within computer networks using scan statistics. In Data Analysis for Network Cyber-Security (pp. 71-104). Imperial College Press. https://doi.org/10.1142/9781783263752_0003

@inbook{30f6ed1b6fac452683aed19e673cc722,
title = "Statistical detection of intruders within computer networks using scan statistics",
author = "Joshua Neil and Curtis Storlie and Curtis Hash and Alex Brugh",
year = "2014",
month = "1",
day = "1",
doi = "10.1142/9781783263752_0003",
language = "English (US)",
isbn = "9781783263745",
pages = "71--104",
booktitle = "Data Analysis for Network Cyber-Security",
publisher = "Imperial College Press",
address = "United Kingdom",

}

TY - CHAP

T1 - Statistical detection of intruders within computer networks using scan statistics

AU - Neil, Joshua

AU - Storlie, Curtis

AU - Hash, Curtis

AU - Brugh, Alex

PY - 2014/1/1

Y1 - 2014/1/1

UR - http://www.scopus.com/inward/record.url?scp=84920261372&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84920261372&partnerID=8YFLogxK

U2 - 10.1142/9781783263752_0003

DO - 10.1142/9781783263752_0003

M3 - Chapter

AN - SCOPUS:84920261372

SN - 9781783263745

SP - 71

EP - 104

BT - Data Analysis for Network Cyber-Security

PB - Imperial College Press

ER -