Scan statistics for the online detection of locally anomalous subgraphs

Joshua Neil; Curtis Hash; Alexander Brugh; Mike Fisk; Curtis B. Storlie

doi:10.1080/00401706.2013.822830

Quantitative Health Sciences

Research output: Contribution to journal › Article › peer-review

75 Scopus citations

Abstract

We introduce a computationally scalable method for detecting small anomalous areas in a large, time-dependent computer network, motivated by the challenge of identifying intruders operating inside enterprise-sized computer networks. Time-series of communications between computers are used to detect anomalies, and are modeled using Markov models that capture the bursty, often human-caused behavior that dominates a large subset of the time-series. Anomalies in these time-series are common, and the network intrusions we seek involve coincident anomalies over multiple connected pairs of computers. We show empirically that each time-series is nearly always independent of the time-series of other pairs of communicating computers. This independence is used to build models of normal activity in local areas from the models of the individual time-series, and these local areas are designed to detect the types of intrusions we are interested in. We define a locality statistic calculated by testing for deviations from historic behavior in each local area, and then define a scan statistic as the maximum deviation score over all local areas. We show that identifying these local anomalies is sufficient to correctly identify anomalies of various relevant shapes in the network. Supplementary material, including additional details and simulation code, is provided online.
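
The scan statistic described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes each directed edge (a communicating pair of computers) already carries a nonnegative deviation score, that the local areas are out-stars (all edges leaving one node), and that a local area's locality statistic is the sum of its edge scores, which leans on the edge independence the abstract reports. The function names and the toy scores are hypothetical.

```python
from collections import defaultdict


def out_stars(edges):
    """Group directed edges by source node; each group is one star-shaped local area."""
    stars = defaultdict(list)
    for src, dst in edges:
        stars[src].append((src, dst))
    return dict(stars)


def locality_statistic(area, edge_scores):
    """Sum the per-edge deviation scores (assumption: edges are independent)."""
    return sum(edge_scores[edge] for edge in area)


def scan_statistic(areas, edge_scores):
    """Return the maximum locality statistic and the local area that attains it."""
    best = max(areas, key=lambda name: locality_statistic(areas[name], edge_scores))
    return locality_statistic(areas[best], edge_scores), best


# Hypothetical per-edge deviation scores for one time window.
edge_scores = {
    ("a", "b"): 0.2,
    ("a", "c"): 0.1,
    ("b", "c"): 3.0,
    ("b", "d"): 2.5,
    ("c", "d"): 0.3,
}

areas = out_stars(edge_scores)
score, center = scan_statistic(areas, edge_scores)
print(center, score)
```

In this toy window the star centered on node b has the largest summed score, so it would be the area flagged for review. Swapping `out_stars` for a path enumerator would scan the path-shaped areas named in the keywords.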

Original language	English (US)
Pages (from-to)	403-414
Number of pages	12
Journal	Technometrics
Volume	55
Issue number	4
DOIs	https://doi.org/10.1080/00401706.2013.822830
State	Published - Nov 1 2013

Keywords

Anomaly detection
Dynamic graph
Network intrusion detection
Path
Star

ASJC Scopus subject areas

Statistics and Probability
Modeling and Simulation
Applied Mathematics

Cite this

@article{95876c0f62ba40fab7a7d2ffb1c7bddf,
  title = "Scan statistics for the online detection of locally anomalous subgraphs",
  abstract = "We introduce a computationally scalable method for detecting small anomalous areas in a large, time-dependent computer network, motivated by the challenge of identifying intruders operating inside enterprise-sized computer networks. Time-series of communications between computers are used to detect anomalies, and are modeled using Markov models that capture the bursty, often human-caused behavior that dominates a large subset of the time-series. Anomalies in these time-series are common, and the network intrusions we seek involve coincident anomalies over multiple connected pairs of computers. We show empirically that each time-series is nearly always independent of the time-series of other pairs of communicating computers. This independence is used to build models of normal activity in local areas from the models of the individual time-series, and these local areas are designed to detect the types of intrusions we are interested in. We define a locality statistic calculated by testing for deviations from historic behavior in each local area, and then define a scan statistic as the maximum deviation score over all local areas. We show that identifying these local anomalies is sufficient to correctly identify anomalies of various relevant shapes in the network. Supplementary material, including additional details and simulation code, are provided online.",
  keywords = "Anomaly detection, Dynamic graph, Network intrusion detection, Path, Star",
  author = "Neil, Joshua and Hash, Curtis and Brugh, Alexander and Fisk, Mike and Storlie, {Curtis B.}",
  year = "2013",
  month = nov,
  day = "1",
  doi = "10.1080/00401706.2013.822830",
  language = "English (US)",
  volume = "55",
  pages = "403--414",
  journal = "Technometrics",
  issn = "0040-1706",
  publisher = "American Statistical Association",
  number = "4",
}

TY - JOUR

T1 - Scan statistics for the online detection of locally anomalous subgraphs

AU - Neil, Joshua

AU - Hash, Curtis

AU - Brugh, Alexander

AU - Fisk, Mike

AU - Storlie, Curtis B.

PY - 2013/11/1

Y1 - 2013/11/1

AB - We introduce a computationally scalable method for detecting small anomalous areas in a large, time-dependent computer network, motivated by the challenge of identifying intruders operating inside enterprise-sized computer networks. Time-series of communications between computers are used to detect anomalies, and are modeled using Markov models that capture the bursty, often human-caused behavior that dominates a large subset of the time-series. Anomalies in these time-series are common, and the network intrusions we seek involve coincident anomalies over multiple connected pairs of computers. We show empirically that each time-series is nearly always independent of the time-series of other pairs of communicating computers. This independence is used to build models of normal activity in local areas from the models of the individual time-series, and these local areas are designed to detect the types of intrusions we are interested in. We define a locality statistic calculated by testing for deviations from historic behavior in each local area, and then define a scan statistic as the maximum deviation score over all local areas. We show that identifying these local anomalies is sufficient to correctly identify anomalies of various relevant shapes in the network. Supplementary material, including additional details and simulation code, are provided online.

KW - Anomaly detection

KW - Dynamic graph

KW - Network intrusion detection

KW - Path

KW - Star

UR - http://www.scopus.com/inward/record.url?scp=84890034467&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84890034467&partnerID=8YFLogxK

U2 - 10.1080/00401706.2013.822830

DO - 10.1080/00401706.2013.822830

M3 - Article

AN - SCOPUS:84890034467

SN - 0040-1706

VL - 55

SP - 403

EP - 414

JO - Technometrics

JF - Technometrics

IS - 4

ER -
