Abstract
We introduce a computationally scalable method for detecting small anomalous areas in a large, time-dependent computer network, motivated by the challenge of identifying intruders operating inside enterprise-sized computer networks. Time-series of communications between computers are used to detect anomalies, and are modeled using Markov models that capture the bursty, often human-caused behavior that dominates a large subset of the time-series. Anomalies in these time-series are common, and the network intrusions we seek involve coincident anomalies over multiple connected pairs of computers. We show empirically that each time-series is nearly always independent of the time-series of other pairs of communicating computers. This independence is used to build models of normal activity in local areas from the models of the individual time-series, and these local areas are designed to detect the types of intrusions we are interested in. We define a locality statistic calculated by testing for deviations from historic behavior in each local area, and then define a scan statistic as the maximum deviation score over all local areas. We show that identifying these local anomalies is sufficient to correctly identify anomalies of various relevant shapes in the network. Supplementary material, including additional details and simulation code, is provided online.
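The relationship between the locality statistic and the scan statistic can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that a deviation score has already been computed for each communicating pair (edge) in the current time window, that a local area is a directed path of `k` edges, and that, under the independence described above, the score of a local area is simply the sum of its edge scores. The names `k_paths`, `scan_statistic`, and `edge_scores` are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (source computer, destination computer)


def k_paths(edges, k: int) -> List[List[Edge]]:
    """Enumerate directed paths with exactly k edges and no repeated node."""
    out: Dict[str, List[str]] = {}
    for u, v in edges:
        out.setdefault(u, []).append(v)

    paths: List[List[Edge]] = []

    def extend(nodes: List[str]) -> None:
        if len(nodes) == k + 1:
            # Convert the node sequence into its list of consecutive edges.
            paths.append(list(zip(nodes, nodes[1:])))
            return
        for nxt in out.get(nodes[-1], []):
            if nxt not in nodes:
                extend(nodes + [nxt])

    for start in out:
        extend([start])
    return paths


def scan_statistic(edge_scores: Dict[Edge, float], k: int):
    """Return the maximum local-area score and the path that attains it.

    Assumption: the local-area (locality) score is the sum of per-edge
    deviation scores, which relies on the edges being independent.
    """
    best_score, best_path = float("-inf"), []
    for path in k_paths(edge_scores, k):
        score = sum(edge_scores[e] for e in path)
        if score > best_score:
            best_score, best_path = score, path
    return best_score, best_path
```

In this sketch, a call such as `scan_statistic(edge_scores, k=2)` returns the highest-scoring two-edge path in the window; in practice the maximum would be compared against a threshold calibrated on historic behavior, a step not shown here.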
Original language | English (US) |
---|---|
Pages (from-to) | 403-414 |
Number of pages | 12 |
Journal | Technometrics |
Volume | 55 |
Issue number | 4 |
State | Published - Nov 1 2013 |
Keywords
- Anomaly detection
- Dynamic graph
- Network intrusion detection
- Path
- Star
ASJC Scopus subject areas
- Statistics and Probability
- Modeling and Simulation
- Applied Mathematics