Graph-based malware detection using dynamic analysis

Blake Anderson, Daniel Quist, Joshua Neil, Curtis Storlie, Terran Lane

Research output: Contribution to journal › Article › peer-review

152 Scopus citations

Abstract

We introduce a novel malware detection algorithm based on the analysis of graphs constructed from dynamically collected instruction traces of the target executable. These graphs represent Markov chains, where the vertices are the instructions and the transition probabilities are estimated by the data contained in the trace. We use a combination of graph kernels to create a similarity matrix between the instruction trace graphs. The resulting graph kernel measures similarity between graphs on both local and global levels. Finally, the similarity matrix is sent to a support vector machine to perform classification. Our method is particularly appealing because we do not base our classifications on the raw n-gram data, but rather use our data representation to perform classification in graph space. We demonstrate the performance of our algorithm on two classification problems: benign software versus malware, and the Netbull virus with different packers versus other classes of viruses. Our results show a statistically significant improvement over signature-based and other machine learning-based detection methods.
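The first two steps described in the abstract (building a Markov chain from an instruction trace, then comparing traces with a kernel) can be illustrated with a minimal sketch. The abstract does not specify which graph kernels the authors combine, so the Gaussian kernel on transition matrices below is an assumption chosen for illustration, as are the function names, the `sigma` parameter, and the toy opcode vocabulary.

```python
import math


def transition_matrix(trace, vocab):
    """Estimate Markov-chain transition probabilities from one instruction trace.

    Vertices are the instructions in `vocab`; entry [i][j] is the fraction of
    times instruction i was immediately followed by instruction j.
    """
    index = {op: i for i, op in enumerate(vocab)}
    n = len(vocab)
    counts = [[0.0] * n for _ in range(n)]
    for src, dst in zip(trace, trace[1:]):
        counts[index[src]][index[dst]] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        # Instructions with no observed successor keep an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian kernel on two transition matrices (illustrative assumption,
    not necessarily one of the kernels used in the paper)."""
    sq_dist = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def similarity_matrix(traces, vocab, sigma=1.0):
    """Pairwise kernel values between the graphs of all traces."""
    graphs = [transition_matrix(t, vocab) for t in traces]
    return [[gaussian_kernel(g, h, sigma) for h in graphs] for g in graphs]


# Hypothetical toy vocabulary and traces, for illustration only.
VOCAB = ["mov", "push", "call", "ret"]
TRACES = [
    ["mov", "push", "call", "mov", "push", "ret"],
    ["mov", "push", "call", "mov", "push", "ret"],
    ["mov", "mov", "ret"],
]
K = similarity_matrix(TRACES, VOCAB)
```

A matrix such as `K` is the kind of input a support vector machine with a precomputed kernel would accept for the final classification step; that step is omitted here to keep the sketch free of third-party dependencies.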

Original language: English (US)
Pages (from-to): 247-258
Number of pages: 12
Journal: Journal in Computer Virology
Volume: 7
Issue number: 4
State: Published - Nov 2011

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Hardware and Architecture
