Automating reverse engineering with machine learning techniques

Blake Anderson, Curtis Storlie, Micah Yates, Aaron McPhall

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


Malware remains an ongoing threat, with millions of unique variants created every year. Unlike the majority of this malware, Advanced Persistent Threat (APT) malware is created to target a specific network or set of networks and has a precise objective, e.g., exfiltrating sensitive data. While 0-day malware detectors are a good start, they do not help reverse engineers better understand the threats attacking their networks. Understanding the behavior of malware is often a time-sensitive task, and can take anywhere from several hours to several weeks. Our goal is to automate the task of identifying the general function of the subroutines in the function call graph of the program to aid the reverse engineers. Two approaches to modeling the subroutine labels are investigated: a multiclass Gaussian process and a multiclass support vector machine. The output of these methods is the probability that the subroutine belongs to a certain class of functionality (e.g., file I/O or exploit). Promising initial results, illustrating the efficacy of this method, are presented on a sample of 201 subroutines taken from two malicious families.

Original language: English (US)
Pages (from-to): 103-112
Number of pages: 10
Journal: Unknown Journal
Issue number: November
State: Published - Nov 7 2014
Externally published: Yes


Keywords

  • Computer security
  • Gaussian processes
  • Machine learning
  • Malware
  • Multiple kernel learning
  • Support vector machines

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
