Improving malware classification

Bridging the static/dynamic gap

Blake Anderson, Curtis Storlie, Terran Lane

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

50 Citations (Scopus)

Abstract

Malware classification systems have typically used some machine learning algorithm in conjunction with either static or dynamic features collected from the binary. Recently, more advanced malware has introduced mechanisms to avoid detection in these views by using obfuscation techniques to avoid static detection and execution-stalling techniques to avoid dynamic detection. In this paper we construct a classification framework that is able to incorporate both static and dynamic views into a unified framework in the hopes that, while a malicious executable can disguise itself in some views, disguising itself in every view while maintaining malicious intent will prove to be substantially more difficult. Our method uses kernels to place a similarity metric on each distinct view and then employs multiple kernel learning to find a weighted combination of the data sources which yields the best classification accuracy in a support vector machine classifier. Our approach opens up new avenues of malware research which will allow the research community to elegantly look at multiple facets of malware simultaneously, and which can easily be extended to integrate any new data sources that may become popular in the future.
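The idea of putting one kernel on each view and learning a weighted combination can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy feature values, the Gaussian kernel, the single weight `beta` chosen by grid search over leave-one-out accuracy (a crude stand-in for a multiple kernel learning solver), and the kernel perceptron that replaces the support vector machine so that the sketch needs only the Python standard library.

```python
import math

def rbf_gram(rows, gamma):
    """Gram matrix of a Gaussian (RBF) kernel over one view's feature vectors."""
    def k(a, b):
        return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))
    return [[k(a, b) for b in rows] for a in rows]

def combine(grams, weights):
    """Weighted sum of per-view Gram matrices (non-negative weights keep it a valid kernel)."""
    n = len(grams[0])
    return [[sum(w * g[i][j] for w, g in zip(weights, grams)) for j in range(n)]
            for i in range(n)]

def train_kernel_perceptron(gram, labels, train_idx, epochs=20):
    """Kernel perceptron (stand-in for the SVM): returns mistake counts per training index."""
    alpha = {i: 0 for i in train_idx}
    for _ in range(epochs):
        mistakes = 0
        for i in train_idx:
            score = sum(alpha[j] * labels[j] * gram[j][i] for j in train_idx)
            if labels[i] * score <= 0:  # misclassified (or undecided): update
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alpha

def predict(gram, labels, alpha, i):
    """Predict +1 or -1 for sample i from the dual coefficients."""
    score = sum(a * labels[j] * gram[j][i] for j, a in alpha.items())
    return 1 if score > 0 else -1

def loo_accuracy(gram, labels):
    """Leave-one-out accuracy of the kernel perceptron on a precomputed Gram matrix."""
    n = len(labels)
    correct = 0
    for held_out in range(n):
        train_idx = [i for i in range(n) if i != held_out]
        alpha = train_kernel_perceptron(gram, labels, train_idx)
        correct += predict(gram, labels, alpha, held_out) == labels[held_out]
    return correct / n

def best_weight(gram_a, gram_b, labels, steps=10):
    """Grid search over beta in [0, 1] for K = beta * K_a + (1 - beta) * K_b."""
    best = (-1.0, 0.0)
    for s in range(steps + 1):
        beta = s / steps
        acc = loo_accuracy(combine([gram_a, gram_b], [beta, 1 - beta]), labels)
        best = max(best, (acc, beta))
    return best  # (accuracy, beta)

# Hypothetical toy data: +1 = malicious, -1 = benign.
labels = [1, 1, 1, 1, -1, -1, -1, -1]
# Static view: the third malicious sample is "obfuscated" and resembles benign code.
static_view = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.85, 0.9],
               [0.1, 0.2], [0.2, 0.2], [0.15, 0.1], [0.1, 0.1]]
# Dynamic view: the second malicious sample "stalls" and resembles benign behavior.
dynamic_view = [[0.9], [0.1], [0.8], [0.85], [0.1], [0.2], [0.15], [0.05]]

gram_static = rbf_gram(static_view, gamma=5.0)
gram_dynamic = rbf_gram(dynamic_view, gamma=5.0)
best_acc, best_beta = best_weight(gram_static, gram_dynamic, labels)
print("best leave-one-out accuracy:", best_acc, "at beta =", best_beta)
```

Because the grid includes `beta = 0` and `beta = 1`, the selected combination is never worse on this criterion than either single view alone. A real system would replace the grid search and perceptron with a proper multiple kernel learning solver and an SVM, and would evaluate on held-out data.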

Original language: English (US)
Title of host publication: AISec'12 - Proceedings of the ACM Workshop on Security and Artificial Intelligence
Pages: 3-14
Number of pages: 12
DOIs: https://doi.org/10.1145/2381896.2381900
State: Published - 2012
Externally published: Yes
Event: 5th ACM Workshop on Artificial Intelligence and Security, AISec 2012 - Raleigh, NC, United States
Duration: Oct 19 2012 → Oct 19 2012

Other

Other: 5th ACM Workshop on Artificial Intelligence and Security, AISec 2012
Country: United States
City: Raleigh, NC
Period: 10/19/12 → 10/19/12

Fingerprint

Learning algorithms
Support vector machines
Learning systems
Classifiers
Malware

Keywords

  • Computer Security
  • Machine Learning
  • Malware
  • Multiple Kernel Learning

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications

Cite this

Anderson, B., Storlie, C., & Lane, T. (2012). Improving malware classification: Bridging the static/dynamic gap. In AISec'12 - Proceedings of the ACM Workshop on Security and Artificial Intelligence (pp. 3-14). https://doi.org/10.1145/2381896.2381900

Improving malware classification : Bridging the static/dynamic gap. / Anderson, Blake; Storlie, Curtis; Lane, Terran.

AISec'12 - Proceedings of the ACM Workshop on Security and Artificial Intelligence. 2012. p. 3-14.

Anderson, B, Storlie, C & Lane, T 2012, Improving malware classification: Bridging the static/dynamic gap. in AISec'12 - Proceedings of the ACM Workshop on Security and Artificial Intelligence. pp. 3-14, 5th ACM Workshop on Artificial Intelligence and Security, AISec 2012, Raleigh, NC, United States, 10/19/12. https://doi.org/10.1145/2381896.2381900
Anderson B, Storlie C, Lane T. Improving malware classification: Bridging the static/dynamic gap. In AISec'12 - Proceedings of the ACM Workshop on Security and Artificial Intelligence. 2012. p. 3-14. https://doi.org/10.1145/2381896.2381900
Anderson, Blake ; Storlie, Curtis ; Lane, Terran. / Improving malware classification : Bridging the static/dynamic gap. AISec'12 - Proceedings of the ACM Workshop on Security and Artificial Intelligence. 2012. pp. 3-14
@inproceedings{7bb650e0dfb9427a989d92cc14cffc05,
title = "Improving malware classification: Bridging the static/dynamic gap",
abstract = "Malware classification systems have typically used some machine learning algorithm in conjunction with either static or dynamic features collected from the binary. Recently, more advanced malware has introduced mechanisms to avoid detection in these views by using obfuscation techniques to avoid static detection and execution-stalling techniques to avoid dynamic detection. In this paper we construct a classification framework that is able to incorporate both static and dynamic views into a unified framework in the hopes that, while a malicious executable can disguise itself in some views, disguising itself in every view while maintaining malicious intent will prove to be substantially more difficult. Our method uses kernels to place a similarity metric on each distinct view and then employs multiple kernel learning to find a weighted combination of the data sources which yields the best classification accuracy in a support vector machine classifier. Our approach opens up new avenues of malware research which will allow the research community to elegantly look at multiple facets of malware simultaneously, and which can easily be extended to integrate any new data sources that may become popular in the future.",
keywords = "Computer Security, Machine Learning, Malware, Multiple Kernel Learning",
author = "Blake Anderson and Curtis Storlie and Terran Lane",
year = "2012",
doi = "10.1145/2381896.2381900",
language = "English (US)",
isbn = "9781450316644",
pages = "3--14",
booktitle = "AISec'12 - Proceedings of the ACM Workshop on Security and Artificial Intelligence",
}

TY - GEN
T1 - Improving malware classification
T2 - Bridging the static/dynamic gap
AU - Anderson, Blake
AU - Storlie, Curtis
AU - Lane, Terran
PY - 2012
Y1 - 2012
AB - Malware classification systems have typically used some machine learning algorithm in conjunction with either static or dynamic features collected from the binary. Recently, more advanced malware has introduced mechanisms to avoid detection in these views by using obfuscation techniques to avoid static detection and execution-stalling techniques to avoid dynamic detection. In this paper we construct a classification framework that is able to incorporate both static and dynamic views into a unified framework in the hopes that, while a malicious executable can disguise itself in some views, disguising itself in every view while maintaining malicious intent will prove to be substantially more difficult. Our method uses kernels to place a similarity metric on each distinct view and then employs multiple kernel learning to find a weighted combination of the data sources which yields the best classification accuracy in a support vector machine classifier. Our approach opens up new avenues of malware research which will allow the research community to elegantly look at multiple facets of malware simultaneously, and which can easily be extended to integrate any new data sources that may become popular in the future.
KW - Computer Security
KW - Machine Learning
KW - Malware
KW - Multiple Kernel Learning
UR - http://www.scopus.com/inward/record.url?scp=84869822279&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84869822279&partnerID=8YFLogxK
U2 - 10.1145/2381896.2381900
DO - 10.1145/2381896.2381900
M3 - Conference contribution
SN - 9781450316644
SP - 3
EP - 14
BT - AISec'12 - Proceedings of the ACM Workshop on Security and Artificial Intelligence
ER -