A Novel and Dedicated Machine Learning Model for Malware Classification

Miles Q. Li, Benjamin C. M. Fung, Philippe Charland, Steven H. H. Ding

2021

Abstract

Malicious executables are composed of functions that can be represented in assembly code. In the assembly code mining literature, many software reverse engineering tools have been created to disassemble executables, search for function clones, and find vulnerabilities, among other tasks. Developing a machine learning-based malware classification model that simultaneously achieves excellent classification performance and provides insightful interpretation of its classification results remains an active research topic. In this paper, we propose a novel and dedicated machine learning model for malware classification. Our proposed model generates assembly code function clusters based on function representation learning and provides excellent interpretability for the classification results. It does not require a large or balanced dataset for training, which matches real-life scenarios. Experiments show that our proposed approach outperforms previous state-of-the-art malware classification models and provides meaningful interpretation of classification results.


Paper Citation


in Harvard Style

Li M., Fung B., Charland P. and Ding S. (2021). A Novel and Dedicated Machine Learning Model for Malware Classification. In Proceedings of the 16th International Conference on Software Technologies - Volume 1: ICSOFT, ISBN 978-989-758-523-4, pages 617-628. DOI: 10.5220/0010518506170628


in Bibtex Style

@conference{icsoft21,
author={Miles Q. Li and Benjamin C. M. Fung and Philippe Charland and Steven H. H. Ding},
title={A Novel and Dedicated Machine Learning Model for Malware Classification},
booktitle={Proceedings of the 16th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2021},
pages={617-628},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010518506170628},
isbn={978-989-758-523-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 16th International Conference on Software Technologies - Volume 1: ICSOFT
TI - A Novel and Dedicated Machine Learning Model for Malware Classification
SN - 978-989-758-523-4
AU - Li M.
AU - Fung B.
AU - Charland P.
AU - Ding S.
PY - 2021
SP - 617
EP - 628
DO - 10.5220/0010518506170628