Windows Malware Binaries in C/C++ GitHub Repositories: Prevalence and Lessons Learned

William Cholter, Matthew Elder, Antonius Stalick

2021

Abstract

Does malware lurking in GitHub pose a threat? GitHub is the most popular open source software website, having 188 million repositories. GitHub hosts malware-related projects for research and educational purposes and has also been used by malware to attack users. In this paper, we explore the prevalence of unencrypted, uncompressed binary code malware in Microsoft Windows compatible C and C++ GitHub repositories and characterize the threat. We mined 1,835 repositories for already-compiled malicious files and data suggesting whether the repository is malware-related. We focused on these repositories because Windows is frequently targeted by malware written in C or C++. These repositories are good resources for attackers and could target Windows users. We extracted all Portable Executable (PE) files from all commits and queried the malware resource VirusTotal for analysis from its 76 anti-virus engines. Of the 24,395 files, 4,335 are suspicious, with at least one detection; 440 could be considered malicious, with at least seven detections. We identify topic tags suggesting malware or offensive security content, to differentiate from seemingly benign repositories. 197 of 440 malicious executables were in 27 ostensibly benign repositories. This work illustrates risks in source code repositories and lessons learned in relating GitHub and VirusTotal data.
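The detection thresholds in the abstract (at least one engine detection for "suspicious", at least seven for "malicious") can be illustrated with a minimal sketch. It assumes the per-file detection counts have already been obtained from VirusTotal. The function name `tally_detections`, the constant names, and the input mapping are hypothetical and do not come from the paper. The "suspicious" tally is assumed to include the "malicious" files, since both are defined as "at least N detections".

```python
# Thresholds as stated in the abstract; the names are illustrative only.
SUSPICIOUS_MIN = 1  # "at least one detection"
MALICIOUS_MIN = 7   # "at least seven detections"


def tally_detections(detection_counts):
    """Count files meeting each threshold.

    detection_counts: hypothetical mapping of file identifier (e.g. a hash)
    to the number of anti-virus engines that flagged the file.
    """
    counts = list(detection_counts.values())
    if any(n < 0 for n in counts):
        raise ValueError("detection counts must be non-negative")
    return {
        "total": len(counts),
        # Assumption: every malicious file is also counted as suspicious.
        "suspicious": sum(1 for n in counts if n >= SUSPICIOUS_MIN),
        "malicious": sum(1 for n in counts if n >= MALICIOUS_MIN),
    }


if __name__ == "__main__":
    # Made-up sample values, not data from the paper.
    sample = {"a.exe": 0, "b.dll": 1, "c.exe": 7, "d.exe": 12}
    print(tally_detections(sample))
```

Computing both tallies from the same list keeps the thresholds in one place, so a different cut-off for "malicious" can be tried by changing a single constant.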

Paper Citation


in Harvard Style

Cholter W., Elder M. and Stalick A. (2021). Windows Malware Binaries in C/C++ GitHub Repositories: Prevalence and Lessons Learned. In Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-491-6, pages 475-484. DOI: 10.5220/0010237904750484


in Bibtex Style

@conference{icissp21,
author={William Cholter and Matthew Elder and Antonius Stalick},
title={Windows Malware Binaries in C/C++ GitHub Repositories: Prevalence and Lessons Learned},
booktitle={Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2021},
pages={475--484},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010237904750484},
isbn={978-989-758-491-6},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - Windows Malware Binaries in C/C++ GitHub Repositories: Prevalence and Lessons Learned
SN - 978-989-758-491-6
AU - Cholter W.
AU - Elder M.
AU - Stalick A.
PY - 2021
SP - 475
EP - 484
DO - 10.5220/0010237904750484