A Vulnerability Introducing Commit Dataset for Java: An Improved SZZ based Approach

Tamás Aladics, Péter Hegedűs, Rudolf Ferenc

2022

Abstract

In static-analysis-based vulnerability detection from source code, datasets suitable for creating and testing security analysis methods are few and of limited quality. Several public datasets of vulnerability fixing commits already exist; however, datasets of vulnerability introducing commits, which are essential for creating and validating just-in-time vulnerability detection approaches, are scarce. In this paper, we propose a method based on SZZ (an algorithm originally developed to find bug introducing commits), extended with a specific filtering mechanism, to create vulnerability introducing commit datasets from vulnerability fixes. The filtering phase computes a relevance score for each vulnerability introducing commit candidate based on commit similarities. To demonstrate our algorithm’s capabilities, we generated a novel Java vulnerability introducing commit dataset from the existing project-KB repository. We also showcase the generated dataset and the effectiveness of our filtering method through several hand-picked examples.
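The filtering idea can be illustrated with a minimal sketch. The abstract does not state how the relevance score is computed, so everything concrete here is an assumption made for illustration: the `Commit` record, the Jaccard token overlap used as the similarity measure, and the `threshold` value are hypothetical and are not the paper's actual method. The sketch only shows the general shape of scoring SZZ candidates against a fix commit and keeping the most similar ones.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Hypothetical commit record: an id plus the source lines it changed."""
    sha: str
    changed_lines: tuple[str, ...]


def tokens(lines):
    """Return the set of identifier-like tokens appearing in the given lines."""
    return {tok for line in lines for tok in re.findall(r"[A-Za-z_]\w*", line)}


def relevance(fix, candidate):
    """Jaccard similarity of the two token sets.

    Assumption: this is a stand-in metric, not the score defined in the paper.
    """
    a = tokens(fix.changed_lines)
    b = tokens(candidate.changed_lines)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def filter_candidates(fix, candidates, threshold=0.3):
    """Keep candidates scoring at least `threshold`, most relevant first.

    Assumption: the cut-off of 0.3 is arbitrary and chosen for the example.
    """
    scored = [(c, relevance(fix, c)) for c in candidates]
    kept = [(c, s) for c, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

In use, the candidates would be the commits that an SZZ-style blame step attributes to the lines touched by a fix; `filter_candidates(fix, candidates)` then returns the retained candidates paired with their scores.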

Paper Citation


in Harvard Style

Aladics T., Hegedűs P. and Ferenc R. (2022). A Vulnerability Introducing Commit Dataset for Java: An Improved SZZ based Approach. In Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT, ISBN 978-989-758-588-3, pages 68-78. DOI: 10.5220/0011275200003266


in Bibtex Style

@conference{icsoft22,
author={Tamás Aladics and Péter Hegedűs and Rudolf Ferenc},
title={A Vulnerability Introducing Commit Dataset for Java: An Improved SZZ based Approach},
booktitle={Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT},
year={2022},
pages={68-78},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011275200003266},
isbn={978-989-758-588-3},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Conference on Software Technologies - Volume 1: ICSOFT
TI - A Vulnerability Introducing Commit Dataset for Java: An Improved SZZ based Approach
SN - 978-989-758-588-3
AU - Aladics T.
AU - Hegedűs P.
AU - Ferenc R.
PY - 2022
SP - 68
EP - 78
DO - 10.5220/0011275200003266