
tion. In Proceedings of the 26th International Sympo-
sium on Research in Attacks, Intrusions and Defenses,
pages 654–668. ACM.
Cheng, X., Wang, H., Hua, J., Xu, G., and Sui, Y.
(2021). DeepWukong: Statically Detecting Software
Vulnerabilities Using Deep Graph Neural Network.
30(3):38:1–38:33.
Croft, R., Babar, M. A., and Kholoosi, M. M. (2023). Data
Quality for Software Vulnerability Datasets. In 2023
IEEE/ACM 45th International Conference on Soft-
ware Engineering (ICSE), pages 121–133.
Curto, C., Giordano, D., Indelicato, D. G., and Patatu, V.
(2024a). Can a Llama Be a Watchdog? Exploring
Llama 3 and Code Llama for Static Application Secu-
rity Testing. In 2024 IEEE International Conference
on Cyber Security and Resilience (CSR), pages 395–
400.
Curto, C., Giordano, D., Palazzo, S., and Indelicato,
D. (2024b). MultiVD: A Transformer-based Mul-
titask Approach for Software Vulnerability Detec-
tion. In Proceedings of the 21st International Confer-
ence on Security and Cryptography, pages 416–423.
SCITEPRESS - Science and Technology Publications.
Fan, J., Li, Y., Wang, S., and Nguyen, T. N. (2020). A c
/ c ++ code vulnerability dataset with code changes
and cve summaries. In IEEE/ACM 17th International
Conference on Mining Software Repositories (MSR),
pages 508–512. ACM.
Feng, Z., Guo, D., Tang, D., Duan, N., Feng, X., Gong, M.,
Shou, L., Qin, B., Liu, T., Jiang, D., and Zhou, M.
(2020). Codebert : A pre-trained model for program-
ming and natural languages. Findings of EMNLP.
Fu, M. and Tantithamthavorn, C. (2022). Linevul: A
transformer-based line-level vulnerability prediction.
In 2022 IEEE/ACM 19th International Conference on
Mining Software Repositories (MSR). IEEE.
Guo, D., Ren, S., Lu, S., Feng, Z., Tang, D., Liu, S., Zhou,
L., Duan, N., Svyatkovskiy, A., Fu, S., Tufano, M.,
Deng, S. K., Clement, C., Drain, D., Sundaresan, N.,
Yin, J., Jiang, D., and Zhou, M. (2020). GraphCode-
BERT: Pre-training Code Representations with Data
Flow.
Hin, D., Kan, A., Chen, H., and Babar, M. A. (2022).
LineVD: Statement-level vulnerability detection us-
ing graph neural networks. In Proceedings of the 19th
International Conference on Mining Software Reposi-
tories, pages 596–607. ACM.
HuggingFace (2024). CodeBERTa. https://huggingface.co/
huggingface/CodeBERTa-small-v1. accessed: 2024-
09.
Husain, H., Wu, H.-H., Gazit, T., Allamanis, M., and
Brockschmidt, M. (2019). CodeSearchNet Chal-
lenge: Evaluating the State of Semantic Code Search.
arXiv:1909.09436 [cs, stat]. arXiv: 1909.09436.
Jiang, X., Wu, L., Sun, S., Li, J., Xue, J., Wang, Y., Wu,
T., and Liu, M. (2025). Investigating Large Language
Models for Code Vulnerability Detection: An Experi-
mental Study.
Mamede, C., Pinconschi, E., Abreu, R., and Campos, J.
(2022). Exploring transformers for multi-label clas-
sification of java vulnerabilities. In IEEE, editor,
2022 IEEE 22nd International Conference on Soft-
ware Quality , Reliability and Security ( QRS ), pages
43–52.
MITRE (2025). CVE published by year. https://www.
cvedetails.com/browse-by-date.php. Accessed: 2025-
01.
National Institute of Standards and Technology
(2025). Nist software assurance reference dataset.
https://samate.nist.gov/SARD. Accessed: 2025-01.
Nguyen, V.-A., Nguyen, D. Q., Nguyen, V., Le, T., Tran,
Q. H., and Phung, D. (2022). ReGVD: Revisit-
ing graph neural networks for vulnerability detec-
tion. In Proceedings of the ACM/IEEE 44th Interna-
tional Conference on Software Engineering: Compan-
ion Proceedings, ICSE ’22, pages 178–182. Associa-
tion for Computing Machinery.
Ni, C., Shen, L., Yang, X., Zhu, Y., and Wang, S. (2024).
MegaVul: A C/C++ Vulnerability Dataset with Com-
prehensive Code Representations. In 2024 IEEE/ACM
21st International Conference on Mining Software
Repositories (MSR), pages 738–742.
NIST (2025). NVD database. https://nvd.nist.gov/. Ac-
cessed: 2025-01.
RedHat (2025). Red Hat Bugzilla website. https://bugzilla.
redhat.com/. Accessed: 2025-01.
Rozi
`
ere, B., Gehring, J., Gloeckle, F., Sootla, S., Gat, I.,
Tan, X. E., Adi, Y., Liu, J., Remez, T., Rapin, J.,
Kozhevnikov, A., Evtimov, I., Bitton, J., Bhatt, M.,
Ferrer, C. C., Grattafiori, A., Xiong, W., D
´
efossez,
A., Copet, J., Azhar, F., Touvron, H., Martin, L.,
Usunier, N., Scialom, T., and Synnaeve, G. (2023).
Code Llama: Open Foundation Models for Code.
Snyk (2025). Snyk website. https://snyk.io/. Accessed:
2025-01.
Xu, F. F., Alon, U., Neubig, G., and Hellendoorn, V. J.
(2022). A systematic evaluation of large language
models of code. In Proceedings of the 6th ACM
SIGPLAN International Symposium on Machine Pro-
gramming, MAPS 2022, pages 1–10. Association for
Computing Machinery.
Zheng, Y., Pujar, S., Lewis, B., Buratti, L., Epstein, E.,
Yang, B., Laredo, J., Morari, A., and Su, Z. (2021).
D2A: A Dataset Built for AI-Based Vulnerability De-
tection Methods Using Differential Analysis. In 2021
IEEE/ACM 43rd International Conference on Soft-
ware Engineering: Software Engineering in Practice
(ICSE-SEIP), pages 111–120.
Zhou, Y., Liu, S., Siow, J., Du, X., and Liu, Y. (2019). De-
vign: Effective Vulnerability Identification by Learn-
ing Comprehensive Program Semantics via Graph
Neural Networks. In Advances in Neural Information
Processing Systems, volume 32. Curran Associates,
Inc.
SECRYPT 2025 - 22nd International Conference on Security and Cryptography
362