Abuhamad, M., Jung, C., Mohaisen, D., Nyang, D. (2023).
SHIELD: Thwarting code authorship attribution.
ArXiv: 2304.13255
Alsulami, B., Dauber, E., Harang, R., Mancoridis, S.,
Greenstadt, R. (2017). Source code authorship
attribution using long short-term memory-based
networks. In Computer Security – ESORICS 2017 (Vol.
10492).
Álvarez-Fidalgo, D., Ortin, F. (2025). CLAVE: A deep
learning model for source code authorship verification
with contrastive learning and transformer encoders.
Information Processing & Management, 62(3),
104005-104020.
Devlin, J., Chang, M.-W., Lee, K., Toutanova, K. (2019).
BERT: Pre-Training of Deep bidirectional transformers
for language understanding. In Proceedings of the 2019
Conference of the North American Chapter of the
Association for Computational Linguistics: Human
Language Technologies (pp. 4171–4186). Minneapolis,
MN.
Georges, A., Buytaert, D., Eeckhout, L. (2007). Statistically
rigorous Java performance evaluation. ACM SIGPLAN
Notices, 42(10), 57–76.
He, X., Lashkari, A. H., Vombatkere, N., Sharma, D. P.
(2024). Authorship attribution methods, challenges,
and future research directions: A comprehensive
survey. Information, 15(3), 131.
Hendrycks, D., Gimpel, K. (2023). Gaussian error linear
units (GELUs). ArXiv: 1606.08415v5.
Hozhabrierdi, P., Hitos, D. F., Mohan, C. K. (2020). Zero-
shot source code author identification: A lexicon- and
layout-independent approach. In 2020 International
Joint Conf. on Neural Networks (IJCNN) (pp. 1–8).
Kurtukova, A., Romanov, A., Shelupanov, A. (2020).
Source code authorship identification using deep neural
networks. Symmetry, 12(12), 2044.
Li, Z., Chen, G. Q., Chen, C., Zou, Y., Xu, S. (2022).
RoPGen: Towards robust code authorship attribution
via automatic coding style transformation. In
Proceedings of the 44th International Conference on
Software Engineering (ICSE ’22) (pp. 1906–1918).
Noma, H., Matsushima, Y., Ishii, R. (2021). Confidence
interval for the AUC of SROC curve and some related
methods using bootstrap for meta-analysis of diagnostic
accuracy studies. Communications in Statistics: Case
Studies, Data Analysis and Applications, 7, 344–358.
Ortin, F., Garcia, M. Perez-Schofield, B. G., Quiroga, J.
(2022). The StaDyn programming language.
SoftwareX, 20, 101211–101222.
Ortin, F., Facundo, G., Garcia, M. (2023). Analyzing
syntactic constructs of Java programs with machine
learning. Expert Systems with Applications, 215,
119398–119414.
Ortin, F., Álvarez-Fidalgo, D. (2025). Support webpage of
the article Efficient source code authorship attribution
using code stylometry embeddings. Retrieved from
https://www.reflection.uniovi.es/bigcode/download/20
25/icsoft2025.
Ou, W., Ding, S. H. H., Tian, Y., Song, L. (2023). SCS-
GAN: Learning functionality-agnostic stylometric
representations for source code authorship verification.
IEEE Transactions on Software Engineering, 49, 1426–
1442.
Rodola, G. (2025). Psutil: A cross-platform library for
process and system monitoring in Python. Retrieved
from https://psutil.readthedocs.io
Rodriguez-Prieto, O., Pato, A., Ortin, F. (2023) PLangRec:
Deep-learning model to predict the programming
language from a single line of code. Future Generation
Computer Systems, 166, 107640–107655.
Stamatatos, E., Kredens, K., Pezik, P., Heini, A.,
Bevendorff, J., Stein, B., Potthast, M. (2023). Overview
of the authorship verification task at PAN 2023. In
Working Notes of the Conference and Labs of the
Evaluation Forum (CLEF 2023) (pp. 2476–2491).
Thessaloniki, Greece.
Kalgutkar, V., Kaur, R., Gonzalez, H., Stakhanova, N.,
Matyukhina, A. (2019). Code authorship attribution:
Methods and challenges. ACM Computing Surveys,
52(1).
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, L. U., Polosukhin, I. (2017).
Attention is all you need. In Advances in Neural
Information Processing Systems (Vol. 30). Curran
Associates, Inc.
Wang, N., Ji, S., Wang, T. (2018). Integration of static and
dynamic code stylometry analysis for programmer de-
anonymization. In Proceedings of the 11th ACM
Workshop on Artificial Intelligence and Security
(AISec ’18) (pp. 74–84).
White, R., Sprague, N. (2021). Deep metric learning for
code authorship attribution and verification. In 20
th
IEEE International Conference on Machine Learning
and Applications (ICMLA) (pp. 1089–1093).
Zhang, Z., Chen, C., Liu, B., Liao, C., Gong, Z., Yu, H., Li,
J., Wang, R. (2024). Unifying the perspectives of NLP
and software engineering: A survey on language
models for code. Transactions on Machine Learning
Research 9/2024.