A Comparative Study of ML Approaches for Detecting AI-Generated Essays
Mihai Nechita, Madalina Raschip
2025
Abstract
Recent advancements in generative AI have introduced a significant challenge to academic credibility and integrity. This paper presents a comprehensive study of traditional machine learning methods and more complex neural network models, namely recurrent neural networks and Transformer-based models, for detecting AI-generated essays. A two-step training procedure is proposed for the Transformer-based model, in which a pretraining step moves the general language model closer to the target problem. The models achieve good AUC scores for classification, outperforming state-of-the-art (SOTA) zero-shot detection approaches. The results show that Transformer architectures not only outperform the other methods on the validation datasets but also exhibit increased robustness across different sampling parameters. Generalization to new datasets and the performance of the models at a low false positive rate (FPR) were also evaluated. To enhance transparency, the explainability of the proposed models was explored using the LIME and SHAP approaches.
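The abstract reports two evaluation quantities: the AUC score and the detection performance at a low FPR. The sketch below illustrates how both can be computed from detector scores using only the Python standard library. It is a minimal illustration, not the authors' evaluation code. The function names `roc_auc` and `tpr_at_fpr`, the convention that label 1 means "AI-generated", and the toy scores are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def _split(labels: Sequence[int], scores: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Separate scores of positive (label 1) and negative (label 0) examples."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    return pos, neg


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos, neg = _split(labels, scores)
    # Ties count as half a win (Mann-Whitney U formulation of the AUC).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tpr_at_fpr(labels: Sequence[int], scores: Sequence[float], max_fpr: float) -> float:
    """Highest TPR reachable by a score threshold whose FPR stays at or below max_fpr."""
    pos, neg = _split(labels, scores)
    best = 0.0  # a threshold above every score gives FPR = 0 and TPR = 0
    for t in sorted(set(scores)):
        # An example is flagged as AI-generated when its score is >= t.
        fpr = sum(n >= t for n in neg) / len(neg)
        if fpr <= max_fpr:
            best = max(best, sum(p >= t for p in pos) / len(pos))
    return best


# Toy data (hypothetical): label 1 = AI-generated, label 0 = human-written.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.8, 0.4, 0.7, 0.9, 0.95]

print(roc_auc(labels, scores))           # 0.875
print(tpr_at_fpr(labels, scores, 0.0))   # 0.5
print(tpr_at_fpr(labels, scores, 0.25))  # 1.0
```

In this toy example a single high-scoring human essay (score 0.8) barely moves the AUC but halves the detection rate when no false positives are tolerated, which is why a low-FPR operating point is a useful complement to AUC when a false accusation is costly.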
Paper Citation
in Harvard Style
Nechita M. and Raschip M. (2025). A Comparative Study of ML Approaches for Detecting AI-Generated Essays. In Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-758-0, SciTePress, pages 144-155. DOI: 10.5220/0013570200003967
in Bibtex Style
@conference{data25,
author={Mihai Nechita and Madalina Raschip},
title={A Comparative Study of ML Approaches for Detecting AI-Generated Essays},
booktitle={Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2025},
pages={144-155},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013570200003967},
isbn={978-989-758-758-0},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - A Comparative Study of ML Approaches for Detecting AI-Generated Essays
SN - 978-989-758-758-0
AU - Nechita M.
AU - Raschip M.
PY - 2025
SP - 144
EP - 155
DO - 10.5220/0013570200003967
PB - SciTePress