A Comparative Study of ML Approaches for Detecting AI-Generated Essays

Mihai Nechita, Madalina Raschip

2025

Abstract

Recent advances in generative AI have introduced a significant challenge to academic credibility and integrity. This paper presents a comprehensive study of traditional machine learning methods and more complex neural network models, including recurrent neural networks and Transformer-based models, for detecting AI-generated essays. A two-step training procedure is proposed for the Transformer-based model, in which a pretraining step moves the general language model closer to the target problem. The models obtain good AUC scores for classification and outperform state-of-the-art (SOTA) zero-shot detection approaches. The results show that Transformer architectures not only outperform the other methods on the validation datasets but also exhibit increased robustness across different sampling parameters. Generalization to new datasets and performance at a low false positive rate (FPR) are also evaluated. To enhance transparency, the explainability of the proposed models is explored using LIME and SHAP.
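
The abstract refers to two evaluation measures: the AUC score and detection performance at a small FPR. The sketch below illustrates how such measures can be computed from detector scores. It is not the authors' code: the function names `roc_auc` and `tpr_at_fpr` and the toy labels and scores are hypothetical and chosen only for illustration, with label 1 assumed to mean "AI-generated".

```python
from typing import List, Sequence, Tuple


def _split(labels: Sequence[int], scores: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Separate scores of positive (label 1) and negative (label 0) samples."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    return pos, neg


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Quadratic in the sample count, which is
    acceptable for an illustration.
    """
    pos, neg = _split(labels, scores)
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tpr_at_fpr(labels: Sequence[int], scores: Sequence[float], max_fpr: float) -> float:
    """Best true positive rate among thresholds whose FPR is at most max_fpr.

    A sample is flagged as AI-generated when its score is >= the threshold.
    """
    pos, neg = _split(labels, scores)
    best = 0.0
    for t in sorted(set(scores)):
        fpr = sum(n >= t for n in neg) / len(neg)
        if fpr <= max_fpr:
            tpr = sum(p >= t for p in pos) / len(pos)
            best = max(best, tpr)
    return best


# Toy data (invented): four human-written essays (0), four AI-generated (1).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.8, 0.4, 0.7, 0.9, 0.95]

print(roc_auc(labels, scores))           # 0.875
print(tpr_at_fpr(labels, scores, 0.0))   # 0.5
print(tpr_at_fpr(labels, scores, 0.25))  # 1.0
```

On this toy data the AUC is reasonably high, yet only half of the AI-generated essays are caught when no false positives are allowed. This gap is why a low-FPR operating point is worth reporting alongside AUC in settings where falsely flagging a human-written essay is costly.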


Paper Citation


in Harvard Style

Nechita M. and Raschip M. (2025). A Comparative Study of ML Approaches for Detecting AI-Generated Essays. In Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA; ISBN 978-989-758-758-0, SciTePress, pages 144-155. DOI: 10.5220/0013570200003967


in Bibtex Style

@conference{data25,
author={Mihai Nechita and Madalina Raschip},
title={A Comparative Study of ML Approaches for Detecting AI-Generated Essays},
booktitle={Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA},
year={2025},
pages={144-155},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013570200003967},
isbn={978-989-758-758-0},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 14th International Conference on Data Science, Technology and Applications - Volume 1: DATA
TI - A Comparative Study of ML Approaches for Detecting AI-Generated Essays
SN - 978-989-758-758-0
AU - Nechita M.
AU - Raschip M.
PY - 2025
SP - 144
EP - 155
DO - 10.5220/0013570200003967
PB - SciTePress