Analysis of the Effectiveness of Large Language Models in Assessing Argumentative Writing and Generating Feedback

Daisy Albuquerque da Silva, Carlos Eduardo de Mello, Ana Garcia

2024

Abstract

This study examines the use of Large Language Models (LLMs) such as GPT-4 in the evaluation of argumentative writing, particularly opinion articles authored by military school students. It explores the potential of LLMs to provide instant, personalized feedback across different writing stages and assesses their effectiveness compared to human evaluators. The study uses a detailed rubric to guide the LLM evaluation, covering competencies from topic choice to bibliographical references. Initial findings suggest that GPT-4 can consistently evaluate technical and structural aspects of writing, offering reliable feedback, especially in the References category. However, its conservative classification approach may underestimate article quality, indicating a need for human oversight. The study also uncovers GPT-4's difficulties with the nuanced and contextual elements of opinion writing, evident in variable precision and low recall when recognizing complete works. These findings highlight the evolving role of LLMs as supplementary tools in education that require integration with human judgment to enhance argumentative writing and critical thinking in academic settings.


Paper Citation


in Harvard Style

Albuquerque da Silva D., Eduardo de Mello C. and Garcia A. (2024). Analysis of the Effectiveness of Large Language Models in Assessing Argumentative Writing and Generating Feedback. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 573-582. DOI: 10.5220/0012466600003636


in Bibtex Style

@conference{icaart24,
author={Daisy Albuquerque da Silva and Carlos Eduardo de Mello and Ana Garcia},
title={Analysis of the Effectiveness of Large Language Models in Assessing Argumentative Writing and Generating Feedback},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2024},
pages={573-582},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012466600003636},
isbn={978-989-758-680-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Analysis of the Effectiveness of Large Language Models in Assessing Argumentative Writing and Generating Feedback
SN - 978-989-758-680-4
AU - Albuquerque da Silva D.
AU - Eduardo de Mello C.
AU - Garcia A.
PY - 2024
SP - 573
EP - 582
DO - 10.5220/0012466600003636
PB - SciTePress
ER -