Automated Evaluation of Database Conversational Agents
Matheus O. Silva, Eduardo Nascimento, Yenier Izquierdo, Melissa Lemos, Marco Casanova
2025
Abstract
Database conversational agents support dialogues that help users interact with databases in their own jargon. One strategy for constructing such agents is to adopt an LLM-based architecture. However, evaluating agent-based systems is complex and lacks a definitive solution, because responses from such systems are open-ended, with no direct relationship between the input and the expected response. This paper therefore focuses on the problem of evaluating LLM-based database conversational agents. It first introduces a tool that explores the schema and the data values of the underlying database to construct test datasets for such agents. The paper then describes an evaluation agent that behaves like a human user to assess the responses of a database conversational agent on a test dataset. Finally, the paper includes a proof-of-concept experiment with an implementation of a database conversational agent over two databases: the Mondial database and an industrial database in production at an energy company.
Paper Citation
in Harvard Style
Silva M., Nascimento E., Izquierdo Y., Lemos M. and Casanova M. (2025). Automated Evaluation of Database Conversational Agents. In Proceedings of the 21st International Conference on Web Information Systems and Technologies - Volume 1: WEBIST; ISBN 978-989-758-772-6, SciTePress, pages 277-288. DOI: 10.5220/0013732900003985
in Bibtex Style
@conference{webist25,
author={Matheus Silva and Eduardo Nascimento and Yenier Izquierdo and Melissa Lemos and Marco Casanova},
title={Automated Evaluation of Database Conversational Agents},
booktitle={Proceedings of the 21st International Conference on Web Information Systems and Technologies - Volume 1: WEBIST},
year={2025},
pages={277-288},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013732900003985},
isbn={978-989-758-772-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 21st International Conference on Web Information Systems and Technologies - Volume 1: WEBIST
TI - Automated Evaluation of Database Conversational Agents
SN - 978-989-758-772-6
AU - Silva M.
AU - Nascimento E.
AU - Izquierdo Y.
AU - Lemos M.
AU - Casanova M.
PY - 2025
SP - 277
EP - 288
DO - 10.5220/0013732900003985
PB - SciTePress