A Semi-Automatic Light-Weight Approach Towards Data Generation for a Domain-Specific FAQ Chatbot Using Human-in-the-Loop

Anum Afzal, Tao Xiang, Florian Matthes

2024

Abstract

Employees at large companies often face long waiting times when they need company-specific information, and someone on the other end must answer those queries manually. Most companies are trying to incorporate LLM-powered conversational agents to speed up this process, but they often struggle to find appropriate training data, especially domain-specific data. This paper introduces a semi-automatic approach for generating domain-specific training data that uses a domain expert as a human-in-the-loop for quality control. We test this approach on an HR use case at a large organization through a retrieval-based question-answering pipeline. We also test the effect of long context on the performance of the FAQ chatbot, for which we employ LongT5, an Efficient Transformer. Our experiments with LongT5 show that including the generated training data improves the performance of the FAQ chatbot during inference.

Paper Citation


in Harvard Style

Afzal A., Xiang T. and Matthes F. (2024). A Semi-Automatic Light-Weight Approach Towards Data Generation for a Domain-Specific FAQ Chatbot Using Human-in-the-Loop. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 42-49. DOI: 10.5220/0012266100003636


in Bibtex Style

@conference{icaart24,
author={Anum Afzal and Tao Xiang and Florian Matthes},
title={A Semi-Automatic Light-Weight Approach Towards Data Generation for a Domain-Specific FAQ Chatbot Using Human-in-the-Loop},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={42-49},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012266100003636},
isbn={978-989-758-680-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - A Semi-Automatic Light-Weight Approach Towards Data Generation for a Domain-Specific FAQ Chatbot Using Human-in-the-Loop
SN - 978-989-758-680-4
AU - Afzal A.
AU - Xiang T.
AU - Matthes F.
PY - 2024
SP - 42
EP - 49
DO - 10.5220/0012266100003636
PB - SciTePress