Authors:
Augusto Gonzalez-Bonorino 1, 2; Eitel J. M. Lauría 2 and Edward Presutti 3
Affiliations:
1 Data Science & Analytics Dept., Information Technology, Marist College, Poughkeepsie, New York, U.S.A.
2 School of Computer Science and Mathematics, Marist College, Poughkeepsie, New York, U.S.A.
3 Enrollment Management, Marist College, Poughkeepsie, New York, U.S.A.
Keyword(s):
Deep Learning, Natural Language Processing, Transformers, AI in Higher Education, Open Domain Question-Answering, BERT, RoBERTa, ELECTRA, MiniLM, Information Retrieval.
Abstract:
Advances in Artificial Intelligence and Natural Language Processing (NLP) can be leveraged by higher-education administrators to augment information-driven support services. However, because the field innovates so rapidly, developing and deploying state-of-the-art systems in such institutions is challenging. This work describes an end-to-end methodology that educational institutions can use as a roadmap for implementing open-domain question answering (ODQA) and building their own intelligent assistants on their online platforms. We show that a retriever-reader framework, composed of an information retrieval component that encodes sparse document vectors and a reader component based on BERT (Bidirectional Encoder Representations from Transformers) fine-tuned with domain-specific data, provides a robust, easy-to-implement architecture for ODQA. Experiments are carried out using variations of BERT fine-tuned with a corpus of questions and answers derived from our institution’s website.
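The retriever-reader flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: TF-IDF weighting with cosine similarity is assumed here as one common way to build sparse document vectors, the sample documents are invented, and `placeholder_reader` is a hypothetical stand-in for the fine-tuned BERT reader, which would extract an answer span from the retrieved passage.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Return (sparse TF-IDF vector per document, idf table)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, docs, vectors, idf, top_k=1):
    """Rank documents by similarity to the question; return the top_k."""
    counts = Counter(tokenize(question))
    query = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    ranked = sorted(
        range(len(docs)), key=lambda i: cosine(query, vectors[i]), reverse=True
    )
    return [docs[i] for i in ranked[:top_k]]

def placeholder_reader(question, passage):
    """Hypothetical stand-in: a fine-tuned BERT reader would return a span."""
    return passage

def answer(question, docs, vectors, idf, reader=placeholder_reader):
    """Retriever-reader pipeline: fetch the best passage, then read it."""
    best_passage = retrieve(question, docs, vectors, idf, top_k=1)[0]
    return reader(question, best_passage)

# Invented sample corpus, standing in for pages from an institution's website.
DOCS = [
    "The admissions office accepts applications until February 1.",
    "The library is open from 8 am to 11 pm on weekdays.",
    "Tuition payments are due at the start of each semester.",
]
VECTORS, IDF = build_index(DOCS)
```

Calling `answer("When is the library open?", DOCS, VECTORS, IDF)` routes the question to the library passage. Swapping `placeholder_reader` for an extractive question-answering model is the step where domain-specific fine-tuning would apply.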