Conversational Interfaces in Serious Games: Identifying Potentials and
Future Research Directions based on a Systematic Literature Review
Barbara Göbl¹, Simone Kriglstein¹,² and Helmut Hlavacs¹
¹Faculty of Computer Science, University of Vienna, Austria
²Faculty of Informatics, Masaryk University, Czech Republic
ORCID: Barbara Göbl https://orcid.org/0000-0001-7186-076X, Simone Kriglstein https://orcid.org/0000-0001-7817-5589, Helmut Hlavacs https://orcid.org/0000-0001-6837-674X
Keywords:
Natural Language Interfaces, Conversational Interfaces, Conversational Agents, Chatbots, Pedagogical
Agents, Serious Games, Game-based Learning, Player-Computer Interaction.
Abstract:
Conversational interfaces have become a popular approach to human-computer interaction in recent years.
While currently often employed in a business context or as personal assistants, recent advances show that their
application potential goes far beyond that. The following paper presents a systematic literature review on the
integration of conversational interfaces using natural language in serious games. We provide an overview of
application domains, designers' motivations and reasoning. Finally, we summarize potentials and pitfalls of
this technology in serious games, identify research gaps and suggest directions for further research.
1 INTRODUCTION
In recent years, conversational interfaces (CIs), often
also referred to as chatbots, have received increased
attention as a means to create engaging human-
computer interactions, guaranteeing ease-of-use and
promising new, exciting experiences (Zadrozny et al.,
2000; Følstad and Brandtzæg, 2017). Additionally, the
popularity of messaging platforms not only provides
developers with a wide range of implementation op-
tions but also underlines the widespread use of CIs
(Shevat, 2017; Smutny and Schreiberova, 2020).
Among the many application domains, educa-
tional approaches are of increasing interest. We want
to explore a specific field in this area, namely serious
games (SGs). This paper focuses on the implemen-
tation of serious game CIs as natural language inter-
faces (NLIs), which allow the players to interact with
the game based on text or speech. This approach may
complement game-based learning in promising new
ways, such as addressing the often daunting amounts
of text in SGs by adding interactive dialogue. Ad-
ditionally, pedagogical chatbots might profit from an
engaging game environment to invite players to inter-
act with them. In the course of this work, we aim to
answer the following research questions:
RQ 1: What potentials and drawbacks does the com-
bination of natural language-based conversational in-
terfaces and serious games hold?
RQ 2: What aspects of natural language-based con-
versational interfaces in serious games need further
examination?
The following paper will present a short back-
ground on CI classification and evaluation, leading
up to a closer examination in the field of SGs. We
conducted a systematic literature review and provide
an analysis of the application of game-based learning,
conversational agents and NLIs and summarize find-
ings from presented evaluations. We identify promising new areas and potential pitfalls, and point out how future projects may further bridge NLIs and game-based learning.
2 BACKGROUND
This section will provide a short background on con-
versational systems, or chatbots, reflecting both the
trends described above and the focus of the literature
review presented below. To allow for a systematic ap-
proach to this broad field, we examine the roles such
systems undertake, i.e. their intended purpose, the
modalities, i.e. types of interfaces, how they are im-
plemented and eventually, evaluated.
Conversational interfaces may be differentiated
according to their goals, namely task-oriented and non-task-oriented (Chen et al., 2017). While the former serves a
specific purpose, the latter mainly aims to entertain
and chat with the user. Educational conversational
agents may be considered as a subcategory of task-
oriented approaches (Taouil et al., 2018). Conver-
sation may be user- or agent-driven (Følstad et al.,
2018) and is implemented in different modalities:
voice- or text-based modes (Shevat, 2017; Hussain
et al., 2019; Piccolo et al., 2018), or even rich in-
terfaces that integrate buttons, text and graphics to
support conversations (Shevat, 2017). Eventually,
designers will also have to decide on the technical
foundation of the system. Next to considerations re-
garding technical infrastructure and implementation
platform, three types of dialogue handling have been
identified: rule-based approaches, retrieval-based ap-
proaches and generative-based approaches (Hussain
et al., 2019; Chen et al., 2017; Adiwardana et al.,
2020). Technical implementation is strongly dependent on the types of interaction, tasks and other requirements.
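To make these dialogue handling strategies more concrete, the following minimal Python sketch contrasts a hand-written rule-based handler with a simple retrieval-based one (a generative approach would instead produce responses with a trained language model). All patterns, stored questions and responses are hypothetical illustrations and do not come from the reviewed systems.

```python
import re
from difflib import SequenceMatcher

# Rule-based handling: hand-written patterns map directly to canned responses.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "Hello! How can I help you?"),
    (re.compile(r"\bopen(ing)? hours?\b", re.I), "We are open from 9 am to 5 pm."),
]

def rule_based_reply(utterance: str) -> str:
    for pattern, response in RULES:
        if pattern.search(utterance):
            return response
    return "Sorry, I did not understand that."  # no-match fallback

# Retrieval-based handling: return the stored response whose indexed question
# is most similar to the user input (here: simple string similarity).
CORPUS = {
    "when do you open": "We are open from 9 am to 5 pm.",
    "how do i reset my password": "Use the 'Forgot password' link on the login page.",
}

def retrieval_based_reply(utterance: str) -> str:
    best = max(CORPUS, key=lambda q: SequenceMatcher(None, utterance.lower(), q).ratio())
    return CORPUS[best]

print(rule_based_reply("Hey there!"))
print(retrieval_based_reply("When are you open?"))
```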
Evaluation of conversational interfaces spans a
large spectrum of aspects and techniques and has
been extensively surveyed in previous work (see e.g.
(Peras, 2018; Ren et al., 2019)). Considering the
claim that conversational interfaces might be the next
logical step in interface design (Shevat, 2017), one
may use established usability metrics such as effec-
tiveness, efficiency and satisfaction. A recent study
refines this approach and maps metrics to the usability characteristics of effectiveness (e.g. task completion, accuracy, expert and user assessment), efficiency (e.g. task completion time, mental effort, communication effort) and satisfaction (e.g. ease-of-use, context-dependent questions, 'want to use again') (Ren et al., 2019). A further point of discrimination is the time
at which the CI is evaluated: many of the mentioned
evaluation techniques focus on a finished product,
while some used evaluations as an intermediary step
in design (Maroengsit et al., 2019). Especially in iterative processes, intermediary evaluations, based on low- or high-fidelity prototypes, might help to adapt the design or even to plan the implementation. As an early step in the design of conversational
interfaces, so-called Wizard of Oz studies may be applied. In these studies, a human (the wizard) takes over the role of the bot and controls the dialogue with the user (Dahlbäck et al., 1993).
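To make the metric mapping described above more tangible, the following minimal sketch computes one example measure per usability characteristic (task completion rate, completion time and a 'want to use again' rating) from hypothetical session logs; the data structure, field names and values are purely illustrative assumptions, not taken from any of the surveyed studies.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Session:
    completed: bool          # effectiveness: did the player finish the task?
    duration_s: float        # efficiency: task completion time in seconds
    want_to_use_again: int   # satisfaction: 1-5 Likert rating

sessions = [
    Session(True, 312.0, 5),
    Session(False, 540.0, 2),
    Session(True, 287.5, 4),
]

# Effectiveness: share of sessions in which the task was completed.
task_completion_rate = mean(s.completed for s in sessions)
# Efficiency: mean completion time over successful sessions only.
mean_completion_time = mean(s.duration_s for s in sessions if s.completed)
# Satisfaction: average 'want to use again' rating.
satisfaction = mean(s.want_to_use_again for s in sessions)

print(f"completion rate: {task_completion_rate:.2f}")
print(f"mean completion time: {mean_completion_time:.1f}s")
print(f"'want to use again': {satisfaction:.1f}/5")
```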
Regarding games, dialogue and chat-based inter-
action have a long history, ranging from text-based
adventures to dialogue-rich modern-day games. However, integrating them into serious games and adding natural language processing certainly poses a set of new challenges and tasks for this technology. In an educational context, conversational interfaces have been implemented for various applications.
For example, as a motivational tool in language learn-
ing courses (Fryer et al., 2017), to lower barriers for
teenagers in sex education (Crutzen et al., 2011) or to
address psychological issues, such as stress release or
to remedy symptoms of depression and anxiety (Fitz-
patrick et al., 2017; Huang et al., 2015). However,
these approaches do not provide a game environment
that embeds these conversations and are not further
scrutinized in this paper.
3 METHODOLOGY
To identify and analyse works that implement conver-
sational interfaces in a game setting, we conducted
a systematic literature review. The following query was used: ("chat bot" OR "chatbot" OR "dialogue system" OR "conversational agent" OR "conversational agents" OR "natural language") AND ("serious games" OR "serious game" OR "game-based learning" OR "purposeful games"). A search in the databases Scopus, IEEE Xplore, ACM Digital Library and SpringerLink resulted in a total of 124 papers. After elimination of duplicates and
further analysis, a body of N = 30 works matched the
criteria of NLIs embedded in serious games. Publica-
tion years ranged from 2008 to 2020. As can be seen
in Table 1, some papers described different aspects of
the same NLI. Overall, the body of work described
23 individual conversational agents. Different implementations or iteration steps were treated as separate entries in Table 1, whereas papers describing different aspects of the same implementation were summarized in one entry. After closer inspection, we found
that 13 of these works presented evaluations that in-
cluded aspects of the integrated NLI. Thus, a total of
17 papers focused on the description of design, architecture, implementation or use cases, or evaluated unrelated aspects of the serious game. Due to their similar motivations and the holistic evaluations, solutions in which only parts of the conversational interface were based on natural language interaction were considered as well.
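For transparency, a boolean query of this kind can also be assembled programmatically before it is submitted to the individual database interfaces; the following small sketch merely rebuilds the reported term lists into the query string and is not the actual tooling used for this review.

```python
# Hypothetical helper that rebuilds the boolean search string reported above;
# the actual searches were run through the databases' own search interfaces.
ci_terms = ["chat bot", "chatbot", "dialogue system",
            "conversational agent", "conversational agents", "natural language"]
sg_terms = ["serious games", "serious game", "game-based learning", "purposeful games"]

def or_group(terms):
    """Join quoted terms into a parenthesized OR group."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"

query = f"{or_group(ci_terms)} AND {or_group(sg_terms)}"
print(query)
```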
4 RESULTS
Below, we analyse the resulting body of work (N =
30) regarding application domains, types of evalua-
tion, and specific purpose of the integrated conversa-
tional interfaces. To summarize how CIs may sup-
Table 1: An overview of the analysed body of literature.
Modality | Purpose | Motivations | Application Domain | Eval | Eval. Lvl | Authors
EA | soc & comm | sim | Health, Mental Health | | | (Augello et al., 2016c), (Augello et al., 2016a), (Augello et al., 2016b)
EA, text | know | imm, learn | History and Cultural Heritage | | | (Bellotti et al., 2011)
EA | know | imm, learn | History and Cultural Heritage | | | (Bellotti et al., 2008)
rich | controls | eng | IT Literacy | yes | 1, 2 | (Berger et al., 2019)
EA, text | soc & comm | sim | General Professional Skills | yes | 1 | (Callejas et al., 2014)
text | assess, analyse | | Microbiology | yes | 2 | (Carpenter et al., 2020)
voice | assess, analyse | | Speech Analysis | | | (Corrales-Astorgano et al., 2018)
EA, text | interact | eng, mot | Politics & Law | | | (Economou et al., 2014), (Economou et al., 2015)
EA | know | sim, learn | Research Skills | yes | Other | (Forsyth et al., 2015)
EA | | | General Professional Skills | | | (Jepp et al., 2010)
EA | soc & comm | | General | yes | Other | (Lala et al., 2019)
EA, voice | | learn | Microbiology | yes | 2 | (Lee et al., 2010)
EA, text | know | imm | History and Cultural Heritage | yes | 1, 2 | (Mori et al., 2013)
EA, voice | know | imm, learn | History and Cultural Heritage | yes | | (Neto et al., 2015)
text | soc & comm | sim | General Professional Skills | | | (Othlinghaus and Hoppe, 2016)
EA | interact | imm, ia | History and Cultural Heritage | | | (Panzoli et al., 2010)
EA, voice | interact | sim, imm, learn | Language Learning | | | (Powers et al., 2008)
text | | sim | Plastics Industry | yes | | (Rojas-Barahona and Cerisara, 2014)
voice | assess, analyse | | Disaster Awareness | | 1 | (Sermet and Demir, 2019)
EA, voice | controls | | IT Literacy | | | (Stefanidi et al., 2020)
 | | sim | Learning Disabilities | | | (Taouil et al., 2018)
EA | | eng, mot | Health, Mental Health | | | (Tironi et al., 2019)
text | assess, analyse | learn | General | yes | 1 | (Toma et al., 2018)
voice | assess, analyse | ia | General | yes | 1 | (Toncu et al., 2020)
text | controls | acc | General | | | (Torrente et al., 2012)
EA, text | soc & comm, assess, analyse | sim | General Professional Skills | yes | | (Vaassen and Daelemans, 2010)
text | know | learn | Microbiology | yes | 1, 2 | (Wiggins et al., 2019)
Modality: EA = embodied agent; Purpose: soc & comm = social and communication skills, know = present knowledge, controls = language-based commands, interact = learning through interaction, assess, analyse = automated assessment and analysis; Motivation: sim = simulate interaction, imm = immersion, learn = improve learning outcome, eng = engagement, mot = motivation, ia = improve interaction, acc = accessibility; Evaluation Level: 1 = reaction, 2 = learning (Kirkpatrick and Kirkpatrick, 2006)
port the goals of the serious game, we further supple-
ment this data with the motivations the authors state
for the integration of a CIs. We considered moti-
vations if explicitly mentioned or if authors under-
lined these aspects in relevant related works. Purposes
of the bots where categorized based on our analysis
of the examined works and summarized into 5 cat-
egories describing the NLI’s task within the serious
game: training social and communication skills, pre-
senting knowledge via dialogue, learning through in-
teraction, game controls, and assessment and analy-
sis of player-provided data. Analysed works imple-
mented NLIs using three different modalities, namely
embodied agents (EA), text-based and voice-based in-
terfaces. Some works did not explicitly state whether
their design relies on voice- or text-based interaction.
4.1 Application Domains, Purpose and
Motivations
The analysed body of work suggests a broad range of application domains, ranging from health and mental health to information technology (IT) literacy. Furthermore, a few motives come up repeatedly in the argumentation for the inclusion of CIs, such as the simulation of human interaction (n = 8), improving immersion (n = 6), improving motivation (n = 2) and engagement (n = 3), improving learning outcomes (n = 8) or improving interaction (n = 2). Several works did
not include specifics on their motivation to include
CIs.
CIs’ tasks may directly relate to the serious game’s
goal, i.e. in the form of presentation of knowledge
through dialogue. For example, analysed works in the
sector of history and cultural heritage feature embod-
ied agents, and respective authoring tools, that convey
historical or cultural facts throughout the conversation
with the player (Bellotti et al., 2008; Bellotti et al.,
2008; Mori et al., 2013). In other works, the inter-
action itself and related observation and exploration,
rather than the dialogue’s content, aim to support the
learning goals. This is the case for social and com-
municative skill training but also serious game envi-
ronments that allow the player to observe, explore and
interact in self-determined approaches (Powers et al.,
2008; Panzoli et al., 2010). A number of reviewed
works (n = 6) also include speech- or text-based in-
terfaces to gather natural language data as a basis for
assessment and analysis, regarding e.g. the depth of
reflection. Furthermore, NLIs often fulfill tasks re-
lated to game progress and navigation, e.g. by han-
dling natural language player commands (n = 3).
Provided details on the technical implementations
varied widely in the examined works, thus complicat-
ing overall comparison. However, stated techniques included AIML (Artificial Intelligence Markup Language, http://www.aiml.foundation/) (Augello et al., 2016c; Othlinghaus and Hoppe, 2016), text-retrieval-based approaches and metrics (e.g. TF-IDF, (Bellotti et al., 2011; Mori et al., 2013)) or the use of external solutions and service providers such as Google Dialogflow (https://cloud.google.com/dialogflow) (Tironi et al., 2019), wit.ai (https://wit.ai/) (Toncu et al., 2020) or the Microsoft Bot Framework (https://dev.botframework.com/) (Berger et al., 2019). For automated assessment and analysis, toolkits such as scikit-learn (https://scikit-learn.org/stable/) (Carpenter et al., 2020) and openSMILE (https://www.audeering.com/opensmile/) (Corrales-Astorgano et al., 2018) were used.
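To illustrate how such a text-retrieval approach with TF-IDF might look in practice, the sketch below ranks stored answers by cosine similarity to the player's utterance using scikit-learn; the in-game questions, answers and the no-match threshold are hypothetical and do not reproduce any of the reviewed implementations.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical knowledge base: indexed questions and the answers a serious
# game agent would return for them.
qa_pairs = [
    ("Who built this cathedral?", "It was built by local guilds in the 14th century."),
    ("What is this painting about?", "It depicts a scene from the city's founding legend."),
    ("How do I open the gate?", "Try asking the guard for the key."),
]

questions = [q for q, _ in qa_pairs]
vectorizer = TfidfVectorizer()
question_vectors = vectorizer.fit_transform(questions)

def answer(utterance: str) -> str:
    """Return the answer whose indexed question is most similar to the input."""
    scores = cosine_similarity(vectorizer.transform([utterance]), question_vectors)[0]
    best = scores.argmax()
    if scores[best] < 0.2:  # crude no-match threshold
        return "I am not sure I understood that."
    return qa_pairs[best][1]

print(answer("who was it that built the cathedral?"))
```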
4.2 Evaluation and Findings
Most papers present an evaluation of the overall expe-
rience and do not look at the CI’s and serious game’s
aspects separately. Thus, we based our analysis of the
level of evaluation on the model of training evaluation
(Kirkpatrick and Kirkpatrick, 2006).
Evaluations focusing on Kirkpatrick’s first level,
which relates to the players’ ”reaction” to the serious
game, mostly concentrated on feedback regarding us-
ability and satisfaction of the overall serious game.
Evaluations of affective interaction, emotion classification and emotion induction suggested that NLIs support affective interaction. One study
found that combined verbal and non-verbal behaviour
of embodied agents successfully conveyed social atti-
tudes during virtual job interviews: In order to chal-
lenge and comfort the player, the embodied agent and
its CI managed to support the portrayal of friendly,
neutral and hostile attitudes (Callejas et al., 2014).
By manipulating affect in pedagogical agents' discourse, simulating different moods, researchers found a correlation between learners' arousal levels and learning outcomes (Forsyth et al., 2015).
Additionally, pointing out the relevance of emotional
aspects, different machine learning algorithms have
been tested to identify optimal approaches to emo-
tion classification based on natural language input
(Vaassen and Daelemans, 2010). A model trained
with manually annotated emotion classifications for
sentences helped to identify a memory-based approach, TiMBL (Daelemans et al., 2005), as the best machine learning solution to automate this task. Sim-
ilarly, in an assessment of the reflection level of natural language input written in-game, the potential of
various algorithms to accurately determine reflection levels was demonstrated (Carpenter et al., 2020). Usability tests in regard to the
overall game setting including CIs report positive at-
titudes and interest towards the chosen learning ap-
proaches (Mori et al., 2013; Toma et al., 2018) and re-
port feedback commending the entertainment factor and
the innovative approach (Toncu et al., 2020). How-
ever, these works also demonstrate some drawbacks
of the approach: for one, participants in a case study
including automated assessment reported that they
lacked feedback to understand the automated process
(Toma et al., 2018). For another, around 50% of par-
ticipants in one study reported negative feelings about
mismatched answers and a system’s failure to cor-
rectly interpret commands (Toncu et al., 2020). In
the game ”Communicate”, a system to provide high-
lighting of suggested dialogue options based on natu-
ral language input and hints in case of no-match sce-
narios was tested and found to have little effect on
user behaviour and outcome (Lala et al., 2019). How-
ever, the authors find that providing open-text input
possibilities increased the interaction with the system.
"PrivaCity", a game based mostly on text interaction, reports some of the common
issues that serious games face: user characteristics,
such as game play frequency and game play skills,
may strongly influence user satisfaction in that fre-
quent players report lower satisfaction scores (Berger
et al., 2019). Nevertheless, the authors report posi-
tive learning outcomes in self-report studies. A fur-
ther proof-of-concept has been delivered by means of
a Wizard of Oz study that finds a positive impact of
NLIs on learning (Lee et al., 2010). Another prelimi-
nary study cautiously supports the claims of improved
learning and good satisfaction ratings (Neto et al.,
2015) and points out that this approach shows better
learning than presentation-based approaches. How-
ever, in another study comparing reading to a seri-
ous game implementing an intelligent dialogue sys-
tem, the latter showed lower rankings in user-reported
learning outcomes and equal rankings in expert eval-
uations (Mori et al., 2013). Finally, a study investi-
gating mixed initiative vs. no agent initiative settings
showed that the former resulted in higher engagement
but there was no overall difference in learning or frus-
tration scores (Wiggins et al., 2019).
5 DISCUSSION
This paper aims to provide insights into the underpin-
nings of natural language interfaces in serious games.
While several works stress the potential of this tech-
nology, a number of aspects need further investiga-
tion. Below, we first aim to point out our findings in
regard to RQ 1: ”What potentials and drawbacks does
the combination of natural-language-based conversa-
tional interfaces and serious games hold?”. Thus, we
discuss promising aspects of CIs in serious gaming
environments but also identify what issues need re-
solving. Subsequently, we address RQ 2, ”What as-
pects of natural language-based conversational inter-
faces in serious games need further examination?”, by
identifying areas that have not been sufficiently exam-
ined and may be promising for future research.
5.1 Identified Potentials and Drawbacks
of Serious Game Chatbots
Below, we aim to summarize findings and applica-
tions of NLIs in games and point out what poten-
tials are identified in previous literature. Similar
to Brandtzaeg’s analysis of chatbot user motivations
(Brandtzaeg and Følstad, 2017), ranging from productivity or entertainment to social motivations and curiosity, one might take a closer look at the motivations for serious game designers to include these
systems. As pointed out previously, dialogue is a
promising way to enhance text-based learning with
more interactivity and reduce the stigma of suppos-
edly boring reads (Bellotti et al., 2011). Additionally,
the high number (n = 16) of implementations in the form of embodied agents further adds to the potential gain of natural language-based interaction in seri-
ous games. This may be attributed to previous find-
ings such as the ’embodied agent effect’ (Atkinson,
2002) which reports increased motivation and learn-
ing through social agents in the form of faces or em-
bodied agents that also add non-verbal cues. These
agents not only improve interaction but support social
elements as well (Augello et al., 2016c), a fact that is underlined by the various implementations in the studies described above that task the agents, or other NLIs, with training social and communicative skills or simulating human-human interaction. Thus, this tech-
nique provides a safe environment for professional
training and enables researchers and companies to
simulate otherwise costly human resources in training
and experiments (Forsyth et al., 2015; Augello et al.,
2016b). A further interesting application for research
is demonstrated by works using these techniques for
automated assessment and analysis (Carpenter et al.,
2020; Corrales-Astorgano et al., 2018; Vaassen and
Daelemans, 2010). This is also promising for adap-
tive systems, such as intelligent tutoring systems.
However, our analysis also finds some drawbacks
and pitfalls when integrating NLIs in serious games.
Especially in assessment tasks, the provided feedback
is not always comprehensible for players (Toma et al.,
2018), potentially hindering acceptance. Further-
more, no-match scenarios, in which the system cannot
identify a matching answer, or mismatch scenarios
represent an additional challenge. This is reflected in
the above discussed findings (Toncu et al., 2020), where
50% of participants report problems during interac-
tion with the voice-based chatbot that failed to inter-
pret commands correctly or did not follow them as ex-
pected. Previous literature also points out that users'
initially enthusiastic engagement with bots is often
thwarted quickly if expectations are not met (Piccolo
et al., 2018). Especially considering the many moti-
vations discussed above, ranging from realistic virtual
actors to improved immersion, these scenarios might
interfere with the game’s success - on both the level
of usability and learning outcome. In line with recent
suggestions (Jain et al., 2018), it might be helpful to
clarify in advance what tasks and interactions the CI
is able to handle and what its limits are.
Last, but not least, choices regarding implemen-
tation can only be made with the available resources
in mind: available platforms, provided budget and of
course, if applicable, whether data-driven implemen-
tations can be built on sufficiently large data sets. Pre-
vious work mentions, that this approach might cur-
rently not always be feasible due to the lack of avail-
able training data for specific use cases (Lala et al.,
2019). Thus, applications embedded in a learning
context might have more difficulty to train their bots
than commercial or open-domain bots. This might be
addressed by continuous training or narrowing down
the type of questions and answers that the bot will
provide (Gonda et al., 2018).
5.2 Research Gaps and Future
Directions for Research
Especially in the field of evaluation, many opportunities present themselves for future research. In se-
rious game evaluation, the ultimate goal is to scru-
tinize a game’s suitability and effectiveness regarding
its goals and application setting (Emmerich and Bock-
holt, 2016). This underlines the approach of most of
the above discussed works, which do not evaluate the
NLI separately but take a holistic approach. However,
comparative evaluation of solutions with or without
the interface might provide helpful data to consider
the costs and benefits of developing CIs for serious
games. Generally speaking, as the area of NLP in se-
rious games is still a rather young field, spurred on by
the constant advancements of NLP technology, it still
lacks systematic classification, categorization, evalu-
ation or general guidelines for future practitioners and
researchers.
5.3 Limitations
Due to the varying level of detail provided in the anal-
ysed body of work, we tailored our overview to a level
that would allow all works to be considered and cat-
egorised. This results in rather high-level categories
in our analysis which may not fully represent data
in more detailed works and does not fully accom-
modate previously introduced evaluation categories
as discussed in section 2. Furthermore, the focus on
conversational interfaces may not fully consider
all relevant aspects in a complex environment such as
game-based learning and interaction. Motivations to
combine natural language interfaces and game aspects
might profit from a further inspection of such inter-
faces in entertainment games, which was beyond the
scope of this work.
6 CONCLUSION
The presented paper allows insights into the application
of natural language interaction and processing in the
field of serious games. The examined body of liter-
ature points out many potentials of this combination,
e.g. in terms of supporting immersion, engagement or
learning outcome. However, this rather young field of
research might profit from a more systematic approach based on the many preceding works regarding conver-
sational interfaces.
ACKNOWLEDGEMENTS
Barbara Göbl is supported by a DOC-Team scholarship of the Austrian Academy of Sciences.
REFERENCES
Adiwardana, D., Luong, M.-T., So, D. R., Hall, J.,
Fiedel, N., Thoppilan, R., Yang, Z., Kulshreshtha,
A., Nemade, G., Lu, Y., et al. (2020). Towards
a human-like open-domain chatbot. arXiv preprint
arXiv:2001.09977.
Atkinson, R. K. (2002). Optimizing learning from exam-
ples using animated pedagogical agents. Journal of
Educational Psychology, 94(2):416–427.
Augello, A., Gentile, M., and Dignum, F. (2016a). So-
cial agents for learning in virtual environments. In
International Conference on Games and Learning Al-
liance, pages 133–143. Springer.
Augello, A., Gentile, M., Weideveld, L., and Dignum, F.
(2016b). Dialogues as social practices for serious
games. In Proceedings of the Twenty-second Eu-
ropean Conference on Artificial Intelligence, pages
1732–1733. IOS Press.
Augello, A., Gentile, M., Weideveld, L., and Dignum, F.
(2016c). A model of a social chatbot. In Intelligent
Interactive Multimedia Systems and Services 2016,
pages 637–647. Springer.
Bellotti, F., Berta, R., De Gloria, A., and Lavagnino, E.
(2011). Towards a conversational agent architecture to
favor knowledge discovery in serious games. In Pro-
ceedings of the 8th International Conference on Ad-
vances in Computer Entertainment Technology, ACE
’11. Association for Computing Machinery.
Bellotti, F., Berta, R., De Gloria, A., Primavera, L., and
Zappi, V. (2008). Travel in europe: An online envi-
ronment to promote cultural heritage. The IPSI BgD
Transactions on Internet Research, 14(1).
Berger, E., Sæthre, T. H., and Divitini, M. (2019). Pri-
vacity. In International Conference on Informatics
in Schools: Situation, Evolution, and Perspectives,
pages 293–304. Springer.
Brandtzaeg, P. B. and Følstad, A. (2017). Why people
use chatbots. In Kompatsiaris, I., Cave, J., Satsiou,
A., Carle, G., Passani, A., Kontopoulos, E., Diplaris,
S., and McMillan, D., editors, Internet Science, pages
377–392, Cham. Springer International Publishing.
Callejas, Z., Ravenet, B., Ochs, M., and Pelachaud, C.
(2014). A model to generate adaptive multimodal job
interviews with a virtual recruiter. Proceedings of the
9th International Conference on Language Resources
and Evaluation, LREC 2014, pages 3615–3619.
Carpenter, D., Geden, M., Rowe, J., Azevedo, R., and
Lester, J. (2020). Automated analysis of mid-
dle school students’ written reflections during game-
based learning. In International Conference on Artifi-
cial Intelligence in Education, pages 67–78. Springer.
Chen, H., Liu, X., Yin, D., and Tang, J. (2017). A survey on
dialogue systems: Recent advances and new frontiers.
SIGKDD Explor. Newsl., 19(2):25–35.
Corrales-Astorgano, M., Martínez-Castilla, P., Mancebo, D. E., Aguilar, L., Ferreras, C. G., and Cardeñoso-Payo, V. (2018). Towards an automatic evaluation of the prosody of people with Down syndrome. In IberSPEECH, pages 112–116.
Crutzen, R., Peters, G.-J. Y., Portugal, S. D., Fisser, E. M.,
and Grolleman, J. J. (2011). An artificially intelligent
chat agent that answers adolescents’ questions related
to sex, drugs, and alcohol: an exploratory study. Jour-
nal of Adolescent Health, 48(5):514–519.
Daelemans, W., Van den Bosch, A., et al. (2005). Memory-
based language processing. Cambridge University
Press.
Dahlbäck, N., Jönsson, A., and Ahrenberg, L. (1993). Wizard of Oz studies: Why and how. In Proceedings of the 1st International Conference on Intelligent User Interfaces, IUI '93, pages 193–200, New York. ACM.
Economou, D., Doumanis, I., Bouki, V., Pedersen, F.,
Kathrani, P., Mentzelopoulos, M., and Georgalas, N.
(2014). A dynamic role-playing platform for simula-
tions in legal and political education. In 2014 Inter-
national Conference on Interactive Mobile Communi-
cation Technologies and Learning (IMCL2014), pages
232–236. IEEE.
Economou, D., Doumanis, I., Pedersen, F., Kathrani, P.,
Mentzelopoulos, M., and Bouki, V. (2015). Eval-
uation of a dynamic role-playing platform for sim-
ulations based on octalysis gamification framework.
In Intelligent Environments (Workshops), pages 388–
395.
Emmerich, K. and Bockholt, M. (2016). Serious Games
Evaluation: Processes, Models, and Concepts, pages
265–283. Springer International Publishing, Cham.
Fitzpatrick, K. K., Darcy, A., and Vierhile, M. (2017). De-
livering cognitive behavior therapy to young adults
with symptoms of depression and anxiety using a fully
automated conversational agent (woebot): a random-
ized controlled trial. JMIR mental health, 4(2):e19.
Følstad, A. and Brandtzæg, P. B. (2017). Chatbots and the
new world of HCI. Interactions, 24(4):38–42.
Følstad, A., Skjuve, M., and Brandtzaeg, P. B. (2018). Dif-
ferent chatbots for different purposes: towards a typol-
ogy of chatbots to understand interaction design. In
International Conference on Internet Science, pages
145–156. Springer.
Forsyth, C. M., Graesser, A., Olney, A. M., Millis, K.,
Walker, B., and Cai, Z. (2015). Moody agents: af-
fect and discourse during learning in a serious game.
In International Conference on Artificial Intelligence
in Education, pages 135–144. Springer.
Fryer, L. K., Ainley, M., Thompson, A., Gibson, A., and
Sherlock, Z. (2017). Stimulating and sustaining inter-
est in a language course: An experimental comparison
of chatbot and human task partners. Computers in Hu-
man Behavior, 75:461 – 468.
Gonda, D. E., Luo, J., Wong, Y., and Lei, C. (2018). Evalu-
ation of developing educational chatbots based on the
seven principles for good teaching. In 2018 IEEE In-
ternational Conference on Teaching, Assessment, and
Learning for Engineering (TALE), pages 446–453.
Huang, J., Li, Q., Xue, Y., Cheng, T., Xu, S., Jia, J., and
Feng, L. (2015). Teenchat: a chatterbot system for
sensing and releasing adolescents’ stress. In Inter-
national Conference on Health Information Science,
pages 133–145. Springer.
Hussain, S., Sianaki, O. A., and Ababneh, N. (2019). A sur-
vey on conversational agents/chatbots classification
and design techniques. In Workshops of the Interna-
tional Conference on Advanced Information Network-
ing and Applications, pages 946–956. Springer.
Jain, M., Kumar, P., Kota, R., and Patel, S. N. (2018). Eval-
uating and informing the design of chatbots. In Pro-
ceedings of the 2018 Designing Interactive Systems
Conference, pages 895–906.
Jepp, P., Fradinho, M., and Pereira, J. M. (2010). An agent
framework for a modular serious game. In 2010 Sec-
ond International Conference on Games and Virtual
Worlds for Serious Applications, pages 19–26. IEEE.
Kirkpatrick, D. and Kirkpatrick, J. (2006). Evaluating
training programs: The four levels. Berrett-Koehler
Publishers.
Lala, R., Jeuring, J., and van Geest, M. (2019). Scaf-
folding open text input in a scripted communication
skills learning environment. In International Confer-
ence on Games and Learning Alliance, pages 169–
179. Springer.
Lee, S. Y., Mott, B. W., and Lester, J. C. (2010). Optimiz-
ing story-based learning: An investigation of student
narrative profiles. In International conference on in-
telligent tutoring systems, pages 155–165. Springer.
Maroengsit, W., Piyakulpinyo, T., Phonyiam, K.,
Pongnumkul, S., Chaovalit, P., and Theeramunkong,
T. (2019). A survey on evaluation methods for
chatbots. In Proceedings of the 2019 7th Interna-
tional Conference on Information and Education
Technology, ICIET 2019, page 111–119, New York.
ACM.
Mori, D., Berta, R., De Gloria, A., Fiore, V., and Magnani,
L. (2013). An easy to author dialogue management
system for serious games. Journal on Computing and
Cultural Heritage, 6(2):1–15.
Neto, J., Ribeiro, C., Pereira, J., and Neto, M. J. (2015). Vir-
tual agents and multi-modality of interaction in multi-
media applications for cultural heritage. In Proceed-
ings of the 10th International Conference on Com-
puter Graphics Theory and Applications., pages 446–
453. SCITEPRESS-Science and Technology Publica-
tions.
Othlinghaus, J. and Hoppe, H. U. (2016). Supporting group
reflection in a virtual role-playing environment. In In-
ternational Conference on Intelligent Technologies for
Interactive Entertainment, pages 292–298. Springer.
Panzoli, D., Qureshi, A., Dunwell, I., Petridis, P., de Fre-
itas, S., and Rebolledo-Mendez, G. (2010). Levels
of interaction (loi): a model for scaffolding learner
engagement in an immersive environment. In Inter-
national Conference on Intelligent Tutoring Systems,
pages 393–395. Springer.
Peras, D. (2018). Chatbot evaluation metrics. Economic
and Social Development: Book of Proceedings, pages
89–97.
Piccolo, L. S., Mensio, M., and Alani, H. (2018). Chasing
the chatbots. In International Conference on Internet
Science, pages 157–169. Springer.
Powers, D. M., Leibbrandt, R., Pfitzner, D., Luerssen, M.,
Lewis, T., Abrahamyan, A., and Stevens, K. (2008).
Language teaching in a mixed reality games environ-
ment. In Proceedings of the 1st international confer-
ence on Pervasive Technologies Related to Assistive
Environments, pages 1–7, New York. ACM.
Ren, R., Castro, J. W., Acuña, S. T., and de Lara, J. (2019). Evaluation techniques for chatbot usability:
(2019). Evaluation techniques for chatbot usability:
A systematic mapping study. International Journal
of Software Engineering and Knowledge Engineering,
29(11n12):1673–1702.
Rojas-Barahona, L. M. and Cerisara, C. (2014). Bayesian
inverse reinforcement learning for modeling conver-
sational agents in a virtual environment. In Interna-
tional Conference on Intelligent Text Processing and
Computational Linguistics, pages 503–514. Springer.
Sermet, Y. and Demir, I. (2019). Flood action VR: a virtual
reality framework for disaster awareness and emer-
gency response training. In ACM SIGGRAPH 2019
Posters, pages 1–2.
Shevat, A. (2017). Designing bots: Creating conversational
experiences. O’Reilly Media, Inc.
Smutny, P. and Schreiberova, P. (2020). Chatbots for learn-
ing: A review of educational chatbots for the Facebook Messenger. Computers & Education, page 103862.
Stefanidi, E., Arampatzis, D., Leonidis, A., Korozi, M., An-
tona, M., and Papagiannakis, G. (2020). Magiplay:
An augmented reality serious game allowing children
to program intelligent environments. In Transactions
on Computational Science XXXVII, pages 144–169.
Springer.
Taouil, M., Begdouri, A., and Majda, A. (2018). Adap-
tive dialogue system for disabled learners: Towards a
learning disabilities model. In 2018 IEEE 5th Interna-
tional Congress on Information Science and Technol-
ogy (CiSt), pages 422–427. IEEE.
Tironi, A., Mainetti, R., Pezzera, M., and Borghese, N. A.
(2019). An empathic virtual caregiver for assistance
in exer-game-based rehabilitation therapies. In 2019
IEEE 7th International Conference on Serious Games
and Applications for Health (SeGAH), pages 1–6.
IEEE.
Toma, I., Bacioiu, F., Dascalu, M., and Trausan-Matu, S.
(2018). The edutainment platform: Interactive sto-
rytelling relying on semantic similarity. In Chal-
lenges and Solutions in Smart Learning, pages 87–96.
Springer.
Toncu, S., Toma, I., Dascalu, M., and Trausan-Matu, S.
(2020). Escape from dungeon—modeling user inten-
tions with natural language processing techniques. In
Ludic, Co-design and Tools Supporting Smart Learn-
ing Ecosystems and Smart Education, pages 91–103.
Springer.
Torrente, J., Marchiori, E. J., Vallejo-Pinto, J. Á., Ortega-Moral, M., Moreno-Ger, P., and Fernández-Manjón,
B. (2012). Eyes-free interfaces for educational games.
In 2012 International Symposium on Computers in
Education (SIIE), pages 1–6. IEEE.
Vaassen, F. and Daelemans, W. (2010). Emotion classifi-
cation in a serious game for training communication
skills. LOT Occasional Series, 16:155–168.
Wiggins, J. B., Kulkarni, M., Min, W., Boyer, K. E., Mott,
B., Wiebe, E., and Lester, J. (2019). Take the initia-
tive: Mixed initiative dialogue policies for pedagog-
ical agents in game-based learning environments. In
International Conference on Artificial Intelligence in
Education, pages 314–318. Springer.
Zadrozny, W., Budzikowska, M., Chai, J., Kambhatla, N.,
Levesque, S., and Nicolov, N. (2000). Natural Lan-
guage Dialogue for Personalized Interaction. Com-
mun. ACM, 43(8):116–120.