Using VR Games in Learning Spoken English: A VR Instructional
Design
Lingfang Li 1,*,†,a, Wenqi Li 2,†,b and Leyi Wang 3,†,c
1 Department of Computer Science, University of Liverpool, Liverpool, Merseyside, U.K.
2 School of Foreign Languages and Literature, Beijing Normal University, Beijing, China
3 College of Mount Saint Vincent, 6301 Riverdale Avenue, The Bronx, New York, U.S.A.
† These authors contributed equally
Keywords: Oral English Education, Virtual Reality (VR), Game Teaching.
Abstract: In recent years, with the continuous growth of international exchange, fluency in spoken English has
become a need and a goal for more and more Chinese students. However, domestic classroom teaching
cannot meet this need. To address the problem that oral English is difficult to learn and challenging
to practice, this paper designs a spoken English learning system based on VR technology. Creating a
language environment with rich interaction not only stimulates and maintains learners' interest but
also allows players to interact with personalized guiding characters. The characters are dubbed by native
English speakers and placed in various cultural settings to create a close-to-real speaking practice
environment for learners. Another feature of this system is that players can improve their English speaking
and communicative skills in the game, and learners of different levels can improve through communication
that suits them. It is hoped that this system can provide some reference and inspiration for oral
English teaching in the future. With the support of VR technology, oral English learning is expected to
have broader prospects.
a https://orcid.org/0000-0001-6680-3197
b https://orcid.org/0000-0001-6364-7846
c https://orcid.org/0000-0003-3787-867X
1 INTRODUCTION
With the development of international
communication, English has become a global
language. Increasingly frequent communication in
English between people from different regions
facilitates the development of many fields.
China has the world's highest number of English
learners, but English education in China has
widespread problems, particularly in the teaching
of spoken English. To enhance the quality
of spoken English education, a variety of methods
for studying spoken English need to be used.
According to the Common European Framework
of Reference for Languages (CEF), most Chinese
learners have a proficiency in English between A1
and B1, which means they have a basic command of
it. Nevertheless, for the majority of Chinese
students, research shows that their English fluency is
far below the requirements of their English
textbooks (Cortazzi & Jin, 1996). Because of the
examination-oriented education system, many
teachers take English test scores as the only
criterion for evaluating students' English
proficiency. This gives students a utilitarian attitude
toward oral English learning and makes it difficult
to develop their foreign language skills fully.
As a result, students' oral English
communication skills are weak. How to
adapt to the new demands of English communication
and improve Chinese students'
oral English communication ability is therefore an
urgent problem.
In addition, most students feel anxious about
learning English in classroom situations because
they are afraid of making mistakes. Jakobovits
(1971) found that the factors that affect language
learning fall, in order of scale, into the following
categories: ability, emotion, intelligence, and others.
Ability is related to learners' motivation and
confidence in learning a second language. Students
with strong foreign language learning motivation and
high confidence can get better results. Because of
the pressure and anxiety of foreign language learning
in classroom situations, classroom teaching has
little effect on students' English learning. In
addition, interaction has a slow but stable positive
effect on language learning. Learning activities that
simulate real-life situations make it possible to
learn in an integrated way. Learning a foreign
language in real-life settings also helps students
build cultural competence. With the growing popularity
of computers and Virtual Reality (VR) technology,
English learning has gradually expanded
from textbooks and classroom teaching to
computer-assisted language learning (CALL)
(Schwienhorst, 1998).
Li, L., Li, W. and Wang, L.
Using VR Games in Learning Spoken English: A VR Instructional Design.
DOI: 10.5220/0011913000003613
In Proceedings of the 2nd International Conference on New Media Development and Modernized Education (NMDME 2022), pages 414-420
ISBN: 978-989-758-630-9
Copyright © 2023 by SCITEPRESS Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
In recent years, several emerging pedagogical
methods and techniques have appeared that can
improve the situation described above. The
literature proposes many approaches to learning
oral English communication skills (OECS). Figure 1
shows nine main techniques for teaching spoken
English. Many studies have found that mobile
learning, social media, and virtual language
communities can foster oral communicative skills by
providing record-and-review functions, exposing
learners to real discussions, and letting them speak
with native speakers, respectively (Hart, 2016;
Murphy, Farley, Dyson & Jones, 2017; Smith,
Hickmott, Bille, Burd, Southgate & Stephens, 2015;
Wiemeyer & Zeaiter, 2015). Because current
classroom teaching does not achieve satisfactory
results, it is essential to use modern technology
to help English learners study spoken English.
Figure 1: Different teaching Methods.
Facing the dilemma of practicing oral English in
China, this project considered which new
technologies might help. After several literature
searches, the project turned its attention to
Virtual Reality. As the name suggests, VR is an
illusory reality in a virtual, software-based world. It
uses special perceptual devices to make users feel
that they are living through a specific experience.
Using special hand-held controllers or special floors,
users can feel as if they are really walking around
the virtual environment and interacting with virtual
objects. As a new technology, VR has been applied
in many teaching fields. In physics, for example, a
virtual reality physics simulation enhances students'
understanding by providing a degree of reality
unattainable in a traditional two-dimensional
interface, creating a sensory-rich interactive learning
environment (Kim et al., 2001).
VR can be a suitable technology for the current
difficulties in oral language teaching. While creating
a language environment in the real world is
complicated, creating an environment in the virtual
world can be the solution. VR solves the biggest
problem of spoken English: the right language
environment. The design of a VR world can help the
students who want to practice oral English in an all-
English environment, for example, by designing
virtual characters that speak English to communicate
with users. Once the language environment is in place,
the project focuses on the second problem: how to
arouse students' interest. In this VR world, merely
holding conversations with virtual characters could
become boring. Nowadays, most students have great fun
with games, so combining VR and gaming is a
promising idea. Virtual characters can hold
different conversations with users depending on the
tasks and their outcomes, giving users different
speaking experiences in different tasks
in the game. VR games can arouse students' interest
in learning and grade their speaking level
through the task scores in the game. This
grading system helps give each student the most
suitable environment for oral practice. The
following sections analyze our
VR game project from three aspects: an appropriate
language environment, attracting people to learn
with fun, and managing the grading system.
2 A DESIGN OF VR GAME IN
ORAL ENGLISH EDUCATION
2.1 An Appropriate Language
Environment
This section describes an appropriate
language environment for the VR game: native
speakers voice the non-player characters (NPCs) to
provide standard spoken English, convey the culture
behind English, and support some related
configurations. When Chinese students learn to
speak, they need to learn phonetics, rhythm, and accent,
and to fix grammatical errors in their spoken language.
One study considered how intermediate distance
learners of Italian negotiate meaning with native
speakers of Italian in web-based chat (Tudini,
2003). According to Tudini (2003), the chat
transcripts indicate that learners did negotiate for
meaning and modified their language when engaged
in open-ended conversational tasks with unfamiliar
interlocutors, with lexical and structural difficulties
triggering most of the negotiations. The same may
hold for English: by communicating with native
speakers, English learners can become aware of some
of their grammatical errors and correct them without
conscious effort. They can also accumulate authentic
spoken vocabulary, phrases, and even sentences to use
in later communication in the game and in real oral
communication. Such things are very difficult to
learn under spoon-fed teaching in China, and even
learners who do accumulate some authentic usage
find it difficult to apply in real life. Thus, by
communicating with native speakers in the game,
Chinese students can improve their grammar and
pronunciation to some extent, and learn and even
use authentic spoken language.
Beyond pronunciation, grammar, and authentic
spoken usage, speaking with native speakers appears
to promote the study of spoken English by helping
learners understand the culture behind the language
and strengthening their motivation to learn it. One
study showed that native
speaker teachers of English (NTs) spoke more
fluently and were more representative of the target
culture, while non-native speaker teachers of English
(NNTs) showed better unambiguous grammatical
knowledge (Andrews, 2007). NTs can represent
their culture more accurately and with more
examples, so that learners understand the reasons
behind cultural practices and can better tell
authentic culture from mere misconception and
rumor.
Moreover, most participants felt that NTs
provided a better approach to teaching language
learning strategies, provided more information about
the English language, and could better predict and
prevent difficulties for students (Gurkan & Yuksel,
2012). All of this makes it easier for learners to learn
English, which naturally reduces their anxiety
and increases their interest in learning. Learners
who believe they can learn well and are interested in
learning to speak are more motivated to
learn.
In addition, one article indicates that being
taught by NTs motivates students to speak the target
language, because they cannot fall back on their
native language in class and must speak the
language the NTs are using (Madrid & Perez, 2004).
Many NTs speak only their own language. When
students have no choice but to use English, they
cannot take the lazy route of learning through their
mother tongue, so their motivation to learn can
increase. Overall, adding NPCs voiced by native
speakers to the game is a sound choice, as they can
motivate students and provide a more
immersive cultural exchange experience, just as
NTs can.
Having explained why the game should use
native-speaker NPCs, this section now turns to the
functions of the game that involve native speakers.
First, professional native English voice actors are
invited to dub the NPCs. They can model proper
English pronunciation for students and
give them a sense of the elegance and necessity of
authentic pronunciation, which increases their
motivation to learn to pronounce correctly.
Moreover, voice actors with American or British
accents are selected for the dubbing, because
American and British pronunciations are
the two most dominant in China. For example, they
are the only two pronunciations used in the English
listening section of the Chinese college entrance
examination. In addition, involving native English
speakers in writing the game dialogue
and setting the scenes makes the
dialogue more authentic and closer to daily life. As
mentioned earlier, NTs can more easily identify
students' difficulties and communicate their cultural
messages more accurately, so they can design
activities that better target learners' difficulties
and add scenarios that better
demonstrate English culture.
2.2 Attracting People to Learn with
Fun
In this oral language learning tool based on VR
games, it is very important to make full use of the
characteristics of VR games to stimulate students'
enthusiasm for learning. The tutorial at the
beginning of the game is the most challenging part
to design, because the game should not start by
confusing or annoying the player. In
most games, the arrow is the most commonly used
tool: an arrow displayed on the screen tells the
player which button to click next. VR games, however,
are highly interactive and realistic, and guidance
by arrows alone may reduce the fun and visual appeal
of the game. Virtual guiding characters are a good
alternative. To make the tutorial fun, the image of
the virtual guiding character needs to appeal to
every user. People have
different combinations of preferences for gender,
body type, skin colour, and voice. For example,
some people like a muscular man with a baby voice,
while others like a cute girl with a sweet voice. It
is hard to design characters that everyone likes. The
ideas of game modelers are limited, whereas the
inspiration and ideas of users are unlimited, and only
users know what they like best. Users can therefore
design their own virtual guiding
characters. People are more enthusiastic about an
image they like, so this kind of design makes
players feel involved from the start and more patient
as they go through the tutorial.
As an aid for practicing oral English, this VR
game differs from ordinary story
games in how story tasks are completed.
In other narrative games, players click on
the appropriate dialogue option to proceed to the next
round of the story; in this VR game, users have to
read aloud the option they have chosen. The VR game
has an accompanying audio recording system that can
be mounted on the VR headset. The design of the story
is essential in this VR game. First, the dialogue must
use correct grammar and common everyday expressions.
Second, the story must be interesting enough to help
students learn faster. A boring story is like
a TV show that drags on: it bores players and makes
them lose the desire to do story quests.
More exciting stories are more likely to be
remembered by students. "Attentional demands
and recall for stories that differed in rated interest
were examined. The more interesting stories
required fewer attentional resources for
comprehension than did less interesting stories"
(McDaniel et al., 2000). A compelling story allows
the player to remember more with less effort
than a boring story. In order to make a breakthrough
in plot design, we plan to hire a number of fantasy
writers to work together on the design of the main
story and side stories.
As the story progresses, the player will get to
know more fictional characters related to the story.
To increase the player's immersion in the
game, the game will have a favorability system for
virtual characters. Different choices in the story will
decrease or increase the favorability of the
relevant character. Different levels of dialogue will be
unlocked according to the player's favorability level
with the character, allowing players to practice more
styles of dialogue. On culture in language teaching,
Paige et al. (2003) write: "Cultural elements were
selected for study based on their comparable
importance in the home culture of the authors.
Cultural artifacts, the more visible elements of
culture, were studied at
the exclusion of cultural values. With the advent of
the functional and communicative proficiency
approaches in the 1970s, and all through the 1980s,
teachers moved away from relying solely on
textbooks to teach language."
Many scholars have endorsed the
influence of culture on language learning. Language
teaching should not rely only on textbooks but should
draw on different cultural backgrounds, because
people in different cultures speak in different ways.
For example, within the English language, the way
people spoke in Victorian England is
undoubtedly different from how people spoke in the
United States in the 1980s, when hip-hop culture was
prevalent. In this VR game, cultural elements are put
into the dialogue. In the dialogue at each
favorability level, the background of the
corresponding virtual character can be used
so that the player's speaking practice is not
limited to the game story. For example, when a
player becomes friends with an avatar from the
United States, the avatar will talk to the player about
American hip-hop culture.
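The favorability mechanism described above can be illustrated with a minimal Python sketch. The paper gives no numeric values, so the thresholds, the tier names, the 0 to 100 range, and the class and method names below are all hypothetical and serve only to show how story choices could unlock further dialogue levels.

```python
from dataclasses import dataclass

# Hypothetical favorability thresholds and dialogue tier names;
# the paper does not specify numeric values.
DIALOGUE_TIERS = [(0, "stranger"), (30, "acquaintance"), (60, "friend")]


@dataclass
class VirtualCharacter:
    name: str
    background: str          # e.g. the culture the character comes from
    favorability: int = 0    # kept within 0..100 in this sketch

    def apply_choice(self, delta: int) -> None:
        """Raise or lower favorability after a story choice."""
        self.favorability = max(0, min(100, self.favorability + delta))

    def unlocked_tiers(self) -> list[str]:
        """Return every dialogue tier whose threshold has been reached."""
        return [tier for threshold, tier in DIALOGUE_TIERS
                if self.favorability >= threshold]
```

In such a design, the "friend" tier of a character whose background is the United States could hold the culture-specific dialogue, such as the hip-hop conversation mentioned above.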
2.3 Managing the Grading System
This article designs a VR learning system for
Chinese English learners, which is localized
according to the different English learning statuses
and requirements of various groups of learners, so it
can better meet Chinese learners’ needs. In addition,
this system adds NPCs selected and created by users
themselves, and there is sufficient voice-driven
interaction between characters and users, which
ensures the adequacy of interaction and the fun and
effectiveness of spoken language learning (Sha,
2009).
According to previous studies, a personalized English
learning system can significantly enhance learners'
English abilities and promote their interest in learning
(Chen & Chung, 2008). Therefore, this paper
designs a spoken language training system for
Chinese English learners in a VR environment. The
system is highly flexible, so learners can
customize it according to their needs. In
this game system, each game level is scored on the
player's spoken English, and the player's cumulative
score corresponds to one of three levels of spoken
English: Beginner, Intermediate, and Advanced.
Players can also choose to test their oral English
proficiency before logging into the game system, or
choose a learning scene according to their actual
needs, to practice English speaking and communicative
skills and prepare for later use in real life.
This game system allows learners to have natural
conversational interactions with NPCs to improve
their spoken language in a variety of fun games. The
system will automatically record the learner's
learning habits and progress and track their learning.
If learners always have access to a large amount of
English learning content that is slightly above
their current English level, that is, content at the
"i+1" level in the sense of
Krashen (1985), their acquisition of English
will be promoted very effectively.
At different levels of the game, the
characters use different vocabulary, sentence
patterns, and speech rates. As the game level
increases, players continue to improve their
speaking, and the system gradually increases
the difficulty of the game according to the
completion of each mission. At each level,
NPCs provide conversations in various
scenarios, and learners interact with the characters at
the level that matches their ability.
In addition, once players pass the tests at the
Beginner, Intermediate, and Advanced levels, random
scenarios are unlocked in which players actually
apply spoken language, such as interviews,
speeches, ordering dishes, and parties. Applying the
spoken language practiced in the
previous sections helps players gain
close-to-practical experience.
Advanced learners will also unlock NPCs controlled
by real people, who can hold real-time conversations
with them and set other, more demanding tasks.
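As a rough illustration of the grading logic in this section, the following Python sketch maps a cumulative score to one of the three levels, raises the difficulty after each completed mission, and checks whether the real-life scenarios are unlocked. The score cut-offs, the integer difficulty scale, and all function names are assumptions made for this example; the paper does not specify them.

```python
import bisect

LEVEL_NAMES = ["Beginner", "Intermediate", "Advanced"]
# Hypothetical cumulative-score cut-offs between the three levels.
LEVEL_CUTOFFS = [200, 500]


def level_for_score(cumulative_score: float) -> str:
    """Map a player's cumulative score to a spoken English level."""
    return LEVEL_NAMES[bisect.bisect_right(LEVEL_CUTOFFS, cumulative_score)]


def next_difficulty(current: int, mission_completed: bool,
                    max_difficulty: int = 10) -> int:
    """Move one step up ("i+1") only after a completed mission."""
    if mission_completed:
        return min(current + 1, max_difficulty)
    return current


def real_life_scenarios_unlocked(passed_levels: set[str]) -> bool:
    """Scenarios unlock once the tests at all three levels are passed."""
    return set(LEVEL_NAMES) <= passed_levels
```

Raising the difficulty by a single step keeps new content only slightly above the learner's current level, which is the intent of the "i+1" idea cited from Krashen (1985).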
If provided with adaptive subject materials with
adaptive presentation styles, learners will improve
both their learning achievements and learning
efficiency (Tseng, Chu, Hwang & Tsai, 2008).
This is why this paper designs a special
conversational mode. After the conversation at each
level starts, the NPC guides the learner to
respond. When the learner's response audio is
transmitted to the system through the microphone,
the system first recognizes what the user said and
then performs corpus matching. How the
conversation that follows will unfold depends on
what the user says. At the same time, the system will
evaluate the user's pronunciation, wording, and
fluency and comprehensively determine how the
NPC needs to respond to the learner. According to
this mode, the dialogue continues until all tasks in
this level are completed. In order to facilitate the
training of spoken English and comprehensively
evaluate the learners' spoken language, the
pronunciation evaluation system in the game
includes three scoring standards: pronunciation,
wording, and fluency. After the whole level is
completed, the system will automatically calculate
the average score and total score of all assessments
in the level, and then display the scoring details and
total score to the learner, and record the score as the
learner's personalized learning information to track
learning progress. At the Beginner level, the scoring
standard of pronunciation is different from that of
the Intermediate and Advanced levels. Since the
primary goal of the system is to improve the oral
communication ability and enthusiasm of Chinese
oral English learners, it is not required that all
learners have a pronunciation that is close to native
English speakers at the Beginner level. It only
requires learners to pronounce clearly and correctly
without affecting communication. At the
Intermediate and Advanced levels, however, and
especially at the latter, the more accurate and
authentic the player's pronunciation is, the higher
the pronunciation score. Interest and
motivation matter in learning, so it is important
not to undermine them with high standards from
the beginning (Harlen & Crick, 2003).
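The scoring scheme can be sketched in Python as follows. This is only one possible reading of the description: the 0 to 100 scale, the "clear enough" threshold used to relax pronunciation scoring at the Beginner level, the definition of the average as total score per response, and all names are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical threshold: at the Beginner level, pronunciation that is
# at least this clear is not penalized for sounding non-native.
CLEAR_ENOUGH = 60.0


@dataclass
class TurnScore:
    """Scores (assumed 0..100) for one learner response."""
    pronunciation: float
    wording: float
    fluency: float


def adjusted_pronunciation(raw: float, level: str) -> float:
    """Relax the pronunciation standard for Beginner learners."""
    if level == "Beginner" and raw >= CLEAR_ENOUGH:
        return 100.0
    return raw


def summarize_level(turns: list[TurnScore], level: str) -> dict[str, float]:
    """Return per-criterion means, the total, and the average per response."""
    if not turns:
        raise ValueError("a level needs at least one scored response")
    pron = [adjusted_pronunciation(t.pronunciation, level) for t in turns]
    wording = [t.wording for t in turns]
    fluency = [t.fluency for t in turns]
    total = sum(pron) + sum(wording) + sum(fluency)
    return {
        "pronunciation": mean(pron),
        "wording": mean(wording),
        "fluency": mean(fluency),
        "total": total,
        "average": total / len(turns),
    }
```

The returned dictionary corresponds to the scoring details shown to the learner after a level, and it could be stored as part of the learner's personalized record to track progress.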
Besides, all learners and their scores in the
system are synchronized, and each learner can also
see the accumulated scores and learning results of
their peer learners. This system creates a virtual
environment for multiple learners where they can
communicate in real time through speech, text and
body movements. Learners can also talk, share
information, watch videos and so on at the same
time, which will improve the interaction between
them.
3 CONCLUSIONS
To sum up, VR games are highly compatible with
oral English teaching. VR games can offer
solutions for the language
environment, fun, and graded training that learning
spoken English requires. The creativity and
authenticity of VR games give users a language
environment beyond Chinese. By employing
native speakers with American and British accents
for character dubbing, the game gives users a
standard language environment.
Having native speakers participate in designing the
characters' speech also makes the speaking in the
game more relevant to life. In terms of attracting
users' interest, the virtual guiding characters that
users can design themselves, the exciting story of the
game, and the favorability system of the virtual
characters not only open up a variety of game
experiences for users but also motivate
them to keep playing.
Personalized graded training is also part of the
design. Based on how the user responds at
different levels of the game, the system
recognizes whether the user's pronunciation is
accurate, and the user's spoken language level
is graded according to the resulting scores. After
reaching a certain level of
spoken English, the user can also unlock
communication and interaction with virtual
characters controlled by real people. Integrating
innovation into education is a core concept of this
project, which is also based on our team's observation
of the shortcomings of current English education. As a
new technology, VR will become increasingly integrated
into people's lives with continuous development and
research.
This study has some limitations.
The article summarizes the authors' analysis
of the literature to propose a new way of learning
spoken English, and no experiments were conducted
to show how feasible the new ideas are.
Although the evidence and methods are
objective, a few conclusions are based on subjective
deduction. Future studies should carry out
pilot tests to verify the validity of this idea. We hope
that the design of this project can inspire
current oral English education so that more
people can have a better way of learning oral
English.
REFERENCES
Andrews, S. (2007). Teacher Language Awareness.
Cambridge University Press.
Chen, C. M., & Chung, C. J. (2008). Personalized mobile
English vocabulary learning system based on item
response theory and learning memory cycle.
Computers & Education, 51(2), 624-645.
Gurkan, S., & Yuksel, D. (2012). Evaluating the
contributions of native and non-native teachers to an
English Language Teaching program. Procedia-Social
and Behavioral Sciences, 46, 2951-2958.
Hart, T. (2016). Learning How to Speak Like a “Native”:
Speech and Culture in an Online Communication
Training Program. Journal of Business and Technical
Communication, 30(3), 285-321.
Kim, J., Park, S., Lee, H., Yuk, K., & Lee, H. (2001).
Virtual reality simulations in physics
education. Interactive Multimedia Electronic Journal
of Computer-Enhanced Learning, 3(2), 1-7.
Krashen, S. D. (1985). The input hypothesis: issues and
implications. Language.
Madrid o, M. (2004). Evaluation. In D. Madrid & N.
McLaren (Eds.), TEFL in primary education (pp.
441-480). Granada: Editorial Universidad de Granada.
Cortazzi, M., & Jin, L. (1996). English teaching
and learning in China. Language Teaching, 29(2), 61-80.
McDaniel, M., Waddill, P. J., Finstad, K., & Bourg, T.
(2000). The effects of text-based interest on attention
and recall. Journal of Educational Psychology, 92(3),
492.
Murphy, A., Farley, H., Dyson, L., & Jones, H.
(2017). Mobile learning in higher education in the
Asia-Pacific region. Singapore: Springer.
Paige, R., Jorstad, H., Siaya, L., Klein, F., & Colby, J.
(2003). Culture learning in language education. In
D. Lange & R. Paige (Eds.), Culture as the core:
Perspectives on culture in second language learning
(pp. 173-236).
Schwienhorst, K. (1998). The 'third place': virtual reality
applications for second language learning. ReCALL,
10(1), 118-126.
Sheppard, R. B. D. C. (1971). Foreign language learning:
A psycholinguistic analysis of the issues by Leon A.
Jakobovits. The Modern Language Journal.
Smith, S., Hickmott, D., Bille, R., Burd, E., Southgate, E.,
& Stephens, L. (2015, December). Improving
undergraduate soft skills using m-learning and serious
games. In 2015 IEEE International Conference on
Teaching, Assessment, and Learning for Engineering
(TALE) (pp. 230-235). IEEE.
Tseng, J., Chu, H., Hwang, G., & Tsai, C. (2008).
Development of an adaptive learning system with two
sources of personalization information. Computers &
Education, 51(2), 776-786.
Tudini, V. (2003). Using native speakers in
chat. Language Learning & Technology, 7(3), 141-
159.
Wiemeyer, L., & Zeaiter, S. (2015). Social media in EFL
teaching: Promoting (oral) communication skills in
complex competency tasks. Dutch Journal of Applied
Linguistics, 4(2), 193-211.
Harlen, W., & Deakin Crick, R. (2003). Testing
and motivation for learning. Assessment in Education:
Principles, Policy & Practice, 10(2), 169-207.