Emotionalyzer: Player’s Facial Emotion Recognition ML Model for
Video Game Testing Automation
Rebeca Bravo-Navarro, Luis Pineda-Knox and Willy Ugarte
Universidad Peruana de Ciencias Aplicadas (UPC), Lima, Peru
Keywords:
Sentiment Analysis, Human-Computer Interaction, Player Testing, Gameplay Experience Testing, Facial
Emotion Recognition.
Abstract:
In video game development, the play testing phase is crucial for evaluating and optimizing user perception
before launch. These tests are often costly and require significant time investment, as they are conducted
by experts observing gameplay sessions, which makes capturing real-time data, such as facial and bodily
expressions, challenging. Additionally, many independent studies lack the necessary resources to conduct
professional testing. Therefore, smaller developers need more cost-effective and time-efficient alternatives to
improve their products and streamline the development process. This project aims to develop a real-time facial
emotion recognition model using machine learning, which will be integrated into an application that records
the player’s emotions during the gameplay session. It seeks to benefit Peruvian indie companies by reducing
costs and time associated with traditional testing and providing a more precise and detailed evaluation of the
user experience. Additionally, the use of machine learning technology ensures continuous adaptation and
progressive improvements in the model over time.
1 INTRODUCTION
In the realm of video game development, the pro-
cess of user experience testing holds significant im-
portance prior to a game’s launch. This step is critical
for evaluating and refining the game from the perspec-
tive of the player (Dumas and Redish, 1993).
Major players in the video game industry invest
substantial resources into this phase, ensuring that
games are released only when they meet a certain
level of quality and completeness (El-Nasr et al.,
2013).
However, conducting user testing can be both
costly and time-consuming, often requiring the pres-
ence of an expert to observe gameplay sessions. One
notable challenge of this approach is its limited abil-
ity to capture spontaneous reactions, such as facial ex-
pressions and body language during gameplay.
Typically, player feedback is gathered through post-game interviews and surveys¹ ², although these methods may introduce biases due to memory and self-reporting tendencies.

¹ Live-game satisfaction survey: https://www.hoyolab.com/article/3523425
² Closed alpha survey announcement: https://twitter.com/KAGESMG/status/1784255738174468196
Turning our attention to Peru, the local chapter of the International Game Developers Association (IGDA)³ reports over 50 registered companies engaged in video game development, alongside numerous independent developer teams, with a focus on indie games.
Despite the industry's growing presence, the video game landscape in Peru remains relatively young, with most companies emerging after the year 2000.
Nevertheless, due to resource constraints, few, if any, independent developers can conduct professional-level user testing. Notable examples of this situation include Colorful by Peruvian developer Hitoshi Kanno, which remains in development ten years after its initial conceptualization⁴, and Peruvian studio Bamtang Games, which only recently implemented its own Quality Assurance department despite developing games for over 20 years⁵.

³ IGDA Peru website: https://igda.pe/
⁴ Colorful website: https://www.facebook.com/ColorfulTheGame
⁵ Jesus Blas's talk on 'Cambios en el desarrollo de videojuegos' ('Changes in video game development'): https://www.facebook.com/igdaperu/videos/250518101166436
This underscores the need for more cost-effective
and time-efficient alternatives that not only enhance
the quality of their products but also streamline the
development process, particularly during the design
phase.
Our proposed solution entails the development of
an autonomous learning model capable of recogniz-
ing user emotions through facial recognition technol-
ogy.
This model would be implemented and tested us-
ing a desktop application that runs during gameplay
sessions, capturing user inputs to generate an emotion
report upon completion. Our work primarily targets
independent companies within Peru’s video game de-
velopment sector.
Given the growing nature of this industry in the
country, many such companies operate with modest
budgets and development teams, typically classified
as small or medium-sized enterprises.
Our scope encompasses the development of a fa-
cial emotion recognition (FER) model tailored for use
in video game testing by independent entities, along-
side data analysis to predict emotions in single-player
experiences. Additionally, the model will be imple-
mented through a desktop application.
However, it is important to note that the project excludes testing multiplayer experiences, emotion recognition through voice or bodily movements, and implementation for mobile devices or consoles.
Despite its promising potential, our proposal faces
several constraints. Limited financial resources may
hinder its execution, necessitating careful budgetary
planning. Additionally, potential shortages in human
resources, such as experience and skill sets, could
pose implementation challenges.
Moreover, legal and ethical considerations related
to the collection and processing of facial data may im-
pact project development.
Lastly, a critical limitation is the availability of a
diverse range of facial expressions in the training data,
which is crucial for refining the facial emotion recog-
nition model.
Ultimately, our project aims to serve as an inno-
vative and accessible solution to enhance user testing
within Peru’s independent video game development
sector.
By addressing prevailing limitations in cost and
accuracy associated with emotional data collection,
our initiative seeks to empower independent develop-
ers to compete on the global stage with products of
superior quality.
This paper is organized as follows: first, we review related work on Facial Emotion Recognition (FER) in Section 2. Then, we present the relevant concepts and theory behind our research and describe our main contribution in more detail in Section 3. Next, we explain the procedures performed and the experiments carried out in this work in Section 4. Finally, we present the main conclusions of the project and give some recommendations for future work in Section 5.
2 RELATED WORK
Regarding our issue, we have reviewed a wide range
of documents and previous research on emotion
recognition through machine learning, as well as a
limited number of similar applications of this tech-
nology in the realm of video games.
However, we have not identified any solutions directly comparable to our proposed project. Below, we present the most relevant works related to our proposal.
2.1 Towards Personalised Gaming via
Facial Expression Recognition
In (Blom et al., 2014), the authors address the issue
of personalizing the gaming experience through real-
time emotion recognition.
The growing significance of AI in gaming under-
scores the need for tailored experiences. Researchers
proposed a technique based on modifying game levels
guided by player expressions, leveraging computer
vision.
Using IN-SIGHT SDK for facial expression
recognition, they tracked emotions and mapped them
to gameplay challenges in Infinite Mario Bros.
Results showed effective adaptation of difficulty
levels based on player emotions, with user preference
for this dynamic approach over static level systems.
Both (Blom et al., 2014) and our proposal cen-
ter around the enhancement of gaming experiences
through the implementation of emotion recognition
techniques during gameplay.
They exhibit commonalities in their recognition of
the pivotal role of personalization, utilization of facial
recognition technologies, and acknowledgment of the
crucial significance of advanced technologies such as
computer vision and machine learning.
2.2 Facial Emotion Recognition Using
Deep Learning Detector and
Classifier
In (Kit et al., 2023), the authors address the problem of facial expression recognition, highlighting the importance of non-verbal elements in human communication.
A deep learning-based system is proposed, utilizing the MobileNet-v1 model to predict emotions in video sequences, prioritizing speed and accuracy.
Training datasets are prepared in both color and
grayscale, followed by model training and evaluation.
It is concluded that grayscale facial recognition
achieves an accuracy of 86.42%, surpassing color
recognition due to the influence of lighting on color
variation.
Facial alignment and image color space signifi-
cantly affect the accuracy and computational cost of
facial emotion recognition.
2.3 A Systematic Review on Affective
Computing: Emotion Models,
Databases, and Recent Advances
In (Wang et al., 2022), the authors address the chal-
lenge of emotional recognition and sentiment analy-
sis through physical and physiological data, aiming to
enhance human-computer interaction.
Over 380 studies were reviewed, categorizing af-
fective computing into unimodal recognition and mul-
timodal analysis. Models based on physical and
physiological information were examined, conclud-
ing with a comprehensive analysis of model efficacy.
The fusion of physical and physiological data en-
ables the extraction of useful features to improve af-
fective computing models.
A systematic review of emotion models,
databases, and recent advances is presented, in-
tending to guide academic and industrial researchers
towards promising new directions in this rapidly
advancing field.
In (Wang et al., 2022), the authors conduct a thorough review of existing research and analyze the efficacy of emotional recognition models within the realm of human-computer interaction.
In contrast, our proposal outlines a plan to develop
a specific emotional recognition model tailored for
gaming experiences.
Furthermore, they address a broader context
of human-computer interaction, while our proposal
specifically targets the gaming domain.
Lastly, the authors provide a comprehensive
overview of existing models and advancements,
whereas our proposal concentrates on outlining ob-
jectives and steps for the development and imple-
mentation of a gaming-specific emotional recognition
model.
In essence, both works aim to enhance human in-
teraction through emotional recognition, but they di-
verge in focus, context, and level of detail.
2.4 A System of Emotion Recognition
and Judgment and Its Application
in Adaptive Interactive Game
In (Lin et al., 2023), the authors propose a system for recognizing and assessing emotions based on optimal physiological signals for interactive gaming applications.
Ten participants played the Super Mario game
while their physiological responses were recorded to
assess the game’s effect on their emotions.
The results of this system were compared with
conventional machine learning methods, demonstrat-
ing the superiority of the former.
The system enabled the detection of emotional
changes in players during gameplay, enhancing their
experience.
It was observed that players’ perceptions of emo-
tional changes varied and that prior testing experience
affected the results. This underscores the effective-
ness of the proposed system and its potential to en-
hance interactivity in technology-based games.
Comparing (Lin et al., 2023) with our proposal,
the former provides concrete evidence of the effec-
tiveness of an emotion recognition system in gaming,
whereas our proposal outlines a plan for future devel-
opment in a related area.
Both contribute to the understanding and advance-
ment of emotion recognition in gaming, but they dif-
fer in their stages of development and presentation of
results.
2.5 Towards Automated Video Game
Testing: Still a Long Way to Go
In (Politowski et al., 2022), the authors explore the es-
calating complexity of game development, driving up
costs and necessitating larger, higher-quality games.
It delves into the challenges of manual playtesting,
particularly for smaller companies lacking in-house
QA teams.
There is a growing interest in automated video game testing, although skepticism persists among developers.
Academic research highlights machine learning
and AI-based approaches, yet practical implementa-
tion remains a concern. Notable solutions like Wuji
and ICARUS show promise in automating game test-
ing processes.
Key challenges include maintaining game func-
tionality and addressing the lack of automated test
maintenance. Bridging the gap between theoretical
advancements and practical implementation is crucial
for the future of game testing.
Overall, while automated testing holds potential,
collaboration between academia and industry is es-
sential for its successful integration.
Comparing (Politowski et al., 2022) with our pro-
posal, both address the challenges and potential solu-
tions related to game development and testing.
In (Politowski et al., 2022), the authors discuss the
escalating complexity of game development, particu-
larly focusing on the challenges of manual playtest-
ing and the growing interest in automated video game
testing.
Similarly, our proposal acknowledges the need for
more efficient and cost-effective testing methods, par-
ticularly in the context of smaller game development
companies.
3 CONTRIBUTION
The primary contribution of this research lies in the
application of facial emotion recognition techniques
to streamline the testing phase of video games.
By automating player observation and accurately
identifying the emotions experienced during testing
sessions, this approach transforms raw data into re-
fined information that developers can readily use to
enhance their creations.
3.1 Preliminary Concepts
In this section, the primary concepts utilized in our
research are introduced.
Definition 1 (Facial emotion recognition (Vedantham
and Reddy, 2023)). Pertains to the ability to discern
emotional states based on gathered data, whether in
the form of video clips or images. It has been an area
of ongoing research development for several years.
Definition 2 (Game experience testing (Dumas and
Redish, 1993)). This stage is conducted prior to the
release of a video game and is essential for evaluating
the game from the user’s perspective and enhancing
their experience.
Definition 3 (Player experience (Nacke and Drachen,
2011)). It focuses on the qualitative aspects of player
interaction with games, taking into account factors
such as enjoyment and difficulty, among others.
Definition 4 (Sentiment analysis (Chaturvedi et al.,
2018)). Seeks to categorize text (and sometimes audio
and video) as either positive or negative. It is closely
linked to information retrieval and fusion since it in-
volves collecting, integrating, and classifying data.
It’s a complex research problem that involves address-
ing various NLP tasks, including named entity recog-
nition, concept extraction, sarcasm detection, aspect
extraction, and subjectivity detection.
In the preceding definitions, we introduced the concepts of facial emotion recognition, game experience testing, player experience, and sentiment analysis.
3.2 Method
In this section, we detail the method we have developed for our research, in which we utilize Darknet, an open-source neural network framework, for detecting and classifying facial emotions.
The main contribution of this research consists of applying a machine learning model capable of providing accurate emotion recognition during the play test of a video game, with the purpose of streamlining the process.
Fig. 1 (Model architecture) shows the complete process, from image capture via the webcam to the output of results.
3.3 Video Capture
The process initiates with capturing video using a we-
bcam, continuously obtaining video frames.
3.4 Frame Extraction
The video stream is divided into individual frames
for further processing and analysis, as the model pro-
cesses single images rather than a continuous stream.
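To make these first two stages concrete, the following minimal sketch (ours, not from the paper) shows how capture and frame extraction could be implemented with OpenCV; the device index 0 and the exit key are assumptions.

```python
import cv2

# Sketch of Sections 3.3-3.4: open the default webcam (device index 0
# is an assumption) and pull individual frames from the continuous
# stream for downstream processing.
cap = cv2.VideoCapture(0)
if not cap.isOpened():
    raise RuntimeError("Could not open webcam")

while True:
    ok, frame = cap.read()  # one BGR frame per iteration
    if not ok:
        break
    # ... hand `frame` to the preprocessing step (Section 3.5) ...
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to stop
        break

cap.release()
cv2.destroyAllWindows()
```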
3.5 Image Preprocessing
The extracted frames are preprocessed to prepare
them for the convolutional neural network input.
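The paper does not enumerate the preprocessing operations; the sketch below assumes grayscale conversion, resizing to the 48x48 resolution of the training data (Section 4.2), and scaling to [0, 1].

```python
import cv2
import numpy as np

def preprocess(frame: np.ndarray, size: int = 48) -> np.ndarray:
    """Illustrative preprocessing (Section 3.5). The exact steps are
    not given in the paper; grayscale conversion, resizing to the
    48x48 training resolution, and [0, 1] scaling are assumptions."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, (size, size))
    return resized.astype(np.float32) / 255.0
```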
3.6 Our Model
Convolutional Layer. The preprocessed images are processed through a convolutional layer, applying filters to extract key features such as edges, textures, and patterns important for emotion detection.
Max Pooling Layer. The output from the convolu-
tional layer undergoes max pooling, reducing the spa-
tial dimensions and emphasizing the most prominent
features to reduce computational complexity.
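As a toy illustration of this dimensionality reduction (not the paper's actual network code), the following snippet applies 2x2 max pooling with stride 2 to a feature map, halving each spatial dimension:

```python
import numpy as np

def max_pool_2x2(feature_map: np.ndarray) -> np.ndarray:
    """Each output cell keeps the strongest activation of a 2x2
    window, halving the spatial dimensions as described above."""
    h, w = feature_map.shape
    return feature_map[:h - h % 2, :w - w % 2] \
        .reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.random.rand(48, 48)    # e.g. one convolutional feature map
print(max_pool_2x2(fmap).shape)  # -> (24, 24)
```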
YOLO Detection. The processed features are input
into a YOLO (You Only Look Once) detection model,
customized using yolov4-tiny and the Darknet frame-
work for real-time object detection. YOLO detects
faces in the frame and identifies emotions by labeling
them accordingly.
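For illustration, a yolov4-tiny network trained with Darknet is commonly loaded for inference through OpenCV's DNN module; the sketch below takes that route. The configuration and weight file names are placeholders, and the class ordering, taken from the emotions in Table 1, is an assumption:

```python
import cv2

# Hypothetical file names for the custom yolov4-tiny network exported
# from Darknet as the usual .cfg/.weights pair (not from the paper).
net = cv2.dnn.readNetFromDarknet("yolov4-tiny-emotions.cfg",
                                 "yolov4-tiny-emotions.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

# Class order is an assumption, based on the emotions of Table 1.
EMOTIONS = ["angry", "disgust", "fear", "happy",
            "sad", "surprise", "neutral"]

def detect_emotions(frame):
    """Return (label, confidence, box) triples for each face found."""
    class_ids, scores, boxes = model.detect(frame, confThreshold=0.5,
                                            nmsThreshold=0.4)
    return [(EMOTIONS[int(c)], float(s), b)
            for c, s, b in zip(class_ids, scores, boxes)]
```

The confidence and non-maximum-suppression thresholds (0.5 and 0.4) are conventional defaults, not values reported in the paper.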
3.7 Result
The final output is an image with each detected face
annotated with the corresponding emotion, displayed
in real time to show the detected emotions for each
face in the video frame.
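A sketch of how this annotated output, and the end-of-session emotion report mentioned in the abstract, might be produced; the tallying scheme is our assumption, not the paper's specification:

```python
import cv2
from collections import Counter

session_counts = Counter()  # accumulates labels across the session

def annotate(frame, detections):
    """Draw each detected face with its emotion label (Section 3.7)
    and tally labels for an end-of-session report. `detections` is
    the (label, score, box) list from the detection step above."""
    for label, score, (x, y, w, h) in detections:
        session_counts[label] += 1
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.putText(frame, f"{label} {score:.2f}", (x, y - 5),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return frame
```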
4 EXPERIMENTS
In this section, we will cover the experiments con-
ducted in our project, the requirements for replicating
these experiments, and an analysis of the results ob-
tained from this process.
4.1 Experimental Protocol
This subsection provides information about the envi-
ronment setup for the experiments, including details
on the local hardware configuration and the applica-
tions utilized.
This project was developed in a Google Colab notebook using an L4 GPU. The model was built with Darknet v4.5.4. The system specifications are as follows: NVIDIA-SMI 535.104.05, Driver Version 535.104.05, CUDA Version 12.2.
4.2 Face Emotion Dataset
For training the model, two datasets were selected be-
cause the available public datasets did not provide a
sufficient number of images to ensure effective train-
ing.
The first dataset is FER-2013⁶, consisting of grayscale facial images with a resolution of 48x48 pixels. These images are automatically aligned to be centered and occupy a uniform space in each image. The public test dataset contains 3,589 examples.
The second dataset is a public sample from the AffectNet-HQ⁷ dataset, which includes 29,042 examples with a resolution of 96x96 pixels.
To effectively integrate both datasets, a complete
standardization of the images to the .png format was
carried out, converting all images to grayscale and ad-
justing them to a uniform resolution of 48x48 pixels.
During the preparation of the training set, a thor-
ough review was conducted to identify and remove
images that did not show faces, thus ensuring the
quality and consistency of the final dataset.
Additionally, emotional labels were assigned to
organize the images into corresponding folders. Each
folder was then divided into two subsets: one for
training and one for validation, distributed in a 70%
and 30% ratio, respectively.
This approach ensures a robust and well-
structured preparation for the facial emotion classi-
fication model.
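A minimal sketch of this standardization and split, assuming Pillow and a per-emotion folder layout; the directory names are hypothetical:

```python
import os
import random
from PIL import Image

SRC, DST = "raw_images", "dataset"  # hypothetical directory names
EMOTIONS = ["angry", "disgust", "fear", "happy",
            "sad", "surprise", "neutral"]

random.seed(42)
for emotion in EMOTIONS:
    files = os.listdir(os.path.join(SRC, emotion))
    random.shuffle(files)
    cut = int(len(files) * 0.7)  # 70% training / 30% validation
    for split, names in (("train", files[:cut]), ("val", files[cut:])):
        out_dir = os.path.join(DST, split, emotion)
        os.makedirs(out_dir, exist_ok=True)
        for name in names:
            img = Image.open(os.path.join(SRC, emotion, name))
            img = img.convert("L").resize((48, 48))  # grayscale, 48x48
            stem = os.path.splitext(name)[0]
            img.save(os.path.join(out_dir, stem + ".png"))  # unify to .png
```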
⁶ Public FER-2013 dataset sample: https://www.kaggle.com/datasets/msambare/fer2013
⁷ Public AffectNet-HQ dataset sample: https://www.kaggle.com/datasets/noamsegal/affectnet-training-data
Table 1: Our model results by emotions.

Name     | Avg. Precision | TP | FN | FP | TN | Accuracy | Error Rate | Precision | Recall | Specificity | False Pos. Rate
angry    | 74.1064        | 17 | 3  | 24 | 59 | .7379    | .2621      | .4146     | .8500  | .7108       | .2892
disgust  | 66.8204        | 19 | 1  | 29 | 70 | .7479    | .2521      | .3958     | .9500  | .7071       | .2929
fear     | 62.7131        | 12 | 8  | 15 | 42 | .7013    | .2987      | .4444     | .6000  | .7368       | .2632
happy    | 67.2481        | 13 | 7  | 14 | 63 | .7835    | .2165      | .4815     | .6500  | .8182       | .1818
sad      | 74.7543        | 17 | 3  | 10 | 52 | .8415    | .1585      | .6296     | .8500  | .8387       | .1613
surprise | 58.9889        | 18 | 2  | 30 | 54 | .6923    | .3077      | .3750     | .9000  | .6429       | .3571
neutral  | 53.0684        | 16 | 4  | 34 | 68 | .6885    | .3115      | .3200     | .8000  | .6667       | .3333
Table 2: Our model general results.

Precision           | .42
Recall              | .80
F1-score            | .55
TP                  | 112
FP                  | 156
FN                  | 28
Mean avg. precision | .653856
Table 3: Models comparison.

Model                 | Avg. Precision
MobileNet v1 1.00 224 | 69.62%
MobileNet v1 0.75 160 | 66.72%
RANDA                 | 88.71%
R-emo                 | 65.4%
Our Model             | 65.39%
4.3 Results
In this subsection, the experiments carried out and the
results obtained in each of these are detailed.
As depicted in Table 2, the model evaluation re-
vealed a precision of 0.42, indicating that 42% of
positively classified detections were accurate. Fur-
thermore, the recall achieved was 0.8, signifying the
model’s correct identification of 80% of positive in-
stances within the dataset.
The resulting F1-score stood at 0.55, harmonizing
precision and recall into a consolidated metric. The
assessment identified 112 true positives alongside 156
false positives and 28 false negatives.
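These aggregate figures follow directly from the raw counts, as the following check shows:

```python
# Reproducing Table 2's aggregate metrics from its raw counts.
tp, fp, fn = 112, 156, 28

precision = tp / (tp + fp)  # 112 / 268 ~= 0.42
recall = tp / (tp + fn)     # 112 / 140  = 0.80
f1 = 2 * precision * recall / (precision + recall)  # ~= 0.55

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```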
The average Intersection over Union (IoU) was
33.63%, representing the mean overlap between pre-
dicted bounding boxes and ground truth. Applying an
IoU threshold of 50%, the average precision reached
65.39%, evaluating the model’s object detection ca-
pability across varying confidence thresholds.
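For reference, IoU is the overlap area divided by the union area of a predicted box and a ground-truth box; a minimal implementation for (x, y, w, h) boxes (ours, for illustration):

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x, y, w, h) boxes: the overlap
    measure behind the 33.63% average and the 50% AP threshold."""
    ax1, ay1, ax2, ay2 = box_a[0], box_a[1], box_a[0] + box_a[2], box_a[1] + box_a[3]
    bx1, by1, bx2, by2 = box_b[0], box_b[1], box_b[0] + box_b[2], box_b[1] + box_b[3]
    ix = max(0, min(ax2, bx2) - max(ax1, bx1))  # intersection width
    iy = max(0, min(ay2, by2) - max(ay1, by1))  # intersection height
    inter = ix * iy
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union else 0.0

print(iou((0, 0, 100, 100), (50, 0, 100, 100)))  # -> 0.333...
```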
Considering the precision metric and comparing our model with other approaches, Table 3 shows that our model has the lowest average precision, 23.32 percentage points below the best reference model, RANDA.
5 CONCLUSIONS
Based on our findings, the current model demon-
strates acceptable performance in recognizing facial
emotions during video game testing. However, there
is considerable potential for improvement.
One key area for enhancement is the incorporation
of facial landmarks in future evaluations. These land-
marks can provide more detailed information about
facial expressions, which could significantly improve
the model’s accuracy.
Additionally, fine-tuning the model parameters through more extensive training and validation could further enhance its performance by reducing false positive rates. When compared to the RANDA model, our current model exhibits lower average precision, underscoring the necessity for additional optimizations.
These optimizations might include refining the
feature extraction process, experimenting with differ-
ent machine learning algorithms, or employing more
sophisticated data augmentation techniques to better
handle the variability in facial expressions. By ad-
dressing these areas, we aim to achieve precision lev-
els that are comparable to, or even surpass, those of
the RANDA model.
Moreover, conducting more comprehensive testing with a larger and more diverse dataset could help identify specific weaknesses and areas for further refinement (Burga-Gutierrez et al., 2020). Continuous iteration and feedback from real-world testing scenarios will be crucial in evolving our model to meet the high standards required for effective emotion recognition in video game development.
Looking forward, we aim to integrate our model into a computer application designed for real-time analysis of facial emotions during video game testing (Guillermo et al., 2023). This application will leverage the improved accuracy and reduced false positive rates achieved through incorporating facial landmarks and fine-tuning model parameters.
By enabling real-time emotion detection, this tool could provide invaluable insights into player experiences, helping developers identify areas of frustration, excitement, or disengagement (de Rivero et al., 2023). This immediate feedback can streamline the development process, allowing for timely adjustments to improve overall game design and user experience.
The development of this application will also
involve optimizing the model’s computational effi-
ciency to ensure it operates effectively within the con-
straints of real-time processing during video game
testing sessions.
REFERENCES
Blom, P. M., Bakkes, S., Tan, C. T., Whiteson, S., Roijers,
D. M., Valenti, R., and Gevers, T. (2014). Towards
personalised gaming via facial expression recognition.
In Horswill, I. and Jhala, A., editors, Proceedings of
the Tenth AAAI Conference on Artificial Intelligence
and Interactive Digital Entertainment, AIIDE 2014,
October 3-7, 2014, North Carolina State University,
Raleigh, NC, USA. AAAI.
Burga-Gutierrez, E., Vasquez-Chauca, B., and Ugarte, W.
(2020). Comparative analysis of question answering
models for HRI tasks with NAO in spanish. In SIM-
Big, volume 1410 of Communications in Computer
and Information Science, pages 3–17. Springer.
Chaturvedi, I., Cambria, E., Welsch, R. E., and Herrera, F.
(2018). Distinguishing between facts and opinions for
sentiment analysis: Survey and challenges. Inf. Fu-
sion, 44:65–77.
de Rivero, M., Tirado, C., and Ugarte, W. (2023). Formal-
styler: Gpt-based model for formal style transfer with
meaning preservation. SN Comput. Sci., 4(6):739.
Dumas, J. S. and Redish, J. C. (1993). A practical guide to
usability testing. Intellect.
El-Nasr, M. S., Drachen, A., and Canossa, A., editors
(2013). Game Analytics, Maximizing the Value of
Player Data. Springer.
Guillermo, L., Rojas, J., and Ugarte, W. (2023). Emotional
3d speech visualization from 2d audio visual data.
Int. J. Model. Simul. Sci. Comput., 14(5):2450002:1–
2450002:17.
Kit, N. C., Ooi, C.-P., Tan, W. H., Tan, Y.-F., and Cheong,
S.-N. (2023). Facial emotion recognition using deep
learning detector and classifier. International Jour-
nal of Electrical and Computer Engineering (IJECE),
13(3):3375–3383.
Lin, W., Li, C., and Zhang, Y. (2023). A system of emo-
tion recognition and judgment and its application in
adaptive interactive game. Sensors, 23(6):3250.
Nacke, L. and Drachen, A. (2011). Towards a framework of
player experience research (pre-print). In Foundations
of Digital Games Conference.
Politowski, C., Guéhéneuc, Y., and Petrillo, F. (2022). Towards automated video game testing: Still a long way to go. In 6th IEEE/ACM International Workshop on Games and Software Engineering, GAS@ICSE, Pittsburgh, PA, USA, May 20, 2022, pages 37–43. ACM.
Vedantham, R. and Reddy, E. S. (2023). Facial emotion
recognition on video using deep attention based bidi-
rectional LSTM with equilibrium optimizer. Multim.
Tools Appl., 82(19):28681–28711.
Wang, Y., Song, W., Tao, W., Liotta, A., Yang, D., Li, X.,
Gao, S., Sun, Y., Ge, W., Zhang, W., and Zhang, W.
(2022). A systematic review on affective computing:
emotion models, databases, and recent advances. Inf.
Fusion, 83-84:19–52.