Research on Machine Learning Models for Predicting Player Churn

Lilin Peng

Brunel College London, North China University of Technology, No. 5 Jinyuanzhuang Road,

Shijingshan District, Beijing, China

Keywords: Churn Prediction, Player Behaviors, Game Analytics, Machine Learning.

Abstract: In recent years, with the rapid development of the gaming industry, predicting player churn to improve game

retention has become a key research area. This review summarizes the main research achievements and

methods in the field of player churn prediction in recent years. By analyzing various machine learning

algorithms such as random forest, decision tree, and logistic regression, the performance of these models in

processing player behavior data and improving prediction accuracy is summarized. These studies demonstrate

the effectiveness of data-driven methods in predicting player behavior, particularly when using long-term data

frames, which significantly improve prediction accuracy. They also indicate that incorporating
players' personalized behavior and social relationships into the prediction model can enhance its
accuracy. This article further explores the potential applications of cutting-edge methods such as
multi-source data fusion, real-time prediction and intervention, and long-term behavior analysis. The review

concludes that future research should continue to focus on algorithm optimization and the application of

emerging technologies to further improve the accuracy and adaptability of player churn prediction models,

providing a scientific basis for game developers to develop more effective user retention strategies.

1 INTRODUCTION

User churn refers to users who previously visited or
registered on a website but gradually lost interest in
it and eventually stopped using it altogether.

Cheng Wang et al. proposed a planar fusion deep

learning model based on domain ranking, namely

Fully Connected Layered Convolutional Neural

Network-Long Short-Term Memory (FCLCNN-

LSTM), for assessing the risk of customer churn in

the telecommunication industry (Wang et al., 2024).

The model incorporates a majority least absolute

shrinkage and selection operator (Maj-LASSO)

algorithm for feature selection under unbalanced data

conditions, addressing feature importance through a

fully connected layer with several Rectified Linear

Unit (ReLU) neurons. By combining a 2D

convolutional neural network for spatial feature

extraction and an LSTM model for temporal

information, the FCLCNN-LSTM model enhances

classification accuracy and generalizability. It was
tested on three public datasets, outperforming
traditional models such as Logistic Regression,
Support Vector Machine, and Random Forest, with an
accuracy improvement of 3.43% and an area under
the curve (AUC) increase of 4.84%. This approach
provides a robust decision-making tool for
identifying potential churners in the
telecommunications sector.

Author ORCID: https://orcid.org/0009-0004-3597-0530

Player churn prediction is crucial in the gaming

sector. Since the cost of attracting new players is

much higher than retaining existing players,

predicting when players are likely to churn and taking

preventive measures can help gaming companies

enhance player retention, thereby increasing revenue.

It is an important strategy for boosting profitability

and carries commercial value in improving player

retention rates. This article summarizes experimental

results from multiple research papers, comparing the

performance of different models in player churn

prediction and proposing directions for future

research.

Player churn refers to the phenomenon where

players stop using a game. Player behavior data

includes factors such as playtime, purchase history,

and social interactions. The goal of player churn

Peng, L.

Research on Machine Learning Models for Predicting Player Churn.

DOI: 10.5220/0013234700004558

In Proceedings of the 1st International Conference on Modern Logistics and Supply Chain Management (MLSCM 2024), pages 112-121

ISBN: 978-989-758-738-2

prediction is to identify which players are likely to

stop playing the game within a short period. These

studies primarily focus on utilizing machine learning

models to analyze player behavior data to predict

future behavior.

Existing research indicates that different types of

games and player behavior patterns significantly

impact prediction results.

Mao Ping Ting and Cho Dong Min examine user

dissatisfaction with serious games about Chinese

cultural heritage and attribute it to problems with

game quality and lack of long-term feedback from

users (Ting and Min, 2024). The study identifies two

primary quality dimensions—online gaming factors

and intrinsic serious game factors—using the

theoretical framework of Stimulus-Organism-

Response (SOR) in combination with the Technology

Acceptance Model (TAM). Through analysis of 534

valid responses, the research highlights that

educational objectives, entertainment experience,

quality of the system, and game design significantly

affect how users perceive the usefulness and ease of

use, which in turn impact user churn. The study

emphasizes the critical role of learning objectives and

entertainment experience in reducing churn, offering

practical insights for enhancing game design and user

retention.

Jiao Wang used the Cox proportional hazards model to predict the churn of online game players at Company A (Wang, 2018). The study found that Company A had problems in player churn management, such as insufficient reference indicators, a single approach to maintaining player stability, and a lack of retention evaluation. To address this, the paper selected 13 variables covering three aspects (performance, player activity, and time interval characteristics) and screened them using the forward LR (likelihood ratio) method to construct a Cox proportional hazards model. The results indicate that 10 variables, including level, combat effectiveness, and total online time, are closely related to player churn. Using the Recency, Frequency, Monetary (RFM) model combined with the coefficient of variation method, the study then evaluated players comprehensively and identified the first and third types of players as key retention targets.
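For orientation, the Cox proportional hazards model referred to above is usually written in the following standard form. The symbols are generic and are not taken from Wang (2018): h_0(t) is an unspecified baseline hazard, x_1, ..., x_p are player covariates (for example level or total online time), and each exp(beta_j) is read as the hazard ratio associated with a one-unit increase in x_j.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p\right)
```

Under this form, a positive coefficient raises the instantaneous risk of churn at every time t, while a negative coefficient lowers it.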

Minghan Tu explored the reasons behind mobile

game player churn and the impact of gender

differences by constructing a user churn conceptual

model based on the SOR model, combined with the

Expectation Confirmation Model (ECM) and the

Perceived Entertainment Theory (Tu, 2020). In this

model, external stimuli (S) are divided into two parts:

game factors and social factors. The organism (O)

includes perceived entertainment, expectation

confirmation, perceived usefulness, and satisfaction.

The response (R) is user churn. The research results

indicate that game factors have a significant impact

on perceived entertainment, expectation confirmation,

and user churn, making them a key focus for

companies. Although social factors do not have a

substantial effect on user churn, they do significantly

affect satisfaction. The study also discovered that

gender differences influence the connection between

game elements and user churn, providing companies

with more precise strategies for mobile game product

development.

Wei Zhang developed a player churn prediction

model based on analyzing nearly 200,000 pieces of

user behavior data (Zhang, 2021). The study first

preprocessed the dataset and used regression models

to fill in missing values. Players were then

categorized into two groups—normal and churned—

based on a customized login time interval. Nine

feature variables were selected for modeling based on

variable correlation and importance. The dataset was

split into training and testing sets with a 70:30 ratio,

and decision tree, random forest, and neural network

models were built. The parameters of each model

were repeatedly adjusted to achieve optimal

prediction performance. The results showed that the

recall rates of individual models could be further

improved. Finally, the study attempted to linearly

combine the individual models and customized the

threshold for binary classification, which enhanced

the sensitivity in predicting churned users. The

findings indicate that the combined model effectively

improves the recall rate and is suitable as the final

prediction model.
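The idea of linearly combining individual models and then customizing the classification threshold can be sketched as follows. This is a minimal illustration only: the weights, probabilities, and the 0.4 threshold are hypothetical values and do not come from Zhang (2021).

```python
def combined_probability(probabilities, weights):
    """Weighted average of the churn probabilities from several models."""
    total = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total


def classify(probability, threshold=0.5):
    """Label a player as churned (1) when the probability reaches the threshold."""
    return 1 if probability >= threshold else 0


# Hypothetical outputs of a decision tree, a random forest, and a neural network
model_probs = [0.35, 0.50, 0.40]
weights = [0.2, 0.5, 0.3]  # hypothetical combination weights

p = combined_probability(model_probs, weights)
default_label = classify(p)                    # default threshold of 0.5
sensitive_label = classify(p, threshold=0.4)   # lower threshold favors recall
```

Lowering the threshold flags more players as likely churners, which raises recall at the cost of more false positives.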

2 METHODOLOGIES

2.1 Dataset Collection and Preprocessing

The dataset is typically sourced from the server logs

of game companies, containing player behavior data

such as gameplay time, login frequency, session

duration, and more. Data preprocessing includes
handling missing values, addressing outliers, and
normalizing the data. Additionally, it is necessary to

define the observation period and behavioral metrics

to facilitate model training and evaluation.
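The three preprocessing steps named above can be sketched in a few lines. This is an illustrative example under stated assumptions: median imputation, clipping to a fixed range, and z-score standardization are common choices rather than the methods of any cited study, and the playtime values are made up.

```python
from statistics import mean, median, pstdev


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def clip_outliers(values, lower, upper):
    """Clamp every value into the closed interval [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical daily playtime in minutes; None marks a missing log entry.
playtime = [30, 45, None, 60, 2000, 50]
filled = fill_missing(playtime)
clipped = clip_outliers(filled, 0, 240)   # 240 minutes is an assumed cap
scaled = standardize(clipped)
```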

In this review, the experimental data are drawn
from multiple research papers. The datasets are
introduced in Table 1.

Table 1: Dataset.

Literature name | Dataset
Early churn prediction with personalized targeting in mobile social games (Milošević et al., 2017) | Data from two million players of the game "Top Eleven - Be a Football Manager".
Modelling Early User-Game Interactions for Joint Estimation of Survival Time and Churn Probability (Bonometti et al., 2019) | Data from six different games from Square Enix Limited: Hitman Go (hmg), Hitman Sniper (hms), Just Cause 3 (jc3), Just Cause 4 (jc4), Life is Strange (lis), and Life is Strange: Before the Storm (lisbf). The dataset contains 80,000 churners and 80,000 non-churners for each game. The observation period is defined as the number of game sessions a user has from their first interaction until reaching a specific cutoff point.
Exploiting limited players' behavioral data to predict churn in gamification (Loria & Marconi, 2021) | Data from a gamified application (Play&Go) and a casual online F2P game (TagPro).
Match experiences affect interest: Impacts of matchmaking and performance on churn in a competitive game (Kang et al., 2024) | Data from the multiplayer online competitive game "Everyone's Marble": game logs covering 42 days and more than 260,000 players, including records of over 6 million matches in total.
Predicting subscription renewal using binary classification in World of Warcraft (Hossain et al., 2023) | The "Warcraft Avatar History (WoWAH)" dataset, which includes records from January 2006 to January 2009. Player data were recorded every 10 minutes.

2.2 Feature Extraction and Selection

Feature extraction is a critical step in churn

prediction. Commonly used features include

gameplay time, session frequency, win rate, and

activity index. Feature selection is performed using

methods like Principal Component Analysis (PCA) or

Recursive Feature Elimination (RFE) to identify the

most predictive features.

The study "Match experiences affect interest:

Impacts of matchmaking and performance on churn

in a competitive game" (Kang et al., 2024) defined

variables related to matchmaking experience and

player performance, including average skill gap, skill

gap variance, win rate, winning streak rate, and losing

streak rate. By extracting and selecting these features,

the impact of match experience on churn was

analyzed.

2.3 Basic Model Introduction

Model training uses various machine learning

algorithms for churn prediction. Common evaluation

metrics include Accuracy, AUC, F1 Score, etc.

Firstly, this article will introduce several basic

machine learning models.

2.3.1 Logistic Regression (LR)

Logistic regression is a fundamental but effective

binary classification model commonly used for churn

prediction. Its advantages lie in the simplicity of the

model, ease of explanation, and good performance in

many application scenarios. However, because it
models a linear decision boundary, logistic
regression cannot capture complex nonlinear
relationships between features, which limits its
performance on complex datasets.
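How logistic regression turns behavior features into a churn probability can be shown with a small sketch. The coefficients and feature values here are hand-picked, hypothetical numbers, not parameters fitted to any dataset discussed in this review.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def churn_probability(features, weights, bias):
    """Linear score of the features passed through the logistic function."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients for two standardized features:
# playtime and login frequency (more activity lowers the churn score).
weights = [-0.8, -0.5]
bias = 0.2

active_player = churn_probability([1.5, 1.0], weights, bias)
idle_player = churn_probability([-1.5, -1.0], weights, bias)
```

Because the score is a weighted sum, each coefficient can be read directly as the direction and strength of a feature's effect, which is the source of the model's interpretability.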

2.3.2 Decision Tree (DT) and Random Forest (RF)

Decision Tree is a tree-structured model that predicts

outcomes by iteratively dividing the dataset into

smaller subsets. It is easy to understand and interpret,

capable of handling various data types, and is suitable

for scenarios that require strong interpretability and

do not have strict requirements on data distribution.

Random forests improve prediction performance and

robustness by constructing multiple decision trees

and averaging them.

Decision trees and random forests are two

methods that work well with complex data and

capture nonlinear relationships between features.

2.3.3 Gradient Boosting (GB)

Gradient boosting is a boosting method that
gradually reduces prediction errors by sequentially
constructing multiple weak learners (such as decision
trees), each one correcting the errors of its
predecessors. Its advantage is high predictive
accuracy: it can capture complex data patterns and is
suitable for handling

nonlinear relationships. It is best suited to scenarios
that require high prediction accuracy and can tolerate
longer training times.
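The stepwise error reduction described above can be illustrated with a deliberately small sketch. It assumes a single numeric feature, squared-error loss (so each weak learner is fitted to the current residuals), and one-split regression stumps. Production churn models normally use a library implementation with a classification loss, and the data below are invented.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (squared error).

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, rounds=20, learning_rate=0.3):
    """Start from the mean, then add stumps fitted to the remaining residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump(x) for p, x in zip(preds, xs)]

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    return predict


# Hypothetical data: days since last login and whether the player churned.
days_inactive = [1, 2, 3, 8, 9, 10]
churned = [0, 0, 0, 1, 1, 1]
predict = gradient_boost(days_inactive, churned)
```

Each round shrinks the remaining residual by a factor controlled by the learning rate, which is why more rounds mean both higher accuracy on the training data and longer training.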

2.3.4 Naive Bayes (NB)

Naive Bayes is a straightforward probabilistic

classifier grounded in Bayes' theorem, assuming that

features are independent of each other. Its advantages

include fast computation and good performance on

small datasets. It is suitable for scenarios where

feature independence is relatively strong.
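In symbols, the independence assumption lets the classifier factor the likelihood feature by feature. The notation is generic and not taken from the cited studies: y is the class (churned or retained) and x_1, ..., x_n are the behavior features.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y),
\qquad
\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

Each factor P(x_i | y) can be estimated separately from counts or a simple per-feature distribution, which explains both the speed of the method and its sensitivity to strongly correlated features.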

2.3.5 ElasticNet (EN)

ElasticNet combines the advantages of ridge
regression and LASSO regression and controls model
complexity through two regularization parameters. It
handles multicollinearity effectively and is suitable
for high-dimensional data, particularly in scenarios
where both variable selection and model
interpretability need to be considered.
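One common way to write the ElasticNet objective is given here; the scaling constant and the parameterization of the two penalties vary between textbooks and software packages, so this form is an assumption rather than the definition used in the cited study. Here lambda_1 weights the LASSO (L1) penalty and lambda_2 weights the ridge (L2) penalty.

```latex
\hat{\beta} = \arg\min_{\beta} \;
\frac{1}{2n} \lVert y - X\beta \rVert_2^2
+ \lambda_1 \lVert \beta \rVert_1
+ \lambda_2 \lVert \beta \rVert_2^2
```

Setting lambda_2 = 0 recovers LASSO and setting lambda_1 = 0 recovers ridge regression; the L1 term drives some coefficients to exactly zero (variable selection), while the L2 term stabilizes the estimates when features are correlated.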

2.3.6 The Multilayer Perceptron

The multilayer perceptron (MLP) is a basic neural network structure suitable for modeling nonlinear relationships.

• MLPr (Multilayer Perceptron Regressor) is a neural network regression model that uses an MLP to achieve nonlinear mapping.

• MLPc (Multilayer Perceptron Classifier) is a neural network classification model that uses an MLP to achieve nonlinear classification.

Both variants can handle complex nonlinear relationships and are suitable for large-scale datasets, particularly in scenarios that require high accuracy and where large amounts of data are available.

These models predict the probability that a player
will churn by analyzing historical behavior data.

3 EXPERIMENTAL ANALYSIS AND DISCUSSION

3.1 Interpretation of Evaluation Indicators

Table 2 shows the various evaluation indicators and

their corresponding meanings.

Table 2: Interpretation of evaluation indicators.

Evaluation Indicators | Interpretation
SMAPE (Symmetric Mean Absolute Percentage Error) | Indicates the disparity between the model's predicted value and the actual value. The lower the value, the better the model performs in estimating survival time.
F1 Score | Represents the harmonic mean of precision and recall. The higher the value, the better the model performs in estimating churn probability.
AUC | Reflects the overall effectiveness of the model in classification tasks. A higher value signifies better classification performance.
Precision | The percentage of samples predicted as positive by the model that are genuinely positive.
Recall | The proportion of actual positive examples that are correctly identified as positive.
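The indicators in Table 2 can be computed directly from predictions, as the following sketch shows. Two assumptions are worth flagging: SMAPE is implemented in one common variant (absolute error divided by the mean of the absolute actual and predicted values, reported in percent), and AUC is computed by pairwise comparison, which is only practical for small samples. The labels and scores are invented.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = churned)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (0 to 200)."""
    terms = []
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2
        terms.append(0.0 if denom == 0 else abs(p - a) / denom)
    return 100 * sum(terms) / len(terms)


# Hypothetical churn labels and predictions for eight players
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```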

3.2 Multiple Machine Learning Algorithms

3.2.1 Literature Review

Miloš Milošević et al. proposed a method for

predicting early customer churn in mobile social

games and preventing it using personalized push

notifications (Milošević et al., 2017). The system

consists of two stages: churn prediction (framed as a

binary classification task) and churn prevention

through personalized push notifications. The study

utilized data from the game "Top Eleven - Be a

Football Manager," involving two million players,

and trained several models, including LR, DT, RF,

NB, and GB.

The models used the player's first-day behavior

data to predict churn, with churn defined as a player

not returning to the game within 14 days of their last

activity. Players predicted to churn were sent
personalized push notifications based on the game
features they had interacted with most, as well as
notifications encouraging them to explore core game
features they had not yet interacted with. Among the tested models,

Gradient Boosting performed the best. The

personalized notifications were shown to effectively

reduce churn by up to 28%.
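The 14-day churn definition used above can be turned into a labeling rule along the following lines. This is a sketch under an assumption: a player is labeled as churned when at least 14 days pass between the last recorded activity and the end of the observation window. The exact operationalization in the cited study may differ, and the dates are invented.

```python
from datetime import date

CHURN_WINDOW_DAYS = 14  # inactivity window stated in the text


def label_churn(activity_dates, observed_until, window_days=CHURN_WINDOW_DAYS):
    """Return True if the player has been inactive for the whole window."""
    last_active = max(activity_dates)
    gap = (observed_until - last_active).days
    return gap >= window_days


# Hypothetical login dates for two players, observed until 20 January 2024
lapsed = label_churn([date(2024, 1, 1), date(2024, 1, 3)], date(2024, 1, 20))
active = label_churn([date(2024, 1, 1), date(2024, 1, 15)], date(2024, 1, 20))
```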

3.2.2 Model Performance

Below is the performance of models across different

evaluation metrics (AUC, Precision, Recall, F1

Score).

Figure 1: Model Performance. (Photo/Picture credit:

Original).

According to Figure 1, the Gradient Boosting algorithm had the highest AUC, reaching 0.83, indicating the best overall predictive ability. The Random Forest algorithm follows closely with an AUC of 0.80, and logistic regression also performs well at 0.79. The AUCs of Naive Bayes and Decision Tree are relatively low, at 0.73 and 0.67, respectively.

In terms of precision, the random forest algorithm outperformed the others, reaching 0.76. The precision of gradient boosting and logistic regression is also relatively high, at 0.75 and 0.73, respectively. Naive Bayes has the lowest precision, at 0.66.

In terms of recall, Naive Bayes performed best, reaching 0.89, which indicates its effectiveness in identifying churned users. Gradient boosting and logistic regression also perform well, with recall of 0.84 and 0.83, respectively. The decision tree has the lowest recall, at 0.68.

Gradient boosting achieved the highest F1 score, 0.76, indicating the best balance between precision and recall. The F1 scores of random forest and logistic regression are both 0.74. Naive Bayes has a lower F1 score of 0.69: although its recall is high, its precision is low.

3.2.3 Summary

In summary, the Gradient Boosting algorithm performs very well on all metrics, especially AUC and F1 score, indicating that it is the most effective for churn prediction. The Random Forest algorithm also performs strongly, especially in precision, but its recall is slightly lower than that of gradient boosting. Logistic regression performs well overall and is a simple and effective choice. Naive Bayes performs well in recall, but because of its low precision, its overall F1 score is not high. Decision trees perform relatively poorly on all metrics and may not be suitable for this type of churn prediction task. Therefore, gradient boosting or random forest is recommended for churn prediction, as they achieve the best balance between precision and recall.

3.3 Bifurcating Model

3.3.1 Literature Review

Bonometti et al. conducted three experiments to

model early user interactions in games for joint

estimation of survival time and churn probability

(Bonometti et al., 2019). The study compared various

models, including EN, MLPr, LR, and MLPc, across

different datasets.

Experiment One focused on evaluating the

performance of EN and MLPr in estimating survival

time, and LR and MLPc in predicting churn

probability. The results indicated that MLPr

outperformed EN in survival time estimation, while

MLPc surpassed LR in churn prediction.

Experiment Two expanded the dataset to include

all data points, rather than summary statistics, to test

model performance. The unfolded data slightly

improved survival time estimation, particularly with

the MLPr model, and enhanced the performance of

the MLPc model in churn prediction.

Experiment Three introduced the newly proposed
Bifurcating Model (BM) using the unfolded data. BM
outperformed the other models in both survival time
and churn probability estimation, achieving lower
SMAPE values for survival time and higher F1 scores
for churn prediction.

These findings suggest that incorporating early

user interaction metrics and explicit time structures,

such as those used in LSTM models, can significantly

improve the accuracy and reliability of predicting

future player engagement and churn.

3.3.2 Model Comparison

This paper collates the experimental data from that study
and presents them in the following summary charts.

Figure 2: Based on SMAPE. (Photo/Picture credit:

Original).

Figure 2 shows the performance of three

models—ElasticNet, MLPr, and the Bifurcating

Model—across different games. The figure indicates

that the Bifurcating Model exhibits the most stable

performance across all games and has the lowest

SMAPE (Symmetric Mean Absolute Percentage

Error), meaning it has the smallest prediction error.

The ElasticNet model generally performs worse than

the other two models in most games, with particularly

high variability in the hms game. The MLPr model

shows significant improvement over ElasticNet, but

its performance in some games, such as jc3 and lis, is

slightly inferior to that of the Bifurcating Model.

Figure 3: Based on F1 Score. (Photo/Picture credit:

Original).

Figure 3 displays the performance of Logistic

Regression, MLPc, and the Bifurcating Model across

different games. The Bifurcating Model consistently

achieves high F1 Scores in all games, demonstrating

its stability and accuracy in predicting player churn.

Logistic Regression shows significant variability in

performance across different games, with notably low

F1 Scores in the hmg and lisbf games, indicating poor

predictive accuracy. The MLPc model outperforms

Logistic Regression, especially in the hms and jc3

games, but still does not match the performance of the

Bifurcating Model in all games.

3.3.3 Summary

By comparing the SMAPE and F1 values across

different games, it is clear that the Bifurcating Model

excels in both estimating survival time and predicting

churn probability. In contrast, the ElasticNet and

Logistic Regression models perform relatively

poorly, particularly when faced with different game

types, showing instability. The MLP models perform

well in some cases but are generally outperformed by

the Bifurcating Model.

3.4 Random Forest

3.4.1 Literature Review

Enrica Loria and Annapaola Marconi used data from

gamification applications (Play&Go) and casual

online F2P games (TagPro) (Loria & Marconi, 2021).

A random forest model was trained on gamified

application data and validated on F2P game data,

testing different sampling methods to select player

records for training the model. Participation
features such as frequency of use, game actions, and
integration were used, and the study analyzed how
these features, in both their original and embedded
forms, affect performance. The research results indicate that these

features can be effectively used for churn prediction,

and through the visualization and interpretation of the

tree, it was found that the time invested by players in

the game is directly proportional to their likelihood of

churn. Game time and usage frequency have been

found to be significant predictors of churn.

3.4.2 Model Performances

The following is the performance of the different
selection methods across several evaluation metrics
(MSE: Mean Squared Error, RMSE: Root Mean Squared
Error, MAE: Mean Absolute Error, R2: R-Squared).

Figure 4: Regression Metrics. (Photo/Picture credit: Original).

Figure 5: Classification Metrics (Part 1). (Photo/Picture credit: Original).

Figure 6: Classification Metrics (Part 2). (Photo/Picture credit: Original).

Upon observing Figure 4, it is evident that the

Gaussian Selection method performs the best across

all regression metrics (MSE, RMSE, MAE, and R2),

indicating its superior ability in predicting churn time.

In contrast, the performance of Random Selection and

Uniform Selection is relatively similar, but Gaussian

Selection significantly outperforms both of these

methods.

Looking at Figures 5 and 6, it becomes clear that

Gaussian Selection also excels in classification

metrics (Accuracy, Precision, Recall, and F1 Score).

Additionally, the classification Receiver Operating

Characteristic Curve-Area Under the Curve (ROC-

AUC) for Gaussian Selection is higher than that of the

other methods, highlighting its strong discriminative

power in classification tasks.
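For reference, the four regression metrics named above (MSE, RMSE, MAE, and R2) can be computed as in the following sketch. The survival-time values are invented for illustration and are unrelated to the results in Figure 4.

```python
import math


def regression_metrics(actual, predicted):
    """Return MSE, RMSE, MAE, and R-squared for paired observations."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    ss_res = sum(e * e for e in errors)
    r2 = 1 - ss_res / ss_tot if ss_tot else 0.0
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}


# Hypothetical days until churn: observed versus predicted
observed_days = [10, 20, 30, 40]
predicted_days = [12, 18, 33, 39]
metrics = regression_metrics(observed_days, predicted_days)
```

Lower MSE, RMSE, and MAE indicate smaller errors, whereas a higher R2 indicates that the model explains more of the variance in churn time.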

3.4.3 Summary

In summary, the Gaussian Selection method shows

optimal performance across most metrics,

particularly in terms of accuracy and precision in both

regression and classification tasks. While the Random

Selection method also performs well, it is slightly

inferior to Gaussian Selection. The Uniform Selection

method, however, performs the worst across all

metrics, making it less suitable as a selection method

for churn prediction.

3.5 Matching Algorithm

3.5.1 Literature Review

Hyunjae Kang et al. conducted a comprehensive

analysis of the relationship between players' matching

experience and churn in the competitive game

"Everyone's Marble" by utilizing a fixed effects

model (Kang et al., 2024). Analyzing match records

from over 260,000 players across 42 days,

encompassing more than 6 million matches, the study

explored how matching mechanisms, competition

outcomes, and player performance influence churn.

The study found that matching with stronger

opponents increases the likelihood of churn, while

matching with weaker opponents reduces it.

Large differences in opponent skill levels were
associated with higher churn, whereas frequent
and consecutive victories were found to reduce
attrition. Additionally, the study highlighted that

consecutive defeats have varying effects on churn

depending on the player's skill level.

These findings offer both theoretical insights and

practical implications for understanding user churn in

competitive games and provide valuable guidance for

game developers aiming to enhance user retention.

3.5.2 Algorithms Impacts

The following are the impacts of different matching

algorithms and performance metrics on player churn.

Matchmaking Algorithm A refers to the Elo

Rating System. The Elo Rating System is a classic

probabilistic model used for matching players based

on their skill levels. In the literature, the Elo Rating

System is employed to improve matchmaking quality,

allowing players to compete against others with

similar skill levels.
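The Elo Rating System mentioned above rests on two short formulas: an expected score derived from the rating difference, and a rating update proportional to the surprise of the result. The sketch below uses the conventional scale of 400 and an assumed K-factor of 32; the parameters used in the game studied by Kang et al. are not stated here.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_rating(rating, expected, actual, k=32):
    """Move the rating toward the actual result (1 win, 0.5 draw, 0 loss)."""
    return rating + k * (actual - expected)


# Two hypothetical players with equal ratings; player A wins.
e_a = expected_score(1500, 1500)
new_a = update_rating(1500, e_a, actual=1.0)
new_b = update_rating(1500, 1.0 - e_a, actual=0.0)
```

A matchmaker built on these ratings pairs players whose expected scores are close to 0.5, which is one way to keep the skill gap small.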

Figure 7: Impacts of matching algorithms. (Photo/Picture credit: Original).

Matchmaking Algorithm B refers to the TrueSkill

System. TrueSkill is an improvement over the Elo

Rating System, enhancing the precision and flexibility

of skill ratings, with the goal of providing a fairer

matchmaking experience for players.

Performance Metric A refers to Win Rate. Win

Rate represents the percentage of games a player wins

over a certain period and is a key indicator of player

performance. The literature suggests that a higher

Win Rate can reduce player churn.

Performance Metric B refers to Winning Streak

Rate and Losing Streak Rate. Winning Streak Rate is

the proportion of consecutive wins within a day,

while Losing Streak Rate is the proportion of

consecutive losses. These metrics are used to analyze

the impact of winning or losing streaks on player

churn.
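The performance metrics defined above can be computed from one day's match results roughly as follows. The streak definition is an assumption made for this sketch: a match counts toward a streak when it repeats the outcome of the match immediately before it, and the count is divided by the number of matches that day. The cited study may define the rates differently.

```python
def win_rate(results):
    """Share of matches won; results is a list of booleans (True = win)."""
    return sum(results) / len(results)


def streak_rate(results, outcome):
    """Share of matches that repeat `outcome` from the preceding match."""
    if len(results) < 2:
        return 0.0
    repeats = sum(
        1 for prev, cur in zip(results, results[1:])
        if prev == outcome and cur == outcome
    )
    return repeats / len(results)


# Hypothetical results for one player on one day, in match order
day = [True, True, False, False, False, True]
wins = win_rate(day)
winning_streak_rate = streak_rate(day, True)
losing_streak_rate = streak_rate(day, False)
```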

Based on Figure 7, the following findings and

conclusions can be drawn:

Matchmaking Algorithms A and B primarily

influence the reduction of player churn rates. The

churn rate reduction for Matchmaking Algorithm A is

0.05, while for Matchmaking Algorithm B, it is 0.07.

Performance Metrics A and B primarily influence the

increase in player engagement. The increase in player

engagement for Performance Metric A is 0.10, and

for Performance Metric B, it is 0.12.

Matchmaking Algorithm B is more effective in

reducing player churn compared to Matchmaking

Algorithm A, indicating that a better matchmaking

algorithm can significantly decrease player churn.

Performance Metric B shows the best performance in

increasing player engagement, followed by

Performance Metric A. This suggests that optimizing

game performance and providing a better player

experience can significantly enhance player

engagement.

3.5.3 Summary

In conclusion, game developers should prioritize

improving matchmaking algorithms, particularly

efficient ones like Matchmaking Algorithm B, to

reduce churn rates. Additionally, attention should be

given to optimizing performance metrics, especially

Performance Metric B, to increase player engagement

and satisfaction.

Overall, by improving matchmaking algorithms

and optimizing performance metrics, game

companies can effectively reduce player churn rates

and increase player engagement, thereby enhancing

the overall performance of the game and user

retention.

3.6 Binary Classification Model

3.6.1 Literature Review

Md. Yousuf Hossain et al. compared data on player

history, class, and guild over time in World of

Warcraft (WoW) and estimated the probability of

renewal (Hossain et al., 2023). This study used LR,

DT, linear discriminant analysis (LDA), quadratic

discriminant analysis (QDA), RF, and NB algorithms

as binary classifiers, and trained and tested the models

using a 10-fold cross validation method to compare

their performance. The research results indicate that

the RF and DT algorithms provide the highest

accuracy (over 90%) in all cases. This is

accomplished by utilizing levels and guilds as

parameters, as well as extracting active days and total

game time from various time periods of the previous

month.
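The 10-fold cross-validation procedure mentioned above partitions the data so that every sample is used for testing exactly once. A minimal sketch of the index bookkeeping follows; it uses contiguous folds for clarity, whereas in practice the samples are usually shuffled first, and the sample count of 25 is arbitrary.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(25, k=10))
```

A model is trained on each `train` split and scored on the matching `test` split, and the ten scores are averaged to compare classifiers.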

3.6.2 Model Performances

The following outlines the performance of various

machine learning algorithms across different

evaluation metrics.

As Tables 3-5 show, RF and DT performed best in the subscription renewal prediction task, especially when using data over a longer time range (such as 1-30 days or including 16-30 days), with an accuracy of over 96%. Despite the longer training and scoring time of RF, its predictive performance is significantly better than that of the other models, especially in key performance metrics such as accuracy, recall, F1 score, and AUC-ROC. DT achieves a good balance between performance and computational efficiency, making it suitable for resource-limited environments. In contrast, LDA and LR perform moderately well, while NB and QDA perform poorly, especially when dealing with complex subscription renewal prediction tasks.

Table 3: Subscription Renewal (1-30 Days) (Hossain et al., 2023).

Model  Accuracy  Precision  Recall    F1 score  AUC-ROC
RF     0.906488  0.838773   0.77583   0.901711  0.890999
DT     0.9018    0.821999   0.778555  0.898253  0.855967
LR     0.854728  0.865415   0.530796  0.797524  0.703723
LDA    0.857118  0.733282   0.596764  0.828296  0.699635
QDA    0.843145  0.683084   0.625792  0.828696  0.731092
NB     0.824722  0.653181   0.636632  0.819722  0.711776

Table 4: Subscription Renewal (16-30 Days) (Hossain et al., 2023).

Model  Accuracy  Precision  Recall    F1 score  AUC-ROC
RF     0.94192   0.893297   0.879788  0.941413  0.963323
DT     0.937915  0.879352   0.88301   0.938064  0.920068
LR     0.854933  0.852507   0.533484  0.798971  0.704236
LDA    0.856416  0.72849    0.599294  0.828767  0.699565
QDA    0.842045  0.681928   0.631716  0.829468  0.724304
NB     0.800357  0.631051   0.646273  0.805868  0.714753

Table 5: Subscription Renewal (1-15, 16-30, 1-30 Days) (Hossain et al., 2023).

Model  Accuracy  Precision  Recall    F1 score  AUC-ROC
RF     0.966464  0.916762   0.906857  0.954368  0.978386
DT     0.965346  0.912339   0.916879  0.955468  0.965803
LR     0.867574  0.826457   0.546759  0.807272  0.703612
LDA    0.886416  0.72849    0.599294  0.828767  0.69565
QDA    0.777983  0.618308   0.652812  0.792026  0.7045
NB     0.651834  0.589744   0.620902  0.632991  0.708269
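The results above were obtained with 10-fold cross-validation. The following sketch illustrates that protocol in plain Python under stated assumptions: the data are synthetic, the single feature (active days) and the one-threshold decision stump are stand-ins chosen for brevity, and none of the function names come from the cited study, which used full classifiers such as RF and DT.

```python
import random
from typing import List, Sequence


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_stump(xs: Sequence[float], ys: Sequence[int]) -> float:
    """Choose the threshold that maximizes training accuracy for the rule
    'predict renewal (1) if x >= threshold'."""
    best_t, best_correct = min(xs), -1
    for t in sorted(set(xs)):
        correct = sum(1 for x, y in zip(xs, ys) if (x >= t) == (y == 1))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


def cross_validate(xs: Sequence[float], ys: Sequence[int], k: int = 10) -> List[float]:
    """Return the held-out accuracy of the stump on each of the k folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(xs), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        # Fit on the other k-1 folds only, then score on the held-out fold.
        t = fit_stump([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        correct = sum(1 for i in test_idx if (xs[i] >= t) == (ys[i] == 1))
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic data: players with more active days are more likely to renew.
rng = random.Random(42)
active_days = [rng.randint(0, 30) for _ in range(200)]
renewed = [1 if d + rng.gauss(0, 5) > 12 else 0 for d in active_days]

fold_accuracies = cross_validate(active_days, renewed, k=10)
mean_accuracy = sum(fold_accuracies) / len(fold_accuracies)
```

Averaging over the ten held-out folds gives a performance estimate that does not depend on a single train/test split, which is what makes the models in the tables comparable.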

3.6.3 Summary

Overall, RF and DT are the preferred models for scenarios that require high accuracy, where they can provide strong support for developers. DT in particular can provide effective predictions in situations where computing resources are limited.

4 DISCUSSION

Current research on player churn prediction faces many challenges, including data acquisition and processing, model complexity, and adaptability. Further research can explore optimizing data utilization, improving model accuracy, and expanding the scope of application, providing more refined and efficient solutions for the gaming industry.

In the future, the following improvements can be considered:

• Multisource Data Integration
Integrate data from various sources (such as social media interactions and in-game purchase behavior) into the predictive model to enhance its accuracy and comprehensiveness.

• Real-Time Prediction and Intervention
Develop real-time churn prediction systems capable of instantly identifying high-risk players and implementing personalized intervention strategies to increase player retention.

MLSCM 2024 - International Conference on Modern Logistics and Supply Chain Management

• Long-Term Behavior Analysis
Extend the research timeframe to analyze long-term player behavior patterns and identify key factors that influence long-term engagement and loyalty.

• Application of Machine Learning and Deep Learning
Investigate more sophisticated machine learning and deep learning algorithms, including reinforcement learning and graph neural networks, to improve the performance and adaptability of predictive models.
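As one possible illustration of multisource data integration, the sketch below joins per-source feature tables on a player ID. The source names, feature names, and values are hypothetical, and filling missing values with a default of 0.0 is an assumption made for simplicity rather than a recommendation from the reviewed studies.

```python
from typing import Dict

# player_id -> {feature_name: value}
FeatureTable = Dict[str, Dict[str, float]]


def merge_sources(sources: Dict[str, FeatureTable], default: float = 0.0) -> FeatureTable:
    """Join feature tables from several sources on player ID.

    Feature names are prefixed with the source name to avoid collisions,
    and features a player lacks in some source are filled with `default`.
    """
    # Every column any source provides, so all players share one schema.
    columns = sorted({f"{src}.{feat}"
                      for src, table in sources.items()
                      for feats in table.values()
                      for feat in feats})
    players = sorted({pid for table in sources.values() for pid in table})
    merged: FeatureTable = {}
    for pid in players:
        row = {col: default for col in columns}
        for src, table in sources.items():
            for feat, value in table.get(pid, {}).items():
                row[f"{src}.{feat}"] = value
        merged[pid] = row
    return merged


# Made-up example with three hypothetical sources.
merged = merge_sources({
    "in_game": {"p1": {"active_days": 12.0, "playtime_hours": 30.5},
                "p2": {"active_days": 3.0, "playtime_hours": 2.0}},
    "purchases": {"p1": {"spend": 9.99}},
    "social": {"p2": {"guild_messages": 14.0}},
})
```

In practice, a missing value may carry different meaning than a zero (for example, a player with no linked social account), so the default fill would need to be revisited for real data.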

5 CONCLUSION

This paper highlights the significance of player churn

prediction in the gaming industry. Whether for

subscription-based games or free-to-play games,

accurately predicting when players are likely to churn

is crucial for developing effective retention strategies.

This not only helps game developers maintain player

engagement but also increases revenue and extends

the lifecycle of the game.

Through the analysis of multiple research papers, this review shows that using player behavior data for prediction is highly effective. Various algorithms, such as Random Forest, Decision Tree, and Logistic Regression, are widely applied and show high prediction accuracy. Among them, Random Forest and Decision Tree models generally perform best, especially when handling complex player behavior data. Additionally, the time frame of the data has a significant impact on prediction accuracy: using data over a longer period (e.g., 30 days) typically improves accuracy, indicating that long-term player behavior patterns are better indicators of future engagement than short-term behaviors.

Furthermore, the paper underscores the crucial

role of incorporating personalized factors and social

interactions in predictive models. For instance,

factors such as a player's social network within the

game, guild membership, and the number of active

days can significantly influence the probability of

churn. This suggests that predictive models should

not only consider individual player behaviors but also

integrate social interactions to improve prediction

precision. Optimizing matchmaking algorithms and

game performance can effectively reduce player

churn and enhance engagement.

The paper also discusses how to make effective

predictions in scenarios with limited data. By

optimizing data preprocessing and selecting

appropriate features, accurate predictions can be

achieved even with limited data, which is particularly

important for emerging games or situations with

limited data collection. Future research should focus

on multisource data integration, real-time prediction

and intervention, long-term behavior analysis, and the

application of advanced machine learning and deep

learning algorithms to further enhance the

performance and adaptability of predictive models.

Through these optimizations, game developers can

more accurately predict player behavior, develop

more effective user retention strategies, and improve

game performance and user satisfaction.

REFERENCES

Wang, C., Rao, C., Hu, F., Xiao, X., & Goh, M., 2024. Risk assessment of customer churn in telco using FCLCNN-LSTM model. Expert Systems with Applications, 248, 123352. https://doi.org/10.1016/j.eswa.2024.123352

Ting, M. P., & Cho, D. M., 2024. What drives user churn in serious games? An empirical examination of the TAM, SOR theory, and game quality in Chinese cultural heritage games. Entertainment Computing, 52, 100758. https://doi.org/10.1016/j.entcom.2024.100758

Wang, J., 2018. Research on Online Game Player Churn Prediction Based on COX Proportional Hazards Model (Master's thesis, Jinan University).

Tu, M. H., 2020. Study on the Factors Influencing Mobile Game User Churn (Master's thesis, Shanghai University of Finance and Economics). https://doi.org/10.27296/d.cnki.gshcu.2020.002449

Zhang, W., 2021. Analysis on Online Game Player Churn Prediction (Master's thesis, Beijing University of Technology). https://doi.org/10.26935/d.cnki.gbjgu.2021.000617

Milošević, M., Živić, N., & Andjelković, I., 2017. Early churn prediction with personalized targeting in mobile social games. Expert Systems with Applications, 83, 326-332. https://doi.org/10.1016/j.eswa.2017.04.014

Bonometti, V., Ringer, C., Hall, M., Wade, A. R., & Drachen, A., 2019. Modelling early user-game interactions for joint estimation of survival time and churn probability. In Proceedings of the IEEE Conference on Games 2019. IEEE. https://doi.org/10.1109/CoG.2019.8847993

Loria, E., & Marconi, A., 2021. Exploiting limited players' behavioral data to predict churn in gamification. Electronic Commerce Research and Applications, 47, 101057. https://doi.org/10.1016/j.elerap.2021.101057

Kang, H., Suh, C., & Kim, H. K., 2024. Match experiences affect interest: Impacts of matchmaking and performance on churn in a competitive game. Heliyon, 10(3), e24891. https://doi.org/10.1016/j.heliyon.2024.e24891

Hossain, M. Y., Azizi, E., & Zaman, L., 2023. Predicting subscription renewal using binary classification in World of Warcraft. Entertainment Computing, 44, 100522. https://doi.org/10.1016/j.entcom.2023.100522
