Predicting Employee Turnover Using Personality Assessment: A Data-Driven Approach

Reynold Navarro Mazo [2,3][a], Maurício Pereira Nogueira Júnior, Arthur Rodrigues Soares de Quadros [1,3][b], Alessandro Vieira [3][c] and Wladmir Cardoso Brandão [1,3][d]

[1] Department of Computer Science, Pontifical Catholic University of Minas Gerais (PUC Minas), Brazil

[2] Institute of Mathematical and Computer Sciences (ICMC), University of São Paulo (USP), São Carlos, SP, Brazil

[3] Data Science Laboratory (SOLAB), Sólides S.A., Belo Horizonte, MG, Brazil

Keywords:

Machine Learning (ML), Artificial Intelligence, Employee Turnover, Predictive Analytics, Human Resources (HR), Behavioral Profiling, PACE Framework.

Abstract:

Employee turnover represents a persistent challenge for organizations seeking to maintain stability, retain institutional knowledge, and control costs. Traditional predictive models often rely on static employee records and demographic variables, providing limited insight into the nuanced behavioral patterns that precede workforce attrition. This study leverages the PACE Behavioral Profile Mapping (BPM) framework to integrate behavioral features into a machine learning–based turnover prediction pipeline. Clustering techniques were employed to ensure model generalization for specific company clusters, and hyperparameter optimization was performed using Optuna. The resultant CatBoost models demonstrated notable improvements in predicting turnover risk, particularly for employees at higher risk of departure, when PACE-based behavioral indicators were incorporated. These findings suggest that a more comprehensive characterization of employee tendencies, beyond conventional demographic and historical measures, can enhance the identification of at-risk individuals. By adopting behaviorally informed analytics, organizations may achieve more targeted and effective retention strategies, ultimately supporting more stable workforce management.

1 INTRODUCTION

Employee turnover presents a persistent challenge for

organizations, inﬂuencing productivity, operational

continuity, and ﬁnancial outcomes. Understanding

and predicting turnover risk is crucial for Human Re-

source Management (HRM) to implement effective

retention strategies. Despite the growing availability

of employee data, predicting turnover remains com-

plex, as it often involves multifaceted behavioral and

contextual factors that are not easily quantiﬁable.

One approach that offers a structured basis for categorizing and evaluating employee behaviors is the PACE Behavioral Profile Mapping (BPM) framework (Vieira et al., 2023). By classifying individuals into four archetypes (Planner, Analyst, Communicator, and Executor), this framework provides an empirical foundation for examining complex behavioral patterns within organizational contexts. Given its relative novelty, PACE remains an underexplored reference point for systematically investigating subtle interactions and adaptive responses, and it may thereby contribute to a more nuanced understanding of workplace dynamics.

[a] https://orcid.org/0009-0003-2011-3715

[b] https://orcid.org/0009-0004-9593-7601

[c] https://orcid.org/0000-0002-9921-3588

[d] https://orcid.org/0000-0002-1523-1616

While PACE provides a theoretically comprehen-

sive framework, its practical effectiveness for predict-

ing turnover has yet to be rigorously evaluated. Be-

havioral proﬁling is inherently complex, and the ex-

tent to which archetypal classiﬁcations and situational

indicators – such as energy, morale, and ﬂexibility –

translate into actionable insights for turnover predic-

tion remains an open question.

This study investigates the integration of the PACE Behavioral Profile Mapping (BPM) framework with machine learning techniques to predict employee turnover using real-world data. By carefully engineering features to address ethical concerns, grouping companies based on shared turnover patterns for enhanced model generalization, and prioritizing recall to effectively identify at-risk employees, this research seeks to assess the PACE framework’s practical utility and limitations. The findings aim to inform HR practitioners and researchers on whether such behavioral profiling approaches can complement existing methods, ultimately supporting more proactive and targeted retention strategies.

Mazo, R. N., Nogueira Júnior, M. P., Soares de Quadros, A. R., Vieira, A. and Brandão, W. C.

Predicting Employee Turnover Using Personality Assessment: A Data-Driven Approach.

DOI: 10.5220/0013436400003929

In Proceedings of the 27th International Conference on Enterprise Information Systems (ICEIS 2025) - Volume 1, pages 582-592

ISBN: 978-989-758-749-8; ISSN: 2184-4992

Our contributions in this paper are as follows:

• We propose a methodology for predicting em-

ployee turnover using behavioral proﬁling, inte-

grating the PACE framework with machine learn-

ing models.

• We explore clustering techniques to group compa-

nies based on turnover patterns, enhancing model

scalability and generalizability.

• We provide insights into the challenges of

turnover prediction and propose recommenda-

tions for improving data collection and model in-

terpretability in future applications.

2 BACKGROUND

In this section, we specify terms related to this study,

such as Employee Turnover, Machine Learning core

concepts, and the features utilized in our methodol-

ogy.

2.1 Employee Turnover

Employee turnover remains a critical concern for or-

ganizations worldwide due to its substantial impact

on ﬁnancial performance, operational efﬁciency, and

overall effectiveness (Hancock et al., 2013). High

turnover rates can lead to increased recruitment and

training costs, loss of organizational knowledge, and

decreased employee morale. Consequently, under-

standing the factors that inﬂuence employees’ inten-

tions to leave has become a paramount focus for aca-

demics and professionals in HRM. By exploring these

elements, organizations can develop effective strate-

gies to enhance employee retention, improve produc-

tivity, and maintain a competitive edge in their respec-

tive industries. However, to implement these strate-

gies successfully, it is crucial to identify potential

turnover risks early. Early identiﬁcation of employees

who may be considering departure allows organiza-

tions to proactively address underlying issues, tailor

retention initiatives, and ultimately mitigate the ad-

verse effects associated with high turnover rates.

2.2 Machine Learning

2.2.1 Supervised Machine Learning

Supervised learning is particularly suitable for turnover classification because it leverages labeled data to predict outcomes. In contrast, unsupervised machine learning uses only the input features X, without corresponding labels Y, and is typically used for clustering or anomaly detection, which is less applicable in this context. In scenarios where labeled data is scarce, semi-supervised learning can be an alternative: it uses a small portion of labeled data along with a larger set of unlabeled data to improve learning accuracy.

Our study focuses on supervised machine learning

methods, speciﬁcally utilizing algorithms like Ran-

dom Forest and Logistic Regression, which will be

detailed in subsequent sections. Typically, the dataset

is split into training and testing subsets, often in an 80-

20 ratio. The model is trained on 80% of the data and

tested on the remaining 20% to evaluate its perfor-

mance. The supervised learning approach is advanta-

geous when ample labeled data is available, allowing

the model to learn intricate patterns that distinguish

employees who are likely to leave from those who are

not. This method is also adaptable for making predic-

tions on new data, enabling organizations to identify

at-risk employees proactively.
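The 80-20 split described above can be sketched with the Python standard library alone. This is a minimal illustration, not the pipeline used in the study: the function name `stratified_split`, the toy labels, and the decision to stratify by class (so that the share of leavers is preserved in both subsets) are all assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx), keeping the class ratio similar in both subsets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle inside each class before cutting
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (hypothetical): 1 = employee left, 0 = employee stayed.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

Stratifying matters for imbalanced targets such as turnover, because a purely random cut can leave the test subset with very few positive cases.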

2.2.2 Training and Testing

In supervised machine learning, the distinction between training and testing data is fundamental to developing predictive models. These models learn from data pairs {X, Y}, where X represents the input features (such as employee demographics, job satisfaction scores, and performance metrics) and Y denotes the target variable, in this case the turnover status, i.e., whether an employee stays or leaves.

The training dataset is used to teach the model the underlying patterns and relationships between X and Y. Once trained, the model’s efficacy is evaluated using the testing dataset, which assesses its ability to generalize and make accurate predictions on new, unseen data. This process ensures that the model is not merely memorizing the training data but is capable of predicting turnover intentions in a real-world setting.

2.3 Features for Turnover Classiﬁcation

At the organizational level, subtle indicators related

to the composition of the workforce, the structure of

internal roles and units, the general patterns of staff

tenure, the degree to which systematic behavioral as-

sessments are embedded in the corporate routine, and

the cumulative record of prior separations collectively

offer a nuanced perspective on turnover dynamics.

Analyzing these interconnected signals allows for a

more comprehensive understanding of how organiza-

tional characteristics and embedded practices shape

both retention outcomes and the likelihood of depar-

tures.

The PACE BPM framework (Vieira et al., 2023)

offers a probabilistic approach to quantifying indi-

vidual behavioral tendencies through four distinct

archetypes: Planner, Analyst, Communicator, and

Executor. These archetypes provide a structured

framework for understanding employee behavior pat-

terns in response to workplace stimuli, which makes

them particularly useful for predicting turnover. Each

archetype embodies a unique response rhythm, high-

lighting speciﬁc personality traits and work prefer-

ences. For example, “Planners” prioritize stabil-

ity, meticulous preparation, and adherence to rules,

while “Executors” excel in fast-paced, dynamic envi-

ronments requiring decisive and energetic responses.

Meanwhile, “Communicators” thrive in collaborative,

socially engaging settings, and “Analysts” emphasize

precision, organization, and methodical approaches

within structured contexts.

This framework enables organizations to evaluate

not only the dominant behavioral traits of employ-

ees but also situational indicators such as energy lev-

els, morale, ﬂexibility, and motivation. Such granular

proﬁling provides valuable insights into how employ-

ees engage with their work environment and adapt to

varying demands. By incorporating these behavioral

insights as features together with the non-behavioral

features into machine learning models, organizations

could enhance their ability to predict turnover risks,

ensuring alignment between employees’ behavioral

tendencies and their roles within the workplace. This

integration facilitates more proactive and targeted re-

tention strategies, contributing to organizational sta-

bility and efﬁciency.

3 RELATED WORKS

Employee turnover remains a critical challenge for organizations, significantly affecting financial performance, operational stability, and long-term organizational growth. High turnover rates increase recruitment and training costs, lead to the loss of institutional knowledge, and lower employee morale. These impacts are particularly pronounced in industries such as manufacturing and services, where skilled labor is integral to maintaining productivity. Veglio et al. (2024) highlight that turnover in multinational companies disrupts strategic continuity and operational efficiency, emphasizing the need for tailored retention strategies informed by predictive analytics.

Artificial intelligence (AI) and advanced analytics have become essential tools in addressing employee turnover. Gopinath and Appavu alias Balamurugan (2024) explore the role of human resource analytics, demonstrating how machine learning models can integrate behavioral, demographic, and organizational data to predict turnover risk. Their study shows that AI can provide actionable insights, enabling HR professionals to implement targeted interventions for at-risk employees. However, they also acknowledge challenges such as the potential for data bias and the need for transparent algorithms to maintain employee trust.

Marín Díaz et al. (2023) examine the integration of traditional employee metrics with behavioral data for predicting turnover. Their work highlights the importance of incorporating factors like job satisfaction and career development opportunities into predictive models. Similarly, Morelli et al. (2024) investigate clustering techniques to group employees based on turnover propensity, enhancing the interpretability of predictive models.

Moreover, probabilistic approaches to behavioral profiling are gaining attention in turnover research. Gopinath and Appavu alias Balamurugan (2024) emphasize that behavioral profiling frameworks, such as those mapping archetypes or response rhythms, can improve turnover predictions by integrating qualitative insights with quantitative metrics. This complements the findings of Veglio et al. (2024), who stress the importance of blending behavioral insights with organizational data to address the complexities of turnover dynamics in multinational contexts.

Studies on turnover prediction show that results are highly dependent on the dataset used. Park et al. (2024) achieved 78.5% accuracy using XGBoost on the Korean 2019 Graduate Occupation Mobility Survey. Lim et al. (2024) achieved over 90% accuracy on the IBM Employee Attrition dataset using a hybrid KNN-based model. Chakraborty et al. (2021) reached 90% with Random Forest and 80% with Naive Bayes. Al Akasheh et al. (2024) achieved 92.5% on the same IBM dataset using Knowledge Convolutional Networks. Still on the IBM dataset, Yiğit and Shourabizadeh (2017) used data mining for over 80% average precision, and Ozdemir et al. (2020) used various methods for 75% precision.

To our knowledge, no studies directly use BPM information for turnover prediction. Tsaousoglou (2021) argues that psychometric assessment is crucial to ensure employees can learn and maintain job performance. Li et al. (2022) discuss how different profiles have varying motivations, requiring different retention strategies. Similarly, Emerson et al. (2023) show how stress and burnout, measured through psychometric assessments, contribute to student dropout rates, suggesting that different profiles respond differently to incentives.

All mentioned studies collectively underscore the

value of AI-driven analytics in turnover prediction

while highlighting key challenges, including data

quality, ethical concerns, and model interpretability.

This paper builds on prior work by integrating behav-

ioral proﬁling through the PACE framework with ma-

chine learning models, prioritizing recall to enhance

the identiﬁcation of at-risk employees. By addressing

these challenges, this study contributes to the devel-

opment of scalable, actionable, and ethical solutions

for managing employee turnover.

4 METHODOLOGY

This study investigates the application of machine

learning techniques for employee turnover predic-

tion using the PACE behavioral proﬁling framework.

The methodology aims to balance predictive accu-

racy, ethical considerations, and practical applicabil-

ity, while addressing challenges related to data vari-

ability across different organizations and ensuring

generalizability across a wide range of enterprise con-

texts.

The methodology is outlined in Figure 1, which

illustrates the key steps that guided the experimental

process. Each step in the ﬂowchart represents a crit-

ical phase in the development and evaluation of the

proposed approach, ensuring a systematic and struc-

tured workﬂow.

4.1 Modeling Strategy

4.1.1 Feature Engineering and Selection

Feature engineering was guided by both practical and

ethical considerations. In particular, features with

potential ethical implications, such as those related

to demographic attributes or hierarchical status, were

purposefully excluded to maintain fairness and com-

pliance with regulatory standards. Instead, derived

features concentrated on organizational and behav-

ioral metrics, including the proportional relationship

between an individual’s tenure and the average tenure

within the company, the density of turnover occur-

rences over speciﬁed intervals, and aspects derived

from the PACE framework. By favoring these con-

structs, the feature set aimed to retain predictive util-

ity while mitigating biases and ethical concerns.
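The two derived organizational features mentioned above can be illustrated with a short sketch. The exact definitions used in the study are not given in the text, so the formulas below are assumptions: `tenure_ratio` divides an employee's tenure by the company's mean tenure, and `turnover_density` counts separations per 30-day period within a window. All names and values are hypothetical.

```python
from datetime import date
from statistics import mean


def tenure_ratio(employee_tenure_days, company_tenures_days):
    """Individual tenure divided by the company's average tenure."""
    return employee_tenure_days / mean(company_tenures_days)


def turnover_density(separation_dates, window_start, window_end):
    """Separations per 30-day period inside [window_start, window_end] (assumed definition)."""
    n_events = sum(window_start <= d <= window_end for d in separation_dates)
    window_days = (window_end - window_start).days + 1
    return n_events / (window_days / 30)


# Hypothetical company: tenures in days and recorded separation dates.
company_tenures = [200, 400, 600]
separations = [date(2024, 1, 10), date(2024, 2, 20), date(2024, 6, 1)]

ratio = tenure_ratio(300, company_tenures)
density = turnover_density(separations, date(2024, 1, 1), date(2024, 3, 30))
print(round(ratio, 2), round(density, 2))
```

Both quantities are relative to the company rather than absolute, which is in line with the stated goal of avoiding demographic or hierarchical attributes while keeping organizational context.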

Correlation analysis was performed to identify and remove multicollinear features. Additionally, feature importance was evaluated using preliminary model tests, with preference given to interpretable features.

Figure 1: Methodology Flowchart. This diagram represents the steps and processes used in the proposed methodology.

4.1.2 Clustering Strategy

Initially, turnover prediction models were developed

for individual companies or sectors, requiring a min-

imum volume of data to enable training, as shown

in the study (Adeusi et al., 2024), which applies ma-

chine learning techniques to predict turnover in high-

stress sectors. However, this approach proved imprac-

tical for smaller companies due to insufﬁcient data

and the recurring cost of tailoring models to each

new organization. To overcome these limitations,

a uniﬁed modeling approach was explored, group-

ing companies with shared characteristics to main-

tain individual-level relevance while reducing mod-

eling costs.

The grouping was primarily driven by the score_company variable, which represents a turnover density rate adjusted for time, reflecting company-specific turnover characteristics. This approach aimed to achieve meaningful predictions, especially in contexts where the goal is proactive retention interventions, even if this means prioritizing recall over accuracy to capture a larger number of potential turnover cases.

The clustering strategy aimed to identify more uniform training subsets by grouping companies based on a combination of organizational-level metrics. Initially, a single metric, score_company, served as the foundation; however, more balanced and representative groupings emerged when additional features were considered in conjunction. The final grouping therefore integrated a broader range of features, encompassing organizational volume, indicators of workforce separations, temporal retention measures, and the original score_company. To determine the number of clusters, the Elbow Method was employed, selecting the point beyond which additional clusters yield only marginal reductions in within-cluster variance, which helps avoid overfitting. The final clustering was performed using the k-means algorithm, which efficiently partitioned the data into meaningful, internally cohesive groups, thereby facilitating more robust and generalizable turnover prediction.
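The combination of k-means and the Elbow Method can be sketched in plain Python. This is an illustrative toy, not the study's implementation: the two-dimensional points stand in for hypothetical company-level features, the farthest-point initialization is a deterministic simplification, and the rule in `pick_elbow` (stop when one more cluster removes less than 10% of the k = 1 inertia) is an assumed stand-in for reading the elbow off a plot.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic farthest-point initialization (a simplification)."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm; returns (centroids, inertia)."""
    centroids = init_centroids(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    inertia = sum(min(squared_distance(p, c) for c in centroids) for p in points)
    return centroids, inertia


def pick_elbow(inertias, min_gain=0.10):
    """First k whose next step reduces inertia by less than min_gain * inertia at the smallest k."""
    ks = sorted(inertias)
    threshold = min_gain * inertias[ks[0]]
    for k, k_next in zip(ks, ks[1:]):
        if inertias[k] - inertias[k_next] < threshold:
            return k
    return ks[-1]


# Hypothetical company-level features, e.g. (turnover score, relative tenure).
points = [
    (0.10, 0.20), (0.12, 0.25), (0.08, 0.22),
    (0.50, 0.60), (0.52, 0.58), (0.48, 0.63),
    (0.90, 0.10), (0.92, 0.12), (0.88, 0.08),
]
inertias = {k: kmeans(points, k)[1] for k in range(1, 7)}
elbow_k = pick_elbow(inertias)
print(elbow_k)
```

On real data the features would normally be standardized first, since k-means distances are sensitive to scale.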

4.1.3 Machine Learning Models

Various ML models were tested, including Long Short-Term Memory (LSTM) neural networks, Support Vector Machine (SVM), Random Forest, XGBoost, and CatBoost. The following strategies were implemented:

• Hyperparameter Tuning. Optuna (Akiba et al.,

2019) was used to optimize model hyperparame-

ters.

• Data Imbalance Management. The Synthetic Minority Oversampling Technique (SMOTE) (Chawla et al., 2011) was combined with manually tuned class weights to address imbalances and ensure meaningful predictions without overly passive or alarmist results.

4.2 Evaluation Metrics

Predictive performance was evaluated using the recall

metric, which prioritizes the identiﬁcation of true pos-

itives over precision and overall accuracy. Recall is

calculated as follows:

\[
\text{Recall} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}}
\]

This approach ensures the model captures the

maximum number of employees at risk of turnover,

aligning with the goal of enabling proactive retention

strategies. While precision and overall accuracy were

considered, the trade-off with recall was deemed ap-

propriate for the speciﬁc application. Precision is cal-

culated as follows:

\[
\text{Precision} = \frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}}
\]

Additionally, the F1-score, which balances preci-

sion and recall, was also considered. It is deﬁned as

the harmonic mean of precision and recall:

\[
\text{F1} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}
\]

Accuracy, which measures the proportion of cor-

rectly predicted instances, is given by:

\[
\text{Accuracy} = \frac{\text{True Positives (TP)} + \text{True Negatives (TN)}}{\text{Total Predictions}}
\]

Metrics were examined at the aggregate level for

both experimental setups, incorporating class balanc-

ing and oversampling techniques. Following this ini-

tial assessment, the best-performing model was sub-

jected to hyperparameter optimization (Akiba et al.,

2019). Subsequently, metrics were re-evaluated to

compare the model’s performance when utilizing the

complete set of features against the conﬁguration ex-

cluding the PACE framework features.
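The four metrics defined in this section can be computed directly from confusion-matrix counts. The sketch below uses hypothetical counts; the helper names are illustrative. As a consistency check, the F1 formula is also applied to the precision (39%) and recall (62%) reported for LSTM, class 1, in the first cluster row of Table 1, which reproduces the tabulated F1-score of 48% after rounding.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Recall, precision, F1 and accuracy for the positive (turnover) class."""
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "recall": recall,
        "precision": precision,
        "f1": f1_score(precision, recall),
        "accuracy": accuracy,
    }


# Hypothetical confusion counts for one cluster.
metrics = classification_metrics(tp=60, fp=90, fn=40, tn=210)
print({name: round(value, 3) for name, value in metrics.items()})

# F1 recomputed from a precision/recall pair reported in Table 1.
table_f1 = f1_score(0.39, 0.62)
print(round(table_f1, 2))
```

The example also shows why recall and precision can diverge sharply under class imbalance: 60% of leavers are caught, yet only 40% of the flagged employees actually leave.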

5 RESULTS

5.1 Clustering Analysis

To improve the predictive accuracy of turnover mod-

els, a clustering strategy was implemented to group

companies based on shared characteristics derived

from key turnover metrics. The Elbow Method was

used to determine the optimal number of clusters, bal-

ancing model complexity and performance. As shown

in the Elbow Curve (Figure 2), the point of inﬂection

occurs at k = 6, suggesting that six clusters provide an

appropriate representation of the data while avoiding

overﬁtting.

Each cluster was analyzed individually, with sepa-

rate predictive models evaluated within these groups.

This clustering strategy allowed the models to ac-

count for inter-company variations in turnover behav-

ior and better adapt to speciﬁc organizational con-

texts. By tailoring the prediction models to clusters,

the following advantages were observed:

1. Increased Recall. Models trained within clusters

demonstrated higher recall for the minority class

(class 1), indicating better identiﬁcation of em-

ployees at high turnover risk.

2. Balanced Metrics. Accuracy and F1-scores were

more evenly distributed across clusters, reﬂect-

ing improved generalization compared to a single,

non-clustered model.

3. Interpretable Results. Clustering enabled a bet-

ter understanding of company-speciﬁc patterns,

such as turnover density, regional impacts, and

operational constraints.

The resulting performance metrics for the mod-

els (LSTM, SVM, XGBoost, and CatBoost) were

summarized across all clusters, detailing precision,

recall, and F1-scores for each class. This demon-

strates the efﬁcacy of leveraging clustering as a pre-

processing step to improve the predictive power of

machine learning models in turnover prediction tasks.

Moreover, the tailored approach ensures actionable

insights for companies with varying sizes and oper-

ational characteristics.

5.2 Experimentation with Weight Balancing

Weight balancing was employed to address the inher-

ent class imbalance in turnover prediction, where the

minority class (employees likely to leave) is often un-

derrepresented. This approach aimed to enhance re-

call for the minority class (class 1) without signiﬁ-

cantly compromising precision for the majority class

(class 0). Table 1 summarizes the performance met-

rics for LSTM, SVM, XGBoost, and CatBoost mod-

els across all clusters when weight balancing was ap-

plied.
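To make the idea of weight balancing concrete, the sketch below derives inverse-frequency class weights and applies them in a weighted log loss. The study reports manually tuned weights whose values are not given, so the inverse-frequency heuristic here is only a common starting point assumed for illustration; the function names and the 80/20 toy labels are hypothetical.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_log_loss(labels, probabilities, weights):
    """Binary cross-entropy where each sample is scaled by its class weight."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(labels, probabilities):
        w = weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


# Toy imbalance (hypothetical): 80 stayers (0) and 20 leavers (1).
labels = [0] * 80 + [1] * 20
weights = balanced_class_weights(labels)
print(weights)

# With these weights both classes carry the same total weight in the loss.
uninformed_loss = weighted_log_loss(labels, [0.5] * len(labels), weights)
print(round(uninformed_loss, 4))
```

Raising the minority weight makes each missed leaver more costly during training, which tends to push recall for class 1 up at the expense of precision.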

The results demonstrate that weight balancing improved the recall and F1-scores for class 1 in all models, particularly in clusters with higher turnover density. This adjustment ensures the models can better identify employees at risk of leaving, supporting proactive retention strategies while maintaining adequate performance for class 0. The experimentation highlights the utility of weight balancing in mitigating class imbalance challenges in turnover prediction.

Figure 2: Elbow Method for KMeans Clustering.

Among the models evaluated, CatBoost consistently achieved the best overall performance across clusters, with high precision and recall for both classes, balancing predictive accuracy and generalization effectively.

5.3 Experimentation with SMOTE

SMOTE (Chawla et al., 2011) was applied to address

class imbalance by generating synthetic samples for

the minority class (employees likely to leave). The

primary objective of using SMOTE was to enhance

recall for class 1 while maintaining balanced perfor-

mance across other metrics. Table 2 presents the per-

formance metrics for LSTM, SVM, XGBoost, and

CatBoost models across all clusters when SMOTE

was employed.

Contrary to expectations, the use of SMOTE did

not result in consistent improvements in recall for

class 1. While some models, such as CatBoost, main-

tained relatively stable performance across clusters,

the recall for class 1 generally decreased compared

to models trained without SMOTE. For instance, Cat-

Boost achieved a maximum recall of only 42% for

class 1 in cluster 4, whereas recall for other models,

such as XGBoost, dropped to as low as 24% in certain

clusters. These results indicate that SMOTE’s over-

sampling may have introduced noise or distorted the

feature space, leading to suboptimal performance for

the minority class.

Table 1: Combined Performance Metrics Across Clusters for Turnover Prediction using Weight Balance.

No.  Model     Cluster  Class  Accuracy (%)  Precision (%)  Recall (%)  F1-Score (%)
1    LSTM               0      62.00         80.00          61.00       69.00
                        1                    39.00          62.00       48.00
                        0      59.00         91.00          57.00       70.00
                        1                    23.00          69.00       34.00
                        0      63.00         88.00          64.00       74.00
                        1                    28.00          62.00       39.00
                        0      61.00         84.00          61.00       71.00
                        1                    31.00          59.00       41.00
                        0      59.00         86.00          59.00       70.00
                        1                    28.00          61.00       38.00
                        0      62.00         91.00          61.00       73.00
                        1                    23.00          67.00       34.00
2    SVM                0      58.00         81.00          54.00       65.00
                        1                    37.00          68.00       48.00
                        0      61.00         91.00          60.00       72.00
                        1                    23.00          67.00       35.00
                        0      61.00         88.00          61.00       72.00
                        1                    28.00          64.00       39.00
                        0      63.00         85.00          63.00       72.00
                        1                    33.00          63.00       43.00
                        0      58.00         87.00          56.00       68.00
                        1                    27.00          66.00       39.00
                        0      63.00         91.00          62.00       74.00
                        1                    23.00          66.00       34.00
3    XGBoost            0      64.00         78.00          69.00       73.00
                        1                    40.00          51.00       45.00
                        0      84.00         89.00          92.00       91.00
                        1                    48.00          40.00       43.00
                        0      77.00         86.00          86.00       86.00
                        1                    40.00          41.00       41.00
                        0      78.00         85.00          87.00       86.00
                        1                    51.00          48.00       50.00
                        0      73.00         85.00          80.00       82.00
                        1                    36.00          45.00       40.00
                        0      81.00         89.00          89.00       89.00
                        1                    37.00          37.00       37.00
4    CatBoost           0      63.00         81.00          62.00       70.00
                        1                    40.00          63.00       49.00
                        0      70.00         93.00          70.00       80.00
                        1                    30.00          71.00       42.00
                        0      72.00         89.00          76.00       82.00
                        1                    36.00          58.00       44.00
                        0      72.00         86.00          76.00       81.00
                        1                    42.00          58.00       49.00
                        0      68.00         87.00          70.00       77.00
                        1                    33.00          59.00       42.00
                        0      73.00         91.00          76.00       83.00
                        1                    29.00          57.00       39.00

Although SMOTE did not meet its intended goal

of improving recall for class 1, it slightly increased

precision and F1-scores for class 0 in most mod-

els, indicating better representation of the majority

class. This highlights a potential trade-off between

oversampling and predictive accuracy, emphasizing

the need for tailored strategies to handle imbalanced

datasets in turnover prediction.

Our experiments showed that applying SMOTE

to address class imbalance did not improve perfor-

mance. This is likely due to the complex structure

of the turnover dataset. The minority class (employ-

ees likely to leave) overlaps signiﬁcantly with the ma-

jority class. SMOTE’s synthetic samples did not ac-

curately capture the data distribution, adding noise

and increasing overﬁtting. The high dimensionality

of our features also complicated SMOTE’s interpo-

lation, reducing the discriminative power of the aug-

mented data. These ﬁndings suggest weight balancing

may be more effective than SMOTE for this dataset’s

class imbalance.
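The interpolation step at the core of SMOTE helps explain the behavior discussed above. The sketch below is a simplified, standard-library illustration rather than the implementation used in the experiments: `smote_like_samples`, the neighbor count `k`, and the toy minority points are assumptions. Each synthetic sample is placed on the segment between a minority point and one of its nearest minority neighbors.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_like_samples(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class points in a two-feature space.
minority = [(0.2, 0.9), (0.3, 0.8), (0.25, 0.7), (0.6, 0.4)]
synthetic = smote_like_samples(minority, n_new=6)
print(len(synthetic))
```

Because every synthetic point lies between two existing minority points, any majority-class samples located between those points end up surrounded by synthetic positives. When the classes overlap heavily, as suggested for this dataset, the interpolation can therefore blur the decision boundary instead of sharpening it.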

5.4 Best Model Selection: CatBoost with Class Weights and Hyperparameter Tuning

Based on the results of previous experiments, the Cat-

Boost model with class weight balancing emerged as

the most suitable for turnover prediction. This model

demonstrated consistent performance across clusters,

excelling in recall for the minority class (class 1)

while maintaining robust metrics for the majority

class (class 0). To further enhance its predictive ca-

pability, hyperparameter tuning was performed using

the Optuna framework, which systematically searches

for the optimal combination of parameters to maxi-

mize performance.

5.4.1 Hyperparameter Tuning with Optuna

Optuna (Akiba et al., 2019) is a state-of-the-art frame-

work for automated hyperparameter optimization.

The tuning process focused on improving recall for

class 1, a critical metric for turnover prediction. Key

parameters optimized during the process included:

• Learning Rate. Controlled the step size at each

iteration of model training.

• Depth. Deﬁned the maximum depth of the de-

cision trees in the model, impacting its ability to

capture complex patterns.

• L2 Regularization. Mitigated overﬁtting by pe-

nalizing large weights, ensuring generalization.

• Bagging Temperature. Adjusted the variability

in data subsampling during training to balance di-

versity and stability.

The optimization objective prioritized maximiz-

ing recall for class 1 while ensuring balanced perfor-

mance across other metrics.
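The search over the four parameters listed above can be illustrated with a plain random search written against the standard library. This is explicitly a stand-in: it does not use Optuna or its samplers, the ranges in `SEARCH_SPACE` are assumed, and `toy_objective` is a placeholder for the real step of training CatBoost and returning class 1 recall on validation data. The parameter names follow CatBoost's `learning_rate`, `depth`, `l2_leaf_reg`, and `bagging_temperature`.

```python
import math
import random

# Assumed search ranges; the ranges used in the study are not reported.
SEARCH_SPACE = {
    "learning_rate": ("log", 1e-3, 0.3),
    "depth": ("int", 4, 10),
    "l2_leaf_reg": ("log", 1.0, 10.0),
    "bagging_temperature": ("float", 0.0, 1.0),
}


def sample_params(rng):
    """Draw one configuration from SEARCH_SPACE."""
    params = {}
    for name, (kind, low, high) in SEARCH_SPACE.items():
        if kind == "int":
            params[name] = rng.randint(low, high)
        elif kind == "log":
            params[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            params[name] = rng.uniform(low, high)
    return params


def toy_objective(params):
    """Placeholder score; a real objective would train a model and return recall."""
    lr_penalty = abs(math.log10(params["learning_rate"]) + 1.5) / 3
    depth_penalty = abs(params["depth"] - 6) / 20
    return 1.0 - lr_penalty - depth_penalty


def random_search(objective, n_trials=50, seed=0):
    """Keep the best-scoring configuration over n_trials random draws."""
    rng = random.Random(seed)
    best_params, best_score = None, -math.inf
    for _ in range(n_trials):
        params = sample_params(rng)
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(toy_objective)
print(sorted(best_params))
```

Frameworks such as Optuna replace the uniform draws with adaptive samplers and can prune unpromising trials, but the overall loop (sample, evaluate, keep the best) has the same shape.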

5.4.2 Performance Metrics of the Best Model

The ﬁnal performance metrics of the CatBoost model,

presented in Table 3, underscore the model’s capac-

ity to handle turnover prediction and lend support

to the hypothesis that incorporating the PACE be-

havioral features enhances predictive performance.

When compared to the conﬁguration excluding PACE

features, the full feature set consistently achieved

higher accuracy and improved F1-scores for the posi-

tive (turnover) class across all examined clusters. This

pattern suggests that behavioral indicators captured

by the PACE framework provide actionable informa-

tion that augments traditional organizational metrics,

ultimately contributing to more robust and nuanced

turnover predictions.

Key observations for the best model when utiliz-

ing the complete feature set include:

• Class 0 (Majority Class). The model consis-

tently achieved high precision and recall across

all clusters, with F1-scores ranging from approxi-

mately 75% to 81%.

• Class 1 (Minority Class). Recall values im-

proved notably, reaching up to 76% in certain

clusters, while F1-scores varied between 42% and

56%. These results indicate a substantive increase

in the detection of actual turnover cases.

These findings emphasize the potential value of incorporating behavioral features derived from profiling frameworks like PACE into turnover prediction efforts. After hyperparameter tuning, the CatBoost model performed better with the PACE-based inputs than with traditional organizational metrics alone. Although the overall improvements in minority class detection were moderate, the behavioral indicators raised class 1 F1-scores in every cluster and class 1 recall in four of the six, supporting more informed and targeted retention strategies.

5.4.3 Insights and Implications

The CatBoost model, with its tuned parameters and the behavioral features derived from the PACE framework, achieved robust performance for the majority class and improved recall for the minority class. This balance is particularly valuable for actionable turnover predictions, as it enables HR professionals to identify at-risk employees more accurately while preserving precision for those likely to remain.

The experiment highlights the importance of combining class weighting strategies, hyperparameter optimization frameworks such as Optuna, and behavioral profiling features in complex predictive tasks. These findings underscore the potential of tailored, behaviorally informed machine learning approaches for addressing organizational challenges while keeping models interpretable and actionable for HR decision-making.
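
To make the class weighting idea concrete, the sketch below computes inverse-frequency ("balanced") weights, so that the minority turnover class counts more during training. This is one common heuristic shown as an assumption; the exact weighting scheme used in the experiment is not restated here, and the label counts are invented for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical cluster with 80 employees who stayed (0) and 20 who left (1).
labels = [0] * 80 + [1] * 20
print(balanced_class_weights(labels))
```

With these counts the minority class receives four times the weight of the majority class, which is the kind of emphasis that trades some precision for higher class 1 recall.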


Table 2: Combined Performance Metrics Across Clusters for Turnover Prediction using SMOTE.

No.  Model     Cluster  Class  Accuracy (%)  Precision (%)  Recall (%)  F1-Score (%)
1    LSTM               0      58.00         79.00          58.00       66.00
                        1                    36.00          60.00       45.00
                        0      70.00         88.00          76.00       81.00
                        1                    23.00          40.00       29.00
                        0      65.00         87.00          67.00       75.00
                        1                    28.00          56.00       37.00
                        0      62.00         83.00          65.00       73.00
                        1                    31.00          54.00       39.00
                        0      60.00         84.00          62.00       71.00
                        1                    27.00          55.00       36.00
                        0      69.00         89.00          72.00       80.00
                        1                    24.00          49.00       32.00
2    SVM                0      62.00         78.00          66.00       71.00
                        1                    38.00          53.00       44.00
                        0      68.00         88.00          72.00       79.00
                        1                    23.00          46.00       30.00
                        0      66.00         86.00          70.00       77.00
                        1                    28.00          50.00       36.00
                        0      66.00         83.00          70.00       76.00
                        1                    33.00          50.00       40.00
                        0      66.00         84.00          70.00       77.00
                        1                    29.00          48.00       36.00
                        0      71.00         90.00          74.00       81.00
                        1                    25.00          50.00       34.00
3    XGBoost            0      69.00         75.00          84.00       79.00
                        1                    43.00          31.00       36.00
                        0      83.00         89.00          92.00       90.00
                        1                    45.00          34.00       39.00
                        0      80.00         84.00          93.00       88.00
                        1                    43.00          24.00       31.00
                        0      75.00         83.00          85.00       84.00
                        1                    44.00          40.00       42.00
                        0      78.00         83.00          92.00       87.00
                        1                    44.00          24.00       31.00
                        0      83.00         88.00          93.00       90.00
                        1                    40.00          27.00       32.00
4    CatBoost           0      72.00         75.00          90.00       82.00
                        1                    51.00          25.00       34.00
                        0      83.00         88.00          93.00       90.00
                        1                    43.00          29.00       34.00
                        0      81.00         84.00          94.00       89.00
                        1                    49.00          23.00       32.00
                        0      78.00         84.00          88.00       86.00
                        1                    51.00          42.00       46.00
                        0      80.00         82.00          95.00       88.00
                        1                    52.00          20.00       29.00
                        0      84.00         88.00          94.00       91.00
                        1                    43.00          27.00       33.00

5.5 Discussions About PACE

5.5.1 Limitations for Turnover Prediction

While the PACE framework effectively assesses individual responses in specific situations, Profiler reports alone cannot directly measure performance. This limitation can be addressed by analyzing changes across an individual's previous test results, evaluating environmental demands, and measuring how well the individual adapts to them. Poor adaptation could suggest performance issues.

Since PACE is a self-report tool, accurate self-assessment is crucial. If an individual cannot accurately assess environmental demands or their own personality traits, the resulting data may not reflect true performance, leading to inaccurate conclusions.

ICEIS 2025 - 27th International Conference on Enterprise Information Systems


Table 3: Performance Metrics Across Clusters for the Best Turnover Prediction Model (CatBoost).

No.  Model                         Cluster  Class  Accuracy (%)  Precision (%)  Recall (%)  F1-Score (%)
1    All Features - CatBoost                0      68.00         86.00          66.00       75.00
                                            1                    46.00          72.00       56.00
                                            0      71.00         96.00          69.00       80.00
                                            1                    33.00          84.00       47.00
                                            0      73.00         92.00          73.00       81.00
                                            1                    38.00          73.00       50.00
                                            0      71.00         89.00          71.00       79.00
                                            1                    41.00          69.00       52.00
                                            0      72.00         92.00          71.00       80.00
                                            1                    39.00          76.00       52.00
                                            0      72.00         93.00          72.00       81.00
                                            1                    30.00          70.00       42.00
2    Non-PACE Features - CatBoost           0      65.00         84.00          63.00       72.00
                                            1                    43.00          71.00       53.00
                                            0      67.00         96.00          63.00       76.00
                                            1                    30.00          87.00       44.00
                                            0      69.00         92.00          68.00       78.00
                                            1                    35.00          75.00       48.00
                                            0      67.00         85.00          70.00       77.00
                                            1                    36.00          58.00       45.00
                                            0      69.00         91.00          68.00       78.00
                                            1                    37.00          74.00       49.00
                                            0      70.00         93.00          70.00       80.00
                                            1                    29.00          69.00       41.00

5.5.2 PACE-Based Retention Strategies

HR can use PACE results to develop targeted retention strategies. Profiler results reveal how individuals respond to specific situations, allowing HR to place employees in environments they prefer and to minimize conflict. For example, assigning an Analyst to a communication-focused role could cause stress by forcing them into undesired situations. While such assignments might be necessary at times, Profiler data helps HR avoid unsuitable reassignments that could lead to employee attrition.

6 CONCLUSIONS AND FUTURE WORK

One of the key challenges in turnover prediction is addressing ethical concerns. To mitigate the risk of biased predictions, features with potential ethical implications were excluded. State-level data (UF), for instance, was initially considered but ultimately removed because it proved both resource-intensive and inconsistent to collect.

Another challenge lies in the data itself. A significant hindrance to model development was the incomplete and inconsistent logging of employee turnover data by organizations. Encouraging companies to maintain continuous and accurate data recording practices could substantially enhance the quality of future predictive models. More reliable data would not only improve the accuracy and robustness of turnover predictions but also facilitate their broader application across varied organizational settings.

Turnover prediction is inherently complex due to the multifaceted and evolving factors influencing employee decisions. Evaluating the effectiveness of such models poses a particular challenge: successful retention strategies lower turnover rates by design, so employees who were correctly flagged as at risk but then retained appear as false positives, diminishing the apparent predictive accuracy. To address this, subsequent rounds of profiling after retention interventions can provide a more reliable gauge of their true impact. Moreover, presenting results as probability intervals rather than categorical outcomes facilitates a more consultative, analytics-driven approach, allowing HR professionals to interpret these predictions as indicative trends rather than definitive forecasts.
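
As a minimal sketch of reporting probability intervals instead of a hard class label, the function below maps a predicted turnover probability to the band that contains it. The band edges (20%, 40%, 60%, 80%) and the name `risk_band` are illustrative assumptions; an organization would choose its own bands.

```python
from bisect import bisect_right


def risk_band(probability, edges=(0.2, 0.4, 0.6, 0.8)):
    """Return the (low, high) interval that contains the predicted probability."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    bounds = (0.0, *edges, 1.0)
    index = bisect_right(edges, probability)
    return bounds[index], bounds[index + 1]


# Hypothetical model output for one employee.
low, high = risk_band(0.67)
print(f"Estimated turnover risk: {low:.0%} to {high:.0%}")
```

Reporting "60% to 80%" rather than "will leave" keeps the output framed as an indicative trend that HR can weigh alongside other information.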

For future improvements, the focus should be on enhancing the predictive capabilities of turnover models by integrating temporal data and expanding the range of features. Time-based information can reveal critical patterns and trends preceding turnover events. Key avenues for improvement include:

• Temporal Data Integration. Incorporating time-series data such as changes in employee performance metrics, tenure progression, and fluctuations in workload can provide deeper insights into the dynamics influencing turnover.

• Additional Features. Introducing new variables like employee engagement scores, satisfaction surveys, training participation, and career development opportunities can enrich the dataset and enhance model accuracy.

• Longitudinal Analysis. Employing longitudinal studies to track employee behavior over extended periods may uncover latent factors contributing to turnover, enabling more proactive interventions.

• External Factors. Including external data such as economic indicators, industry trends, and regional employment rates can help contextualize turnover patterns within the broader market environment.
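
To illustrate the temporal data integration item, the sketch below turns a short history of a per-employee metric into simple trend features (most recent change, mean change, and least-squares slope) that could be appended to a static feature set. The feature names and the example series are hypothetical and are not part of the dataset used in this study.

```python
def trend_features(values):
    """Summarize an equally spaced metric history as simple trend features."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    deltas = [later - earlier for earlier, later in zip(values, values[1:])]
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    return {
        "last_delta": deltas[-1],                # most recent period-to-period change
        "mean_delta": sum(deltas) / len(deltas), # average change per period
        "slope": sxy / sxx,                      # least-squares trend per period
    }


# Hypothetical quarterly performance ratings for one employee.
print(trend_features([4.0, 3.5, 3.0, 2.0]))
```

A steadily negative slope of this kind is the sort of pattern preceding turnover that a purely static snapshot would miss.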

By expanding the feature set and integrating temporal aspects, future models may achieve improved generalization and predictive performance. These enhancements would allow organizations to identify at-risk employees more accurately and implement targeted retention strategies, ultimately reducing turnover rates and associated costs.

REFERENCES

Adeusi, K., Amajuoyi, P., and Benjami, L. (2024). Utilizing machine learning to predict employee turnover in high-stress sectors. International Journal of Management & Entrepreneurship Research, 6:1702–1732.

Akiba, T., Sano, S., Yanase, T., Ohta, T., and Koyama, M. (2019). Optuna: A Next-generation Hyperparameter Optimization Framework.

Al Akasheh, M., Hujran, O., Malik, E. F., and Zaki, N. (2024). Enhancing the prediction of employee turnover with knowledge graphs and explainable AI. IEEE Access.

Chakraborty, R., Mridha, K., Shaw, R. N., and Ghosh, A. (2021). Study and prediction analysis of the employee turnover using machine learning approaches. In 2021 IEEE 4th International Conference on Computing, Power and Communication Technologies (GUCON), pages 1–6. IEEE.

Chawla, N. V., Bowyer, K. W., Hall, L. O., and Kegelmeyer, W. P. (2011). SMOTE: Synthetic Minority Over-sampling Technique.

Emerson, D. J., Hair Jr, J. F., and Smith, K. J. (2023). Psychological distress, burnout, and business student turnover: The role of resilience as a coping mechanism. Research in Higher Education, 64(2):228–259.

Gopinath, K. and Appavu alias Balamurugan, S. (2024). Human Resource Analytics: Leveraging Machine Learning for Employee Attrition Prediction. pages 137–158.

Hancock, J. I., Allen, D. G., Bosco, F. A., McDaniel, M., and Pierce, C. A. (2013). Meta-analytic review of employee turnover as a predictor of firm performance. Journal of Management, 39(3):573–603.

Li, H., Yuan, B., Yu, Y., Li, J., and Meng, Q. (2022). Work motivation of primary health workers in China: the translation of a measurement scale and its correlation with turnover intention. Risk Management and Healthcare Policy, pages 1369–1381.

Lim, C. S., Malik, E. F., Khaw, K. W., Alnoor, A., Chew, X., Chong, Z. L., and Al Akasheh, M. (2024). Hybrid GA–DeepAutoencoder–KNN model for employee turnover prediction. Statistics, Optimization & Information Computing, 12(1):75–90.

Marín Díaz, G., Galán Hernández, J. J., and Galdón Salvador, J. L. (2023). Analyzing Employee Attrition Using Explainable AI for Strategic HR Decision-Making. 11(22):4677.

Morelli, C., Fusai, G., and Zenti, R. (2024). Who and why will leave me? Utilizing Machine Learning-Based Models to Anticipate and Manage Employee Turnover. (4744130).

Ozdemir, F., Coskun, M., Gezer, C., and Gungor, V. C. (2020). Assessing employee attrition using classifications algorithms. In Proceedings of the 2020 the 4th International Conference on Information System and Data Mining, ICISDM '20, pages 118–122, New York, NY, USA. Association for Computing Machinery.

Park, J., Feng, Y., and Jeong, S.-P. (2024). Developing an advanced prediction model for new employee turnover intention utilizing machine learning techniques. Scientific Reports, 14(1):1221.

Tsaousoglou, K. (2021). Using psychometrics in Human Resource Management in tourist accommodation: Employees' predisposition for organizational turnover. PhD thesis, Πανεπιστήμιο Πατρών. Σχολή Διοίκησης και Οικονομίας. Τμήμα Διοίκησης . . . .

Veglio, V., Romanello, R., and Pedersen, T. (2024). Employee turnover in multinational corporations: A supervised machine learning approach.

Vieira, A. G., de Jesus, A. C. C., da Assunção Almeida de Lima, J., Mariano, R. V. R., de Castro, G. Z., and Brandão, W. C. (2023). A probabilistic mapping approach to assess the employee behavior profile. pages 1–8. ISSN: 2378-1971.

Yiğit, İ. O. and Shourabizadeh, H. (2017). An approach for predicting employee churn by using data mining. In 2017 International Artificial Intelligence and Data Processing Symposium (IDAP), pages 1–4.
