Predicting Purchase Frequency in e-Commerce: Hybrid Machine Learning Approach

Nilay İşeri, Mustafa Keskin and Onur Arda Raştak

Hepsiburada, Turkey

Keywords:

e-Commerce, Purchase Frequency Prediction, Customer Behavior, Machine Learning, Multi Stage Prediction.

Abstract:

This paper addresses the problem of predicting customer purchase frequency. We developed machine learning

models to forecast the number of purchases a user will make next month, categorizing them into three classes.

We compared multiclass classification, regression, and hybrid approaches. Our analysis shows that the most effective method is a hybrid approach that uses a binary classifier to detect the 4+ purchases class and a regression model for the remaining classes. This two-stage model improved on the single-model baseline by approximately 2% across accuracy, precision, recall, and F1, making it a practical solution for imbalanced, ordinal prediction tasks.

1 INTRODUCTION

The rapid growth of electronic commerce in recent

years has led to a massive increase in online con-

sumer activity. This surge in e-commerce usage has

produced vast amounts of customer behavior data and

intensiﬁed competition among online retailers. In this

context, the ability to predict a customer's future purchasing behavior, in particular the number of purchases they will make in the upcoming month, has become increasingly valuable. Accurate monthly purchase frequency prediction can support a range of

business objectives, including personalized marketing

(e.g., targeting high-frequency shoppers with loyalty

rewards), inventory and supply chain optimization

(forecasting product demand at the customer-segment

level), and customer lifetime value (CLV) estimation

(Oblander et al., 2020). Furthermore, effective pre-

diction of individual purchasing frequency enables

ﬁrms to proactively identify high-value customers and

deploy retention strategies before those customers

churn, thereby improving proﬁtability (Verbeke et al.,

2014).

Predicting customer purchase frequency at the

individual level poses several challenges. Unlike

subscription services, e-commerce customer relation-

ships are typically non-contractual, meaning cus-

tomers can make purchases at irregular intervals or

stop purchasing at any time without notice. This irregularity leads to highly skewed purchase frequency distributions: many customers make few or no purchases in a given period, while a small segment generates many orders. Traditional techniques from marketing science, such as RFM (Recency, Frequency,

Monetary) scoring and probabilistic models like the

Pareto/NBD and BG/NBD formulations, provide a

foundation for understanding purchase frequency and

customer value (Fader et al., 2005). However, these

classical models rely on strong statistical assump-

tions (e.g., Poisson purchase timing and exponential

dropout processes) and often struggle to incorporate

the rich features now available (such as detailed on-

line browsing behavior, product category preferences,

and multi-channel marketing touchpoints). Moreover,

in modern e-commerce datasets with millions of cus-

tomers and heterogeneous behaviors, such paramet-

ric models can face scalability and accuracy limi-

tations (Abe, 2009). These limitations have moti-

vated a shift toward data-driven machine learning ap-

proaches, such as gradient boosting models, which

can ﬂexibly learn purchasing patterns from large-

scale behavioral data without relying on predeﬁned

stochastic assumptions (Wang et al., 2023).

In light of the above, this paper focuses on developing a predictive model for monthly customer purchase frequency using boosting-based machine learning methods. The motivation for using gradient boosting is twofold: first, its proven accuracy and flexibility in similar classification/regression tasks suggest it can effectively capture the subtle factors that drive repeat purchases; second, boosting models provide feature importance metrics and explainability tools (e.g., SHAP values) that help interpret which customer attributes most strongly influence purchase frequency, offering business insights in addition to predictions.

İşeri, N., Keskin, M. and Raştak, O. A.

Predicting Purchase Frequency in e-Commerce: Hybrid Machine Learning Approach.

DOI: 10.5220/0014305900004848

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 2nd International Conference on Advances in Electrical, Electronics, Energy, and Computer Sciences (ICEEECS 2025), pages 64-68

ISBN: 978-989-758-783-2

2 RELATED WORKS

The availability of large-scale behavioral data and

advances in computational techniques have enabled

machine learning methods such as gradient boost-

ing to model user purchase behavior without relying

on restrictive parametric assumptions (Wang et al.,

2023). Algorithms such as logistic regression, ran-

dom forests, and gradient boosting have been applied

to predict outcomes ranging from repurchase propen-

sity to purchase frequency or spending.

Ensemble methods, particularly gradient boosting

trees, have emerged as the most effective for these

tasks. Song and Liu demonstrated that XGBoost im-

proves purchase prediction in e-commerce by incor-

porating diverse behavioral features (Song and Liu,

2020). Yang et al. extended this direction by combining Random Forest and LightGBM in a hybrid ensemble to address class imbalance and enhance repurchase prediction (Yang et al., 2021). These studies show that ensemble approaches can surpass traditional RFM and probabilistic models by leveraging

richer predictors and nonlinear interactions. Feature

engineering remains central, with extensions of RFM

to include variables such as customer tenure, inter-

purchase intervals, and engagement metrics further

improving predictive power. Overall, machine learn-

ing, especially ensemble learning, has become a cor-

nerstone in purchase frequency prediction.

Wang et al. proposed a user purchase behav-

ior prediction model based on XGBoost, leveraging

multi-dimensional behavioral features such as histor-

ical transaction patterns, account activity metrics, and

user segmentation tags to accurately forecast future

purchasing behavior (Wang et al., 2023). Building

on this direction, Sun et al. applied gradient boost-

ing decision trees (GBDT) and random forests to

predict key components of customer lifetime value

(CLV), particularly purchase frequency, and showed

that these models outperformed classical probabilistic

approaches such as Pareto/NBD and Pareto/GGG on

real-world retail datasets (Sun et al., 2021). In practical settings, CLV prediction is often decomposed into sub-tasks (churn probability, expected frequency, and average value), with boosting applied to each.

Overall, gradient boosting methods stand out for

their accuracy, ﬂexibility, and ability to incorporate

diverse features in frequency prediction. They con-

sistently outperform traditional models, though chal-

lenges remain in interpretability and integration with

newer techniques.

While gradient boosting dominates structured re-

tail data, newer methods are emerging to capture se-

quential dynamics and improve interpretability. Deep

learning models, such as LSTMs and transformers,

have been applied to customer transaction histories,

treating them as sequences of events. For exam-

ple, attention-based LSTMs have been used to pre-

dict high-value customer behavior, while transformer

architectures have shown advantages when long and

complex purchase cycles are involved (Lathwal and

Batra, 2024; Kim et al., 2023).

Hybrid approaches combine machine learning

with probabilistic or domain-speciﬁc modeling to bal-

ance ﬂexibility and structure. Examples include two-

stage models where neural networks predict distribu-

tional parameters for purchase counts, later reﬁned

with boosting, or reinforcement learning frameworks

that go beyond prediction to optimize marketing in-

terventions.

In sum, recent work explores deep learning for

sequential behavior, hybrid models to integrate prior

knowledge, and XAI methods for interpretability. Yet

ensemble trees, particularly boosting algorithms, re-

main the strongest baseline for purchase frequency

prediction, balancing accuracy, scalability, and trans-

parency (Grinsztajn et al., 2022). These developments

provide the foundation for our boosting-based frame-

work for next-month purchase frequency prediction.

3 DATA AND PREPROCESSING

3.1 Dataset Description

The dataset was constructed from a one-year trans-

actional history preceding the prediction month. The

target was deﬁned as the purchase frequency in June

2025, categorized into three discrete classes: (i) one

purchase, (ii) two to three purchases, and (iii) four or

more purchases (heavy buyers). The cohort included

only active customers with at least one completed or-

der in the historical period. The ﬁnal dataset con-

tained several million rows, with class distributions

reﬂecting the natural skew typically observed in e-

commerce frequency prediction tasks.
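The three-way target described above can be written as a small labelling function. This is a minimal sketch: the function name and the string labels are illustrative, and the handling of customers with zero purchases in the target month is not stated in the text, so the sketch simply rejects them.

```python
def frequency_class(purchase_count: int) -> str:
    """Map a monthly purchase count to one of the three target classes."""
    if purchase_count < 1:
        # Assumption: the text defines no class for zero purchases.
        raise ValueError("no target class is defined for fewer than 1 purchase")
    if purchase_count == 1:
        return "1"      # single purchase
    if purchase_count <= 3:
        return "2-3"    # mid-frequency
    return "4+"         # heavy buyers


labels = [frequency_class(n) for n in [1, 2, 3, 4, 9]]
print(labels)  # ['1', '2-3', '2-3', '4+', '4+']
```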

3.2 Feature Groups and Selection

Features originated from a customer-level datamart

that aggregates behavioral, transactional, and demo-

graphic signals, accumulating nearly 600 raw at-

tributes per member. For clarity, these features can

be described in the following categories:


• Historical behavior: long-term purchase frequency, cumulative monetary value, product diversity.
• Short-term recency and intensity: recency of last purchase, recent order counts, temporal trend flags.
• Monetary indicators: average basket size, cumulative spending, return/refund ratios.
• Demographic and geographic signals: customer location identifiers (city, district, postal code).
• Channel and payment preferences: order origin (e.g., app vs. web), payment method, issuing bank.
• Operational and logistic features: address stability, historical claim activity, fraud-related markers.

While this comprehensive feature space captured

diverse aspects of customer behavior, it also intro-

duced redundancy and noise. To address this, a multi-

stage feature selection pipeline was applied:

1. Filter methods: removal of constant or low-variance features and those with excessive missingness.
2. Correlation analysis: elimination of highly collinear attributes.
3. Model-based ranking: importance ranking using tree-based methods (e.g., LightGBM gain and split metrics).
4. Iterative pruning: evaluation of reduced subsets; only predictors improving stability were retained.

Through this systematic process, the original

∼600 attributes were reduced to a ﬁnal set of about

40 highly informative predictors, ensuring both model

efﬁciency and interpretability.
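The first two stages of the selection pipeline (filtering and correlation analysis) can be sketched in plain Python as follows. This is an illustration only: the thresholds (`max_missing`, `min_variance`, `max_abs_corr`), the greedy keep-first rule for collinear pairs, and the feature names are assumptions, since the text does not specify them. Stages 3 and 4 depend on a trained tree model and are not shown.

```python
import math


def missing_rate(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def variance(values):
    """Population variance over the non-missing entries."""
    xs = [v for v in values if v is not None]
    if len(xs) < 2:
        return 0.0
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def pearson(a, b):
    """Pearson correlation over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if len(pairs) < 2:
        return 0.0
    mx = sum(x for x, _ in pairs) / len(pairs)
    my = sum(y for _, y in pairs) / len(pairs)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def filter_features(columns, max_missing=0.5, min_variance=1e-8, max_abs_corr=0.95):
    # Stage 1: drop features with excessive missingness or (near-)zero variance.
    kept = [name for name, vals in columns.items()
            if missing_rate(vals) <= max_missing and variance(vals) > min_variance]
    # Stage 2: greedily keep a feature only if it is not highly
    # correlated with any feature that has already been kept.
    selected = []
    for name in kept:
        if all(abs(pearson(columns[name], columns[other])) < max_abs_corr
               for other in selected):
            selected.append(name)
    return selected


# Hypothetical feature columns, one value per customer.
columns = {
    "orders_12m": [1, 4, 2, 8, 3, 6],
    "orders_12m_copy": [2, 8, 4, 16, 6, 12],      # collinear with orders_12m
    "days_since_last_order": [30, 2, 14, 1, 40, 5],
    "always_zero": [0, 0, 0, 0, 0, 0],            # constant
    "mostly_missing": [None, None, None, None, 1.0, 2.0],
}
selected = filter_features(columns)
print(selected)  # ['orders_12m', 'days_since_last_order']
```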

3.3 Preprocessing

Preprocessing steps included:

• Missing value imputation: mode filling for categorical features and domain-specific rules for numerics.
• Data type normalization: categorical identifiers (e.g., district/postal codes) restored from float to categorical.
• Encoding: categorical features encoded using label encoding, ensuring consistency and avoiding leakage.
• Scaling: continuous features standardized to zero mean and unit variance.
• Chronological split: training/validation sets split by time, ensuring temporally consistent evaluation.

These procedures ensured a robust, representative

dataset for subsequent modeling.
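As an illustration of the chronological split and the scaling step, the sketch below separates rows by a cutoff date and standardizes a numeric feature. Fitting the scaling statistics on the training rows only is an assumption made here to keep the evaluation free of look-ahead; the text does not say how the statistics were computed. The field names and dates are hypothetical.

```python
from datetime import date


def chronological_split(rows, cutoff):
    """Rows before the cutoff go to training, the rest to validation."""
    train = [r for r in rows if r["snapshot_date"] < cutoff]
    valid = [r for r in rows if r["snapshot_date"] >= cutoff]
    return train, valid


def fit_standardizer(values):
    """Return the mean and standard deviation of the training values."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a constant feature
    return mean, std


def standardize(values, mean, std):
    return [(v - mean) / std for v in values]


# Hypothetical customer snapshots with one numeric feature.
rows = [
    {"snapshot_date": date(2025, 3, 1), "avg_basket": 100.0},
    {"snapshot_date": date(2025, 4, 1), "avg_basket": 200.0},
    {"snapshot_date": date(2025, 5, 1), "avg_basket": 300.0},
    {"snapshot_date": date(2025, 6, 1), "avg_basket": 400.0},
]

train, valid = chronological_split(rows, cutoff=date(2025, 6, 1))
mean, std = fit_standardizer([r["avg_basket"] for r in train])
train_scaled = standardize([r["avg_basket"] for r in train], mean, std)
valid_scaled = standardize([r["avg_basket"] for r in valid], mean, std)
print(len(train), len(valid))  # 3 1
```

The validation value is scaled with the training mean and standard deviation, so it does not have to end up centred at zero.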

4 METHODOLOGY

The goal of this study is to predict the number of purchases a user will make in the upcoming month, categorized into three classes: 1, 2–3, and 4+. We investigated multiple modeling strategies to address this problem: direct classification, regression with post-processing, binary classification of the 4+ class, and a hybrid of the latter two.

4.1 Direct Classiﬁcation

Standard supervised classiﬁcation models were

trained directly on the three classes. Models such

as CatBoost, LightGBM, XGBoost, Extra Trees, and

Logistic Regression were evaluated.

4.2 Regression with Post-Processing

As an alternative, the task was reformulated as a re-

gression problem, where models predict a continu-

ous estimate of the expected order count. The pre-

dicted values were then mapped into the predeﬁned

intervals (1, 2–3, 4+) to obtain class labels. This

approach provides a more natural formulation of the

task, since the number of orders is inherently a count

variable. By treating the problem as regression, the

model captures the magnitude of the outcome and

preserves the ordinal structure of the classes, whereas

direct classiﬁcation ignores the ordering among cate-

gories. Our results demonstrate that regression-based

models achieve performance comparable to, and in

some cases superior to, classiﬁcation models in terms

of precision, recall, and F1 score, making regression

a more advantageous approach in this context.
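The interval mapping can be sketched as a simple thresholding function. The cut points used here (1.5 and 3.5, that is, rounding to the nearest count) are an assumption; the text does not state which boundaries were applied to the regression output.

```python
def regression_to_class(predicted_count, mid_cut=1.5, heavy_cut=3.5):
    """Map a continuous order-count estimate to a class label.

    mid_cut and heavy_cut are assumed boundaries, not values from the paper.
    """
    if predicted_count < mid_cut:
        return "1"
    if predicted_count < heavy_cut:
        return "2-3"
    return "4+"


predictions = [0.8, 1.49, 1.5, 2.7, 3.5, 6.2]
mapped = [regression_to_class(p) for p in predictions]
print(mapped)  # ['1', '1', '2-3', '2-3', '4+', '4+']
```

Because the cut points are ordinary parameters, they could also be tuned on a validation set instead of being fixed in advance.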

Table 1: Model performance comparison for order prediction task.

Algorithm        Accuracy   Precision   Recall   F1
XGBoost Reg      0.590      0.608       0.522    0.539
LightGBM Reg     0.590      0.607       0.522    0.539
CatBoost Reg     0.591      0.607       0.520    0.538
Linear Reg       0.580      0.612       0.512    0.527
CatBoost Clf     0.628      0.569       0.519    0.524
LightGBM Clf     0.628      0.570       0.518    0.523
XGBoost Clf      0.628      0.569       0.517    0.523
ExtraTrees Reg   0.573      0.614       0.500    0.514
ExtraTrees Clf   0.618      0.556       0.492    0.490
Logistic Clf     0.604      0.509       0.482    0.445


Table 2: Class-based baseline model performance.

Class                 Precision   Recall
1 (single purchase)   0.70        0.83
2–3 (mid-frequency)   0.46        0.44
4+ (heavy buyers)     0.55        0.29

Table 1 presents the performance comparison of

different regression and classiﬁcation models for the

order prediction task. Overall, classiﬁcation mod-

els achieve slightly higher accuracy than regression-

based models, with CatBoost, LightGBM, and XG-

Boost classiﬁers all reaching an accuracy of 0.628,

compared to around 0.59 for the best regression mod-

els. However, the improvement in accuracy does not

translate into substantial gains in other metrics.

Regression models, particularly Linear Regres-

sion and Extra Trees Regression, demonstrate higher

precision (up to 0.614) compared to classiﬁcation

models. This indicates that regression tends to pro-

duce fewer false positives when mapping predictions

to classes. On the other hand, recall and F1 are close across the two families, with the boosting regressors slightly ahead of the boosting classifiers (F1 of about 0.539 versus 0.524); none of the models surpass 0.54 in F1.

Among classiﬁers, CatBoost, LightGBM, and

XGBoost perform similarly and represent the

strongest baselines. Logistic Regression, while sim-

pler, underperforms especially in F1 score (0.445).

For regression, the three boosting regressors (XGBoost, LightGBM, and CatBoost) are nearly indistinguishable and yield the best overall balance of metrics.

Taken together, the results suggest that while clas-

siﬁcation yields marginally higher accuracy, regres-

sion offers better precision and leverages the ordinal

structure of the problem. However, all baseline mod-

els struggled to reliably capture the 4+ class, which

highlighted the need for a targeted binary formula-

tion. As shown in Table 2, even the best-performing

baseline models exhibit limited recall for this seg-

ment. This observation motivated the design of the

proposed hybrid approach, which combines regres-

sion with binary classiﬁcation to improve robustness

across all classes.
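The averaging scheme behind the aggregate precision and recall is not stated. As a hedged check, the unweighted (macro) mean of the per-class values in Table 2 gives 0.57 precision and 0.52 recall, which is close to the 0.569–0.570 precision and 0.517–0.519 recall reported for the boosting classifiers in Table 1. This is consistent with macro averaging but does not confirm it.

```python
# Per-class values copied from Table 2.
table2 = {
    "1": {"precision": 0.70, "recall": 0.83},
    "2-3": {"precision": 0.46, "recall": 0.44},
    "4+": {"precision": 0.55, "recall": 0.29},
}


def macro_average(per_class, metric):
    """Unweighted mean of a metric over classes (every class counts equally)."""
    values = [scores[metric] for scores in per_class.values()]
    return sum(values) / len(values)


macro_precision = macro_average(table2, "precision")
macro_recall = macro_average(table2, "recall")
print(round(macro_precision, 3), round(macro_recall, 3))  # 0.57 0.52
```

Under macro averaging the weak recall on the 4+ class pulls the aggregate down as much as either of the larger classes would, which is why improving that one class matters for the headline numbers.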

4.3 Binary Classiﬁcation

Since the initial models struggled to correctly identify

users in the 4+ category, we reformulated the task as

a binary classiﬁcation problem: predicting whether a

user will place 4+ orders or not.

Table 3 summarizes the performance of binary classifiers trained to detect whether a user belongs to the 4+ category. Gradient boosting models (LightGBM, XGBoost, and CatBoost) achieve the best overall balance, with accuracies of 0.882, AUC values around 0.85, and F1 scores near 0.45. Although recall remains relatively low (≈0.34), these models reach a precision of about 0.68, indicating that when they predict a user as 4+, the prediction is correct roughly two times out of three.

Table 3: Binary classifier performance for 4+ category prediction.

Model               Accuracy   AUC     Precision   Recall   F1
LightGBM Binary     0.882      0.850   0.683       0.342    0.456
XGBoost Binary      0.882      0.850   0.685       0.342    0.456
CatBoost Binary     0.882      0.849   0.689       0.336    0.451
Logistic Binary     0.875      0.803   0.651       0.291    0.403
ExtraTrees Binary   0.868      0.832   0.843       0.106    0.189

Logistic Regression performs slightly worse, with

reduced AUC and F1. Extra Trees yields the highest

precision (0.843) but suffers from extremely low re-

call (0.106), meaning it correctly identiﬁes very few

of the actual 4+ users.

Overall, boosting-based methods provide the most

reliable trade-off, though the results also highlight the

inherent difﬁculty of predicting the 4+ class due to its

limited representation in the dataset.
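The F1 values in Table 3 follow from the reported precision and recall through the harmonic mean, which shows why low recall caps F1 even when precision is high. The short sketch below recomputes F1 from the rounded table entries; small differences in the third decimal are expected because the inputs are themselves rounded.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from Table 3.
table3 = {
    "LightGBM Binary": (0.683, 0.342, 0.456),
    "XGBoost Binary": (0.685, 0.342, 0.456),
    "CatBoost Binary": (0.689, 0.336, 0.451),
    "Logistic Binary": (0.651, 0.291, 0.403),
    "ExtraTrees Binary": (0.843, 0.106, 0.189),
}

for model, (precision, recall, reported) in table3.items():
    computed = f1_score(precision, recall)
    print(f"{model}: computed F1 = {computed:.3f}, reported = {reported:.3f}")
```

The Extra Trees row illustrates the effect most clearly: a precision of 0.843 combined with a recall of 0.106 yields an F1 below 0.19.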

4.4 Hybrid Method

To further improve performance, we introduced a hy-

brid methodology that combines binary classiﬁcation

with regression. In this setup, a binary classifier (LightGBM Binary) is first used to predict whether a user belongs to the 4+ category. If the prediction is positive, the user is assigned to the 4+ class; if it is negative, a regression model is applied to estimate the expected purchase count, which is then mapped into the 1 or 2–3 categories.
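The two-stage decision logic can be sketched independently of any particular library. In the snippet below, `heavy_probability` and `expected_count` stand in for the fitted binary classifier and regressor; the toy scoring functions, the 0.5 decision threshold, and the 1.5 cut point are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Sequence

Features = Sequence[float]


def hybrid_predict(
    features: Features,
    heavy_probability: Callable[[Features], float],
    expected_count: Callable[[Features], float],
    heavy_threshold: float = 0.5,  # assumed decision threshold
    mid_cut: float = 1.5,          # assumed boundary between "1" and "2-3"
) -> str:
    # Stage 1: the binary model decides whether the user is a heavy buyer.
    if heavy_probability(features) >= heavy_threshold:
        return "4+"
    # Stage 2: otherwise the regressor separates the two remaining classes.
    # Large regression outputs are still capped at "2-3" here.
    return "1" if expected_count(features) < mid_cut else "2-3"


# Toy stand-ins for trained models; features[0] is a hypothetical
# "recent order count" feature.
def toy_heavy_probability(features: Features) -> float:
    return min(1.0, features[0] / 10.0)


def toy_expected_count(features: Features) -> float:
    return 1.0 + 0.5 * features[0]


users = {"light": [0.0], "regular": [2.0], "heavy": [7.0]}
predicted = {
    name: hybrid_predict(x, toy_heavy_probability, toy_expected_count)
    for name, x in users.items()
}
print(predicted)  # {'light': '1', 'regular': '2-3', 'heavy': '4+'}
```

With real models, the two callables would wrap the classifier's positive-class probability and the regressor's prediction. Lowering `heavy_threshold` trades precision for recall on the 4+ class.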

This hybrid framework leverages the strengths of

both approaches: the binary classiﬁer improves de-

tection of the 4+ class, while regression preserves

the ordinal structure of the remaining categories.

Compared to the baseline model (XGB Regression), this method achieved approximately a 2% relative improvement (roughly one percentage point) across the evaluation metrics, as shown in Table 4, demonstrating its practical value.

Furthermore, class-level results in Table 5 show that the hybrid approach increases recall for the 4+ segment from 0.29 to 0.33 relative to the baseline in Table 2, at the same precision (0.55), while maintaining balanced performance on the other classes.

Table 4: Performance comparison of hybrid approach (LightGBM Binary + XGB Regression) against baseline.

Method     Accuracy   Precision   Recall   F1
Baseline   0.628      0.570       0.518    0.523
Hybrid     0.641      0.582       0.529    0.534
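The size of the gain in Table 4 can be read in two ways. The following sketch computes both the absolute difference and the relative change for each metric from the table values; the dictionary keys are illustrative names.

```python
# Values copied from Table 4.
baseline = {"accuracy": 0.628, "precision": 0.570, "recall": 0.518, "f1": 0.523}
hybrid = {"accuracy": 0.641, "precision": 0.582, "recall": 0.529, "f1": 0.534}

relative_gain = {}
for metric, base_value in baseline.items():
    absolute = hybrid[metric] - base_value
    relative_gain[metric] = absolute / base_value
    print(f"{metric}: +{absolute:.3f} absolute, +{relative_gain[metric]:.1%} relative")
```

Each metric rises by 0.011 to 0.013 in absolute terms, which corresponds to a relative gain of about 2.1% in every case.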

Figure 1 presents the two-stage framework that we

developed.


Table 5: Class-based hybrid model performance.

Class                 Precision   Recall
1 (single purchase)   0.72        0.80
2–3 (mid-frequency)   0.48        0.46
4+ (heavy buyers)     0.55        0.33

Figure 1: Two Stage Cascade Model.

5 CONCLUSIONS AND FUTURE WORKS

In this study, we addressed the problem of predicting the number of purchases a user will make in the upcoming month. We explored multiple modeling

strategies, including direct classiﬁcation, regression

with interval mapping, and a hybrid approach com-

bining binary classiﬁcation for the 4+ category with

regression for the remaining classes.

Our results indicate that while classiﬁcation mod-

els achieve slightly higher overall accuracy, regres-

sion captures the ordinal and count-based nature of

the target variable, resulting in better precision and

more meaningful predictions. The proposed hybrid

methodology successfully balances the strengths of

both approaches, improving detection of the 4+ class

while retaining robust performance across all cate-

gories. This demonstrates the effectiveness of com-

bining regression and binary classiﬁcation for class-

imbalanced, ordinal prediction tasks in a real-world

e-commerce context.

For future work, we plan to explore modern

attention-based tabular learning algorithms, such as

TabNet, TabTransformer, and FT-Transformer, to fur-

ther improve prediction accuracy and interpretabil-

ity. These models have shown strong performance

on structured data by leveraging self-attention mech-

anisms to capture complex feature interactions. Addi-

tionally, we aim to investigate temporal and sequen-

tial patterns in user purchasing behavior, incorpo-

rating recurrent or transformer-based architectures to

model dynamics over time. Combining these modern

tabular methods with our hybrid framework may further enhance predictive performance, especially for the 4+ category.

ACKNOWLEDGEMENTS

This project was made possible by the individual contributions of each member of the recommendation team within the Hepsiburada technology group. It would also not have been possible without the support and encouragement for innovation that the technology group management of Hepsiburada gave the data science team.

REFERENCES

Abe, M. (2009). Counting your customers one by one: A hierarchical Bayes extension to the Pareto/NBD model. Marketing Science, 28(3):541–553.

Fader, P. S., Hardie, B. G. S., and Lee, K. L. (2005). Counting your customers the easy way: An alternative to the Pareto/NBD model. Marketing Science, 24(2):275–284.

Grinsztajn, L., Oyallon, E., and Varoquaux, G. (2022).

Why do tree-based models still outperform deep learn-

ing on tabular data? In Advances in Neural In-

formation Processing Systems (NeurIPS). Available:

https://arxiv.org/abs/2207.08815.

Kim, Y., Lee, S., and Jang, H. (2023). Customer

lifetime value prediction using deep learning: A

transformer-based approach. Applied Artiﬁcial Intel-

ligence, 37(2):147–163.

Lathwal, P. and Batra, R. (2024). Attention-based customer lifetime value prediction in e-commerce using FT-Transformer architecture. Amity Journal of Data and Cyber Sciences. Available: https://www.amity.edu/gurugram/jccc/pdf/JDCS_0205.pdf.

Oblander, E. S., Gupta, S., Mela, C. F., Winer, R. S., and Lehmann, D. R. (2020). The past, present, and future of customer management. Marketing Letters, 31(2–3):125–136.

Song, P. and Liu, Y. (2020). An XGBoost algorithm for predicting purchasing behaviour on e-commerce platforms. Tehnički vjesnik, 27(5):1467–1471.

Sun, Y., Cheng, D., Bandyopadhyay, S., and Xue, W.

(2021). Proﬁtable retail customer identiﬁcation based

on a combined prediction strategy of customer life-

time value. Midwest Social Sciences Journal, 24(1).

Verbeke, W., Martens, D., and Baesens, B. (2014). Social

network analysis for customer churn prediction. Ap-

plied Soft Computing, 14:431–446.

Wang, W., Xiong, W., Wang, J., Tao, L., Li, S., Yi, Y., and Zou, X. (2023). A user purchase behavior prediction method based on XGBoost. Electronics, 12(9):2047.

Yang, L., Niu, X., and Wu, J. (2021). RF-LightGBM: A probabilistic ensemble way to predict customer repurchase behaviour in community e-commerce. arXiv preprint arXiv:2109.00724.
