4 CONCLUSIONS
The e-commerce industry has seen rapid growth in
recent years. Customers can access a wide variety of
products and purchase them from the comfort of their
homes. However, this growth has presented numerous
new challenges to businesses, particularly in sales,
which is the most critical aspect of e-commerce. To
gain insights and
engineer features, the study employed exploratory
data analysis (EDA) and the Recency, Frequency,
Monetary (RFM) model. Then, several predictive
models such as Artificial Neural Network, Linear
Regression, Decision Tree, and Random Forest were
used to predict sales. Additionally, the performance of these
different models was measured and compared based
on various metrics. The comparison revealed that the
Random Forest model performed best in terms of
accuracy, followed by the Decision Tree model.
In contrast, both the Artificial Neural Network and
Linear Regression models had relatively lower
accuracy. Certain limitations remain, and they point to
directions for future work. This research contributes
to a better understanding of the dynamics of
e-commerce sales in the UK and offers valuable
insights for businesses and researchers in the field,
which may help improve sales prediction accuracy and
decision-making.
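As an illustration of the RFM feature engineering summarized above, the following is a minimal sketch rather than the study's actual pipeline. The function name `rfm_features`, the tuple layout of each transaction, and the toy data are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import date


def rfm_features(transactions, snapshot_date):
    """Compute Recency, Frequency, and Monetary values per customer.

    transactions: iterable of (customer_id, invoice_id, invoice_date, amount)
    snapshot_date: reference date from which recency is measured
    """
    last_purchase = {}            # most recent purchase date per customer
    invoices = defaultdict(set)   # distinct invoices per customer
    spend = defaultdict(float)    # total amount spent per customer

    for customer_id, invoice_id, invoice_date, amount in transactions:
        previous = last_purchase.get(customer_id)
        if previous is None or invoice_date > previous:
            last_purchase[customer_id] = invoice_date
        invoices[customer_id].add(invoice_id)
        spend[customer_id] += amount

    return {
        cid: {
            "recency_days": (snapshot_date - last_purchase[cid]).days,
            "frequency": len(invoices[cid]),
            "monetary": round(spend[cid], 2),
        }
        for cid in last_purchase
    }


# Hypothetical toy transactions, not taken from the study's dataset.
toy = [
    ("C1", "INV1", date(2024, 11, 1), 20.0),
    ("C1", "INV1", date(2024, 11, 1), 5.5),   # second line of the same invoice
    ("C1", "INV2", date(2024, 12, 1), 10.0),
    ("C2", "INV3", date(2024, 10, 15), 99.9),
]
rfm = rfm_features(toy, snapshot_date=date(2024, 12, 10))
```

Here frequency counts distinct invoices rather than invoice lines, which is one common convention; a different definition may suit other datasets.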
The study has limitations that future research
should address. In feature engineering, not all features
were fully utilized, potentially missing hidden
patterns. Future efforts need to explore more
comprehensive techniques to extract deeper insights.
Generalizability is also a concern, as the models may
not work well on new e-commerce datasets.
Incorporating diverse datasets and cross-validation
can improve applicability. Additionally,
experimenting with the models' hyperparameters and
architectures can enhance e-commerce sales
predictions. Validating on external datasets is crucial
for understanding robustness and effectiveness. By
focusing on these areas, future research can build on
current findings and provide more reliable solutions
for e-commerce businesses.
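The cross-validation suggested above can be sketched as follows. This is an illustrative example under stated assumptions, not the study's code: a one-feature least-squares line stands in for the models compared in the study, the folds are interleaved rather than shuffled, and the helper names (`r2_score`, `fit_line`, `k_fold_r2`) and the toy data are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def k_fold_r2(xs, ys, k=3):
    """Return the held-out R^2 score for each of k interleaved folds."""
    n = len(xs)
    scores = []
    for fold in range(k):
        # Every k-th sample is held out; the rest are used for fitting.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        slope, intercept = fit_line(
            [xs[i] for i in train_idx], [ys[i] for i in train_idx]
        )
        predictions = [slope * xs[i] + intercept for i in test_idx]
        scores.append(r2_score([ys[i] for i in test_idx], predictions))
    return scores


# Hypothetical toy data: a roughly linear relationship with small noise.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [3.1, 4.9, 7.2, 8.8, 11.1, 13.0, 14.8, 17.2, 18.9]
scores = k_fold_r2(xs, ys, k=3)
mean_score = sum(scores) / len(scores)
```

Reporting the per-fold scores alongside their mean gives a rough sense of how stable a model's performance is across different subsets of the data. Validation on a genuinely external dataset would still be needed to assess generalizability.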