ARIMA-LSTM robustly provides better results than
the solo counterparts (Abdulrahman et al., 2020).
This paper focuses on predicting the stock price of
Tencent Holdings, a subject that has been explored in
some studies. Shi and Zhuang compared different soft
computing techniques for prediction of the fusion
defect; ANN predicted the fusion output more
accurately than the other models used in that
research (Shi & Zhuang, 2019). Wang et al. introduced
an Event
Attention Network (EAN) to predict short-term stock
price trends of companies like Tencent using social
media and news data (Wang et al., 2019). Lu et al.
compared the stock prices of two Chinese internet
companies, Tencent and Alibaba, using the Capital
Asset Pricing Model (CAPM), the Dividend Discount
Model (DDM) and the Fama-French Three-Factor Model
(FF3F), and found that Tencent showed a larger
proportion of expected returns (Lu et al., 2021).
Zhou proposed an
LSTM model combined with multidimensional input
and sentiment analysis to improve the predictability
of Tencent's stock price (Zhou, 2021).
This study evaluates a total of 1,203 parameter
combinations across ARIMA, SVR and LSTM to determine
which model has the greatest predictive power for
forecasting Tencent's stock price movements. ARIMA
handles short-term linear trends well, but the
long-term behaviour of the series may be non-linear
over the period studied. SVR helps with short-term
predictions because its kernel trick can capture
non-linear relations. Because LSTM networks are able
to capture long-term dependencies in time series,
they are a good option for long-term forecasts. This
paper performs a systematic comparison of the
predictive power of these models for the price
trajectory. The subsequent sections describe the data
and methods used, present a comprehensive analysis of
model performance, and provide practical implications
for investment decisions. Finally, the study
concludes by comparing all three models to identify
which works best for stock price prediction.
2 DATA AND METHOD
The data used in this study were sourced from
Investing.com, which provided 34,909 data points on
Tencent Holdings Limited at daily frequency since its
first listing on HKEx, from June 17, 2004 to
September 9, 2024. Each entry contains the date, the
closing, opening, high and low prices of the day in
HKD, the traded volume in millions and the range of
fluctuation (%). The closing price is used as the
dependent variable to predict changes in the stock
price on the next trading day, whereas the opening,
high, low and other fields are independent variables
that reflect how this dynamic may evolve. All
computation for this study was performed in an
environment with TensorFlow 2.9.0, Python 3.8,
CUDA 11, 80 GB RAM, an AMD EPYC 7642 CPU and an
RTX 3090 GPU via the cloud computing platform
AutoDL. To
facilitate model training and testing, the dataset was
divided into two parts: the first 3,990 days of data
were used for model training, and the subsequent 997
days were reserved for testing and evaluating
predictive performance. As ARIMA is a univariate
model, only the date and closing price were used for
its training. On the other hand, SVR and LSTM
employed all variables. Additionally, the data were
normalized before training SVR and LSTM models to
ensure efficient training and accurate predictions.
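The split and normalization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not state which normalization was used, so min-max scaling is an assumption here, and the rows are synthetic stand-ins rather than Tencent prices. The helper names (chronological_split, fit_min_max, apply_min_max) are hypothetical. The scaling statistics are computed on the training rows only, so that no information from the test period leaks into training.

```python
import random

N_TRAIN, N_TEST = 3990, 997  # split sizes stated in the text


def chronological_split(rows, n_train=N_TRAIN, n_test=N_TEST):
    """Split time-ordered rows without shuffling: earliest rows train, later rows test."""
    if len(rows) < n_train + n_test:
        raise ValueError("not enough rows for the requested split")
    return rows[:n_train], rows[n_train:n_train + n_test]


def fit_min_max(train_rows):
    """Per-column minimum and maximum, computed on the training rows only."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, mins, maxs):
    """Scale each column with the training-set statistics; constant columns map to 0."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]


# Synthetic stand-in for the real data (hypothetical values, not Tencent prices).
# Each row is [open, high, low, close].
random.seed(0)
rows = []
price = 10.0
for _ in range(N_TRAIN + N_TEST):
    price = max(1.0, price + random.gauss(0.05, 0.5))
    rows.append([price, price + 0.3, price - 0.3, price + random.gauss(0.0, 0.1)])

train, test = chronological_split(rows)
mins, maxs = fit_min_max(train)              # statistics from training data only
train_scaled = apply_min_max(train, mins, maxs)
test_scaled = apply_min_max(test, mins, maxs)
```

Because the test rows are scaled with training-set statistics, scaled test values can fall outside [0, 1] when later prices exceed the training range. That is expected with this scheme.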
This paper analyzes and forecasts Tencent's stock
price using three different forecasting models, namely
ARIMA, SVR, and LSTM. Each model uses a different
method to find its optimal parameters, and the
predictive performance of each model is quantified
with evaluation metrics such as the Coefficient of
Determination (R²), Mean Squared Error (MSE),
Mean Absolute Error (MAE) and Mean Percentage
Absolute Error (MPAE). To select an appropriate
ARIMA model, an Augmented Dickey-Fuller (ADF) test
was first performed on the closing price data to
determine the stationarity of the data and the order
of differencing. The test results indicated that
the closing price data were non-stationary and
required first-order differencing. Subsequently, the
auto_arima
function was used to automatically select the optimal
model order from 147 parameter combinations based
on the Akaike Information Criterion (AIC). The final
optimal ARIMA model obtained is ARIMA(5,1,3),
i.e. p=5, d=1, q=3. For the SVR model, the
hyperparameters were optimized by grid search over
32 different parameter combinations. During tuning,
5-fold cross-validation was used with negative mean
squared error as the scoring criterion. The optimal
combination obtained was penalty parameter (C) = 100,
ε-insensitive loss parameter (epsilon) = 0.01, and
a linear kernel function. The hyperparameters of the
LSTM model were tuned with Keras Tuner, searching
1,024 different parameter combinations. The final
optimal LSTM model consists of two LSTM layers:
the first has 100 neurons and returns sequences,
and the second has 100 neurons and does not
return sequences, both with a dropout rate of 0.2. The
model uses the Adam optimizer with a learning rate of
0.01 and was trained for 500 epochs to achieve
stable predictions. Stable prediction results were