2 DESCRIPTIONS OF MODELS
This study focuses on two models that have attracted
much attention in the field of stock prediction: the
autoregressive integrated moving average (ARIMA)
model, based on time series analysis, and the long
short-term memory (LSTM) network, based on deep
learning. Each has distinct advantages, and the two
models show different application potential and value
in predicting stock market volatility.
The ARIMA model (Autoregressive Integrated
Moving Average model) is a classical method for
predicting and analysing non-stationary time series
data. In stock prediction, the ARIMA model
transforms a non-stationary stock price series into a
stationary series through differencing, and then uses
autoregressive (AR) and moving average (MA) terms
to capture autocorrelation and random error terms in
the data. The dependent variable is usually the stock
price or the rate of return, while the independent
variables include historical price data, their
differences, and lagged terms. The ARIMA model has
a solid theoretical foundation and good short-term
predictive performance, but it may be limited by its
assumptions of stationarity and linearity (Wu & Wen,
2016).
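The differencing step described above can be illustrated with a minimal sketch. The price values and the function names `difference` and `undifference` are assumptions made for illustration only; they do not come from the study's data or code.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def undifference(first_value, diffs):
    """Invert one round of differencing, given the first original value."""
    out = [first_value]
    for step in diffs:
        out.append(out[-1] + step)
    return out


# Hypothetical closing prices with a steady upward drift (not real data).
prices = [100.0, 102.0, 104.5, 106.0, 108.5, 110.0]

# The level trends upward, but the day-to-day changes fluctuate around a
# roughly constant mean, which is what the AR and MA terms then model.
changes = difference(prices, d=1)

# Forecasts made on the differenced scale are mapped back to price levels
# by cumulative summation.
restored = undifference(prices[0], changes)
```

Each round of differencing shortens the series by one observation, which is one reason d is kept as small as possible in practice.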
The LSTM model, or long short-term memory
network, is a special type of recurrent neural network
(RNN). By introducing a gating mechanism (forget
gate, input gate, output gate), it mitigates the
vanishing or exploding gradient problem that
traditional RNNs are prone to when dealing with long
sequences. In stock forecasting, the LSTM model can
capture long-term dependence in stock price time
series and handle the nonlinear characteristics of the
market. The inputs to the model typically include
market indicators such as trading volume, opening
price, and closing price. The output is the stock price
or return forecast at a specified future point in time.
The LSTM model offers significant advantages in
nonlinear modelling and long-term information
memory. However, it is computationally expensive,
and its hyperparameters are difficult to tune
(Peng, 2019).
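The gate mechanism can be made concrete with a minimal sketch of one time step of a single-unit LSTM cell, written in plain Python. The weights, the input values, and the names `lstm_step` and `params` are hypothetical and hand-picked for illustration; a trained network would learn its weights from data, and a practical model would use vectors and a deep learning library.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM cell with scalar input x."""

    def pre(name):
        # Each gate has an input weight w, a recurrent weight u and a bias b.
        w, u, b = params[name]
        return w * x + u * h_prev + b

    f = sigmoid(pre("forget"))        # share of the old cell state to keep
    i = sigmoid(pre("input"))         # share of the candidate value to write
    o = sigmoid(pre("output"))        # share of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value for the cell state
    c = f * c_prev + i * g            # additive cell-state update
    h = o * math.tanh(c)              # hidden state passed to the next step
    return h, c


# Hypothetical weights (w, u, b) chosen to show long-term memory.
params = {
    "forget": (0.0, 0.0, 10.0),    # large bias: forget gate close to 1
    "input": (0.0, 0.0, -10.0),    # very negative bias: input gate close to 0
    "output": (0.5, 0.5, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0                    # start with information in the cell state
for x in [0.3, -0.2, 0.5, 0.1]:    # made-up normalised returns
    h, c = lstm_step(x, h, c, params)
```

With the forget gate near 1 and the input gate near 0, the cell state is carried across time steps almost unchanged. This additive path is what allows information, and gradients during training, to persist over long sequences.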
In summary, the ARIMA and LSTM models each
have distinctive advantages in stock forecasting. The
former is more appropriate for short-term scenarios
with evident linear trends, whereas the latter is better
at capturing long-term, non-linear and intricate
market dynamics. Depending on the characteristics of
the data and the prediction needs, the most suitable
model or combination of models can be chosen in
practice.
3 ARIMA
The ARIMA model is widely used in time series
analysis and has several applications in stock price
prediction. Its full name is the Autoregressive
Integrated Moving Average model. The model is
formally written as ARIMA (p, d, q), where p is the
number of autoregressive terms, d is the degree of
differencing, and q is the number of moving average
terms (Narendra & Eswara, 2015; Zheng et al., 2016).
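For reference, the ARIMA (p, d, q) model can be written in the standard backshift notation. The symbols below are the conventional textbook ones and are not defined in the source: y_t is the observed series, B is the backshift operator (B y_t = y_{t-1}), c is a constant, the phi_i and theta_j are the autoregressive and moving average coefficients, and epsilon_t is a white-noise error term.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t

% Example: ARIMA (3,1,1), written in terms of the first difference
% w_t = y_t - y_{t-1}:
w_t = c + \phi_1 w_{t-1} + \phi_2 w_{t-2} + \phi_3 w_{t-3}
      + \varepsilon_t + \theta_1 \varepsilon_{t-1}
```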
The model is well suited to non-stationary time
series data. Differencing converts the data into a
stationary sequence, which makes predictive analysis
possible. In practice, ARIMA model construction
follows a fixed sequence of steps. The first step is a
stationarity test on the dataset using the Augmented
Dickey-Fuller (ADF) test. If the series is
non-stationary, it is differenced until it becomes
stationary. The model orders p, d, and q are
determined by inspecting plots of the autocorrelation
function (ACF) and partial autocorrelation function
(PACF), or by comparing information criteria such as
AIC and BIC. The model parameters are then
estimated from historical data, and the fit is assessed.
Residual analysis is used to evaluate whether the
model has captured the information in the data. Once
fitted, the ARIMA model can predict stock prices from
past stock price data. For example, an ARIMA model
fitted to the historical closing prices of a company
produces forecasts of future prices, together with
confidence intervals, over a given horizon. These
forecasts may help investors make informed decisions
about investment strategy.
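Part of this workflow can be sketched in plain Python. The sketch below is a simplified illustration, not the procedure used in this study: it computes the sample ACF, takes first differences, fits an ARIMA (1,1,0) model by least squares, and maps the forecasts back to price levels. The ADF test, the PACF, AIC/BIC comparison, moving average terms, and confidence intervals are omitted; in practice a statistics library such as statsmodels would normally handle these. The prices and all function names are hypothetical.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelation of a series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    var = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / var
            for k in range(max_lag + 1)]


def fit_ar1(series):
    """Least-squares estimate of (c, phi) in w_t = c + phi * w_{t-1} + e_t."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    return my - phi * mx, phi


def forecast_prices(prices, steps):
    """ARIMA (1,1,0) forecast: AR(1) on first differences, then re-integrate."""
    diffs = [b - a for a, b in zip(prices, prices[1:])]
    c, phi = fit_ar1(diffs)
    last_price, last_diff = prices[-1], diffs[-1]
    out = []
    for _ in range(steps):
        last_diff = c + phi * last_diff   # forecast the next price change
        last_price += last_diff           # add it back to the price level
        out.append(last_price)
    return out


# Hypothetical daily closing prices (not real data).
prices = [20.0, 20.4, 20.3, 20.9, 21.4, 21.2,
          21.8, 22.5, 22.3, 22.9, 23.6, 23.5]
diffs = [b - a for a, b in zip(prices, prices[1:])]

acf_prices = sample_acf(prices, max_lag=3)  # stays high: trending series
acf_diffs = sample_acf(diffs, max_lag=3)    # much lower after differencing
forecasts = forecast_prices(prices, steps=3)
```

A slowly decaying ACF of the price levels, compared with a much smaller ACF after differencing, is the kind of pattern that motivates choosing d = 1 before the autoregressive and moving average orders are selected.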
The data in Table 1 and Table 2 show that the
ARIMA (3,1,1) model has the lowest P-value, which
is below the preset significance level of 5%. A
comparison of the other statistical indicators also
shows that the ARIMA (3,1,1) model outperforms the
other candidate models; in particular, it has the
largest F-statistic, 9.814915. Therefore,
within the framework of generating short-term
predictions for the Huatai Securities Index, this
research utilized the ARIMA (3,1,1) model as the