objective is to enhance the precision and resilience of
financial market predictions (Wu et al., 2021).
Although a review by Greg Van Houdt et al. reported
that the Vanilla LSTM, the basic form of the LSTM,
performs best on time series tasks (Houdt et al.,
2020), the CNN-LSTM model has been widely adopted
and researched for stock price forecasting and has
achieved a high degree of precision. In the
study conducted by Lu et al., this integrated model
showed superior accuracy and performance compared
to other models including Recurrent Neural Network
(RNN), CNN, Multilayer Perceptron (MLP), LSTM,
and CNN-RNN (Lu et al., 2020). In addition, Can Yang
and colleagues demonstrated that the model achieves
better results when stock indices are ranked by the
Pearson product-moment correlation coefficient
(PPMCC) prior to training (Yang et al., 2020).
Firuz Kamalov et al. used data from ten major U.S.
companies over a ten-year period, and Jimmy Ming-Tai
Wu et al. applied the model to ten stocks in the U.S.
and Taiwan; both studies achieved good forecasting
results (Kamalov et al., 2021; Wu et al., 2021).
Moreover, augmenting the
model by denoising historical stock data through
wavelet transform or integrating the attention
mechanism can further improve its ability to detect
key patterns, thereby increasing its accuracy (Qiu et
al., 2020). In addition, it has been shown that hybrid
models that incorporate both technical and
macroeconomic indicators tend to capture a wider
range of factors affecting stock prices, leading to
better results.
This paper is inspired by Widodo Budiharto's
research, which utilized R programming and LSTM
models to analyse stock price predictions in Indonesia
throughout the COVID-19 period (Widodo et al.,
2021). After the country’s first confirmed COVID-19
case on March 2, 2020, Indonesia's benchmark stock
index fell sharply, by 28%, before the end of the year.
Widodo’s work employed big data provided by
Yahoo Finance, targeting major banks, specifically
Bank Central Asia (BCA) and Bank Mandiri. His
experiments showed that data science and LSTM
models were highly effective at predicting key market
prices, including the opening, highest, lowest, and
closing figures (OHLC), with an accuracy rate of
94.57%. Building on the demonstrated effectiveness
of LSTM models for short-term stock prediction, this
study seeks to apply a similar methodology in a
different context. The primary aim is to predict the
next-day closing prices for 40 stocks in the medical
device sector, using yfinance data from January 1,
2022, to the present (August 2024). The goal is to
gain insight into stock price trends in a
post-pandemic market environment, particularly in an
industry that has been heavily affected by the
pandemic. The rest of the paper is organized as
follows. Sec. 2 covers the data collection process,
stock
selection, preprocessing, and CNN-LSTM model
architecture. Sec. 3 displays the results of model
predictions, incorporating performance metrics and
contrasting them with other models. Sec. 4
summarizes key findings, conclusions, and directions
for further studies.
2 DATA AND METHOD
This study utilizes data from Yahoo Finance,
focusing on 40 companies in the medical device
sector. The records span January 1, 2022, to
August 25, 2024, offering a view of stock market
movements in the aftermath of the COVID-19 crisis.
The yfinance Python module was used to acquire the
data. It provides daily price data, such as open,
high, low, close, and volume, along with other
financial details. Technical
indicators, including Stochastic Oscillator Indicator
(KDJ), Moving Average Convergence Divergence
(MACD), Relative Strength Index (RSI), Bollinger
Bands, and moving averages, were calculated through
the stockstats library.
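As an illustration of what such indicators involve, the following is a minimal sketch that computes a moving average, Bollinger Bands, MACD, and RSI directly with pandas. It is not the stockstats implementation used in this study: the function name `add_indicators`, the output column names, the synthetic price series, and the window lengths (20 days for the bands, 12/26/9 for MACD, 14 for RSI) are assumptions based on commonly used defaults.

```python
import numpy as np
import pandas as pd


def add_indicators(prices: pd.DataFrame) -> pd.DataFrame:
    """Append a few common technical indicators to a frame with a 'close' column."""
    out = prices.copy()
    close = out["close"]

    # Simple moving average and Bollinger Bands (assumed: 20 days, 2 std devs).
    out["sma_20"] = close.rolling(20).mean()
    std_20 = close.rolling(20).std()
    out["boll_upper"] = out["sma_20"] + 2 * std_20
    out["boll_lower"] = out["sma_20"] - 2 * std_20

    # MACD: 12-day EMA minus 26-day EMA, with a 9-day EMA as the signal line.
    ema_12 = close.ewm(span=12, adjust=False).mean()
    ema_26 = close.ewm(span=26, adjust=False).mean()
    out["macd"] = ema_12 - ema_26
    out["macd_signal"] = out["macd"].ewm(span=9, adjust=False).mean()

    # RSI (assumed: 14 days) with Wilder-style smoothing of gains and losses.
    delta = close.diff()
    gain = delta.clip(lower=0).ewm(alpha=1 / 14, adjust=False).mean()
    loss = (-delta.clip(upper=0)).ewm(alpha=1 / 14, adjust=False).mean()
    out["rsi_14"] = 100 - 100 / (1 + gain / loss)
    return out


# Synthetic random-walk closes stand in for downloaded data (illustration only).
rng = np.random.default_rng(0)
dates = pd.bdate_range("2022-01-03", periods=120)
close = 100 + np.cumsum(rng.normal(0, 1, size=len(dates)))
prices = pd.DataFrame({"close": close}, index=dates)
features = add_indicators(prices)
```

The first rows of the rolling indicators are undefined until a full window is available, so such rows would typically be dropped before modelling.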
The dataset comprises both independent variables
and a dependent variable (the stock's closing price).
The independent variables encompass a range of
stock-related data, including opening price, highest
and lowest price, as well as trading volume. In
addition, financial ratios such as market
capitalization, PB ratio, and PS ratio were considered.
Furthermore, the analysis included technical
indicators such as moving averages, RSI, and MACD.
All of these variables were employed in forecasting
the closing price for the following trading day.
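The next-day forecasting setup can be sketched as follows, assuming the model receives a fixed-length window of past days as input. The helper `make_next_day_samples`, the `lookback` length, and the toy arrays are hypothetical and chosen only for illustration; the window length used in this study is not implied by this example.

```python
import numpy as np


def make_next_day_samples(features, close, lookback):
    """Pair each window of `lookback` days with the next day's closing price.

    features: array of shape (n_days, n_features)
    close:    array of shape (n_days,)
    Returns X with shape (n_samples, lookback, n_features) and y with shape
    (n_samples,), where y[i] is the close on the day right after the last
    day contained in X[i].
    """
    features = np.asarray(features, dtype=float)
    close = np.asarray(close, dtype=float)
    n_samples = len(features) - lookback
    if n_samples <= 0:
        raise ValueError("need more than `lookback` days of data")
    # Window i covers days i .. i + lookback - 1; its target is day i + lookback.
    X = np.stack([features[i:i + lookback] for i in range(n_samples)])
    y = close[lookback:]
    return X, y


# Toy data: 10 days and 3 features (hypothetical values).
days = 10
features = np.arange(days * 3, dtype=float).reshape(days, 3)
close = np.arange(100.0, 100.0 + days)
X, y = make_next_day_samples(features, close, lookback=4)
```

Because every target lies strictly after its input window, no information from the predicted day enters the features.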
To facilitate the development of the model, the
dataset was divided into three segments: 70% was
allocated to the training set, 15% was reserved for
validation, and the remaining 15% was used for
testing. The training set was used to fit the model,
the validation set to tune its hyperparameters, and
the test set to assess its overall effectiveness. The
data were normalized using MinMaxScaler to improve
the model's efficacy and prevent discrepancies in
feature scale. By combining stock prices, technical
indicators, and financial ratios, the dataset provides
Forecasting of Share Prices Based on Hybrid Model of CNN and LSTM: A Multi-Factor Approach