Review of the Application of Algorithms in Stock Prediction
Ningkai Wang
a
College of Transportation, Tongji University, Shanghai, China
Keywords: Stock Prediction, Classification Algorithm, Regression Algorithm.
Abstract: This paper discusses the current research status of stock prediction using algorithms. First, the basic
concepts of stocks and the importance of stock prediction are introduced. Then, the application principles
of classification algorithms and regression algorithms in stock prediction are summarized, and the research
on machine learning and deep learning algorithms in predicting stock trends is analyzed. The article points
out that classification algorithms perform outstandingly in predicting the types of stock price fluctuations,
while regression algorithms have significant advantages in stock price prediction. However, the current
research still has some limitations, such as the risk of model overfitting, the difficulty of quantifying market
emergencies, and the neglect of macroeconomic and sentiment factors. Therefore, the article proposes future
research directions, including combining deep learning algorithms with traditional machine learning
algorithms and integrating theories of financial psychology. In sum, this paper comprehensively
reviews the research progress of stock prediction using algorithms and points out the
direction of future research, providing valuable references for investors and researchers.
1 INTRODUCTION
Stocks are ownership certificates that joint-stock
companies issue to their shareholders in order to
raise funds, and they entitle shareholders to
dividends and bonuses. Each share of stock represents a basic
unit of ownership that the shareholder has in the
enterprise. Through the stock market, investors can
invest their funds in different companies and
industries, thereby effectively diversifying
investment risks. With the development of the
economy, stocks have gradually become an important
investment channel. Stock prediction forecasts the
future direction and degree of fluctuation of the
stock market based on how market conditions have
developed. The complexity and uncertainty of the
stock market pose significant challenges for
investors when making decisions. Through stock
prediction, investors can better grasp
opportunities, reduce investment risks, optimize
investment strategies, and maximize returns.
Therefore, accurately predicting stock trends has
become a key issue, and a growing number of
scholars are engaging in this research.
a https://orcid.org/0009-0004-9511-5579
Traditional stock analysis methods include
fundamental analysis, technical analysis, and
quantitative analysis. These methods carry some
model risk, demand highly accurate market
information, and are limited in their scope of
application. With the continuous development of
computer technology, it has become possible to
predict stock trends using computer algorithms.
However, because various unexpected events affect
stock data, the data contain missing and abnormal
values, so appropriate models must be selected to
process them.
This article starts from the basic theory of stocks,
introduces the basic principles of classification and
regression prediction methods, and then reviews,
through the literature, how machine learning and
deep learning methods perform in stock trend
prediction. Finally, it analyzes the limitations of
these two classes of methods and outlines prospects
for future research.
Wang, N.
Review of the Application of Algorithms in Stock Prediction.
DOI: 10.5220/0014324700004718
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 2nd International Conference on Engineering Management, Information Technology and Intelligence (EMITI 2025), pages 191-195
ISBN: 978-989-758-792-4
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.
2 BASIC THEORIES
A stock is a certificate issued by a joint-stock
limited company to prove the shares held by
shareholders. It indicates that the stockholders own
part of the capital of the joint-stock company.
Because stocks carry economic benefits and can be
traded and transferred on the market, they are also
a kind of valuable security. The stock market
generates a large amount of trading data every day.
These data often hide a great deal of useful
information that is not easily discovered
(Wang, 2015).
Stocks have the following basic concepts. The
opening price is the price of the first transaction in
the bidding stage; if there is no transaction, the
previous day's closing price is used as the opening
price. The closing price is the price of the last
transaction in each day's trading. The highest price
and the lowest price are the highest and lowest
transaction prices on that day. A quote is the
highest bid price or the lowest offer price made by a
trader in the securities market for a certain security
within a certain period of time; it represents the
best price that the buyer or the seller is currently
willing to accept. The bid price is the price at which
the buyer is willing to buy a certain security, and
the offer price is the price at which the seller is
willing to sell. By convention, a quotation lists the
bid price first and the offer price second.
Stocks are generally classified into four types:
common stocks, preferred stocks, blue-chip stocks,
and deferred stocks. Common stock refers to shares
that carry ordinary rights in the operation and
management of a company and in the distribution of
its profits and property. It represents a claim on the
company's profits and remaining assets after all
debt repayment obligations and the income and claim
rights of preferred shareholders have been met.
Preferred stock, in contrast to common stock, has
priority over common stock in the rights to profit
dividends and the distribution of remaining assets.
Blue-chip stocks are the stocks of companies with
excellent performance but a relatively slow growth
rate. Deferred stock is at a disadvantage compared
with common stock in the distribution of dividends
or interest and of remaining assets: it generally
shares in what remains only after the distribution to
common stock has been made.
Algorithmic trading reduces reliance on investor
sentiment and on the accumulation of long-term
experience, and is widely applied in the stock
market (Xiao & Wei, 2018). The algorithms for stock
prediction can be roughly divided into two categories:
classification algorithms and regression algorithms.
In stock prediction, the changes in stock prices are
usually classified as "rising", "falling", and
"unchanged". Classification algorithms can provide
prediction results more quickly and are suitable for
real-time decision-making. Commonly used
evaluation indicators include accuracy, precision,
and recall. Stock prediction usually involves time
series data. Through regression algorithms, the
relationship between historical prices and other
related features can be analyzed to predict future
trends. Commonly used evaluation indicators include
MSE, MAE, RMSE, and R2.
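As a minimal sketch of how these evaluation indicators are computed, the following Python functions score class labels with accuracy, precision, and recall, and score numeric price predictions with MSE, MAE, RMSE, and R2. The function names and the sample data are illustrative assumptions, not taken from any of the reviewed studies.

```python
import math

def classification_metrics(y_true, y_pred, positive="rising"):
    """Accuracy, plus precision and recall for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

def regression_metrics(y_true, y_pred):
    """MSE, MAE, RMSE and R2 for predicted prices."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot  # undefined if all true values are equal
    return mse, mae, rmse, r2

# Illustrative data only.
labels_true = ["rising", "falling", "rising", "unchanged"]
labels_pred = ["rising", "rising", "falling", "unchanged"]
print(classification_metrics(labels_true, labels_pred))
print(regression_metrics([10.0, 11.0, 12.0], [10.0, 11.5, 11.5]))
```

Precision and recall are computed here for a single class; with three classes they would normally be reported per class or averaged.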
3 ANALYSIS OF STOCK
PREDICTION BY DIFFERENT
ALGORITHMS
3.1 Classification Algorithms
Wang carried out simulation experiments using
MATLAB. Analysis of the experimental results
demonstrated the feasibility of wavelet neural
networks and support vector machines in stock
prediction. A wavelet neural network combines
wavelet analysis with neural networks; it integrates
the advantages of both and is an effective stock
prediction method (Wang, 2019).
The author treated 168 months of data as a time
series. The training set consisted of the first 106
time points, and the test set of the last 62 time
points. Training was conducted separately with
PSO-WNN and PSO-SVM (PSO: particle swarm
optimization). Finally, the predicted values output
by the trained PSO-WNN and PSO-SVM were compared
with the test set data, and the prediction results of
the two models were presented.
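The split described above is chronological rather than random, which keeps future observations out of the training set. A minimal sketch in Python, assuming a plain list of monthly values (the data here are placeholders and the function name is hypothetical):

```python
def chronological_split(series, n_train):
    """Split a time series without shuffling: earliest points train, latest test."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must be between 1 and len(series) - 1")
    return series[:n_train], series[n_train:]

monthly_values = list(range(168))  # placeholder for 168 monthly observations
train, test = chronological_split(monthly_values, 106)
print(len(train), len(test))
```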
From the results, after training, the PSO-WNN model
achieved an accuracy rate of 59.68% on the Shanghai
Composite Index in 72.3 seconds and 56.45% on the
Shenzhen Composite Index in 151.3 seconds. The
PSO-SVM model achieved 58.06% on the Shanghai
Composite Index in 4.54 seconds and 64.52% on the
Shenzhen Composite Index in 4.97 seconds. It can be
seen that the PSO-SVM model reaches a comparable or
higher accuracy rate with much higher efficiency.
However, the learning rate and training accuracy of
the neural network still need to be optimized.
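A quick arithmetic check shows that the four reported accuracy rates each correspond to a whole number of correct predictions out of the 62 test points. The sketch below assumes that accuracy was computed as correct predictions divided by test-set size, which the source does not state explicitly.

```python
TEST_POINTS = 62  # size of the test set described in the text

reported = {
    ("PSO-WNN", "Shanghai"): 59.68,
    ("PSO-WNN", "Shenzhen"): 56.45,
    ("PSO-SVM", "Shanghai"): 58.06,
    ("PSO-SVM", "Shenzhen"): 64.52,
}

def implied_hits(accuracy_percent, n=TEST_POINTS):
    """Nearest whole number of correct predictions for a reported accuracy."""
    return round(accuracy_percent / 100 * n)

for key, accuracy in reported.items():
    hits = implied_hits(accuracy)
    # Recomputing the percentage from the whole count reproduces the reported figure.
    print(key, hits, round(hits / TEST_POINTS * 100, 2))
```

Under that assumption the figures correspond to 37, 35, 36, and 40 correct predictions, so the gap between the two models amounts to only a few test points.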
Deng incorporated common technical indicators,
together with major international indices, the basis
between futures and spot, interest rates, exchange
rates, and other indicators, into the features of the
CSI 300 Index. On this basis, a random forest model
was established to predict the index. The wavelet
denoising method commonly used in signal processing
was borrowed to filter out short-term disturbances
so that the long-term trend is reflected. The index
trend after wavelet denoising was taken as one of
the features of the random forest (Deng, 2019).
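To illustrate the idea of wavelet denoising, the sketch below applies a one-level Haar transform and soft-thresholds the detail coefficients, which carry the short-term fluctuations. The choice of the Haar wavelet, the single decomposition level, and the threshold value are all assumptions for illustration; the source does not specify which wavelet or thresholding rule was used.

```python
import math

def haar_denoise(signal, threshold):
    """One-level Haar wavelet denoising with soft thresholding of the details."""
    if len(signal) % 2:
        raise ValueError("signal length must be even for this sketch")
    s = math.sqrt(2.0)
    # Approximation = scaled pair sums (trend); detail = scaled pair differences.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Soft thresholding shrinks small (short-term) fluctuations toward zero.
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    # Inverse transform rebuilds the smoothed series.
    out = []
    for a, d in zip(approx, shrunk):
        out.extend([(a + d) / s, (a - d) / s])
    return out

prices = [10.0, 10.4, 10.1, 10.5, 11.0, 10.6]  # illustrative values only
print(haar_denoise(prices, threshold=0.2))
```

With a threshold of zero the series is returned unchanged (up to rounding); with a very large threshold each pair of points is replaced by its mean.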
The author selected trading data for 500 trading
days between April 16, 2015 and December 18, 2018.
The first 100 observations formed the training set,
which was used to predict the rise or fall on the
101st day from the latest market data. The training
set then rolled forward by one day, and this cycle
was repeated until a total of 400 predictions had
been made.
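The rolling scheme above can be sketched as a walk-forward loop: a fixed window of 100 days is used to predict the next day, and the window then advances by one day. In the Python sketch below a trivial majority-vote predictor stands in for the random forest, and the up/down labels are synthetic; both are assumptions made only so that the loop runs.

```python
def walk_forward(n_days, window):
    """Yield (train_indices, test_index) pairs for a rolling one-day-ahead backtest."""
    for start in range(n_days - window):
        yield range(start, start + window), start + window

def majority_direction(labels):
    """Placeholder predictor: the most frequent direction in the window (ties -> up)."""
    ups = sum(labels)
    return 1 if ups * 2 >= len(labels) else 0

# Synthetic up (1) / down (0) labels, not real market data.
directions = [int((day * 7) % 3 != 0) for day in range(500)]

hits = 0
n_predictions = 0
for train_idx, test_idx in walk_forward(len(directions), window=100):
    prediction = majority_direction([directions[i] for i in train_idx])
    hits += prediction == directions[test_idx]
    n_predictions += 1
print(n_predictions, hits / n_predictions)
```

With 500 days and a 100-day window the loop produces exactly 400 predictions, the first of them for the 101st day.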
From the analysis of the results, when backtesting
on the test samples with the filtered index trend
included as a feature, the prediction accuracy rate
is 57%. This indicates that the random forest model
built on indicators such as peripheral stock
indices, futures, interest rates, and exchange rates
has a certain predictive ability for the index.
However, from the perspective of the financial
market, the longer the period, the greater the
probability that various events occur and the
greater the possibility that the market style
changes. That is to say, the longer the training set,
the lower the model's fitting ability to the
in-sample data, and the greater the model variance.
Long and Zhang proposed a fuzzy kernel hypersphere
fast classification algorithm based on the basic
idea of the SVDD (support vector data description)
algorithm. The set of hyperspheres is found through
a merging method, and the classifier is constructed
according to the principle of maximum membership
degree. The method excludes the problem of outliers
overlapping with the hypersphere set and at the same
time avoids complex quadratic programming. It is
characterized by fast classification and highly
accurate classification results (Long & Zhang, 2014).
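The principle of maximum membership degree can be illustrated with a heavily simplified sketch. It is not the fuzzy kernel algorithm of the cited work: it uses one Euclidean hypersphere per class with no kernel and no merging step, and the membership function is a hypothetical choice made for illustration.

```python
import math

def fit_hypersphere(points):
    """Center = mean of the points; radius = largest distance from the center."""
    dim = len(points[0])
    center = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius

def membership(x, sphere):
    """Hypothetical membership: 1 at the center, decaying with distance / radius."""
    center, radius = sphere
    return 1.0 / (1.0 + math.dist(x, center) / max(radius, 1e-12))

def classify(x, spheres):
    """Assign x to the class whose hypersphere gives the largest membership."""
    return max(spheres, key=lambda label: membership(x, spheres[label]))

# Illustrative two-feature samples for two classes.
training = {
    "good": [(0.9, 0.8), (1.0, 1.0), (1.1, 0.9)],
    "poor": [(-1.0, -0.9), (-0.8, -1.1), (-1.2, -1.0)],
}
spheres = {label: fit_hypersphere(pts) for label, pts in training.items()}
print(classify((0.7, 0.6), spheres))
```

Because classification only requires evaluating a membership value per hypersphere, no optimization problem has to be solved at prediction time.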
The authors took the radial basis kernel function as
the experimental kernel function and used the
preprocessed (t=7) 2006 annual report data of 200
companies listed on the Shanghai Stock Exchange as
the training set. The learned hypersphere set was
used to predict and classify the situations in 2007
and 2008, and the classification results were
verified against the annual report data for those
two years. The average accuracy rate is the
arithmetic mean of the accuracy rates of the three
categories over these two years.
From the analysis of the results, the
sub-hypersphere SVM method achieves a higher average
accuracy rate than the classic improved SVDD
algorithm because it adopts measures to correct
misclassification. The algorithm is effective in
optimizing and improving classifier performance.
3.2 Regression Algorithms
Li and Xia studied the application of linear
regression, a supervised machine learning method,
in quantitative trading. Based on the abu
quantitative system, a specific trading environment
was constructed, and the linear regression module of
sklearn was used for simulation analysis. The
prediction results demonstrate the feasibility of
quantitative trading to a certain extent
(Li & Xia, 2023).
The authors work in Anaconda3 and construct a model
on the assumption that the trading volume and
closing price of the day before yesterday and of
yesterday are the factors influencing stock prices.
The model's features are built from the signs of the
price difference, the volume difference, and the
product of the price difference and the volume
difference. Since it is impossible to identify all
the true characteristic factors, some noise features
are also introduced: the price product and the
volume product.
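One possible reading of this feature construction is sketched below in Python. The function and feature names are hypothetical, and the exact encoding used in the cited work may differ; the sketch only shows how sign features and product-based noise features could be derived from two days of closing price and volume.

```python
def sign(x):
    """Return 1, -1 or 0 according to the sign of x."""
    return (x > 0) - (x < 0)

def build_features(day_before_yesterday, yesterday):
    """Each argument is a (close, volume) pair for one trading day."""
    close_prev, vol_prev = day_before_yesterday
    close_last, vol_last = yesterday
    price_diff = close_last - close_prev
    volume_diff = vol_last - vol_prev
    return {
        "price_diff_sign": sign(price_diff),
        "volume_diff_sign": sign(volume_diff),
        "price_volume_sign": sign(price_diff * volume_diff),
        # Noise features: plain products of the raw values from the two days.
        "price_product": close_last * close_prev,
        "volume_product": vol_last * vol_prev,
    }

# Illustrative values: price rose while volume fell.
print(build_features((10.0, 2000), (10.5, 1500)))
```

Rows of such features could then be passed to a linear regression fit, with the next day's price change as the target.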
From the analysis of the results, in most cases, the
predicted fluctuation range is roughly consistent with
the actual fluctuation range of the stock price.
However, in the real market, the factors that can
influence stock price trends are complex. The author's
consideration is rather idealized and there is still
considerable room for optimization.
Ma applied two kinds of supervised learning
algorithms, regression and classification, to
predict the trend of stock prices and achieved good
model results. The work emphasizes the urgency of
scientifically predicting trends in the stock market,
provides investors with more accurate market
information, and offers a useful reference for future
stock market predictions (Ma, 2024).
The author analyzes the overall trend of stock
price data through K-line chart analysis, then uses a
linear regression model to predict stock prices and
calculates indicators such as mean square error, and
finally uses two classification models, logistic
regression and random forest, to predict the rise and
fall of stocks and evaluate their accuracy.
Judging from the results, the R2 value of the linear
regression is close to 1, and random forest performs
outstandingly in the classification task, with an
AUC value of 0.731. It is concluded that linear
regression is relatively accurate in predicting
stock prices, and that random forest is relatively
accurate in predicting whether stock prices rise or
fall.
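For readers unfamiliar with AUC, it can be read as the probability that a randomly chosen rising day receives a higher model score than a randomly chosen falling day. The pairwise computation below is a minimal sketch with made-up labels and scores; it does not reproduce the 0.731 reported in the cited study.

```python
def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(positives) * len(negatives))

rose = [1, 0, 1, 0, 1]                 # 1 = price rose, 0 = price fell
prob_up = [0.9, 0.4, 0.35, 0.2, 0.7]   # hypothetical model scores
print(auc(rose, prob_up))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.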
Shen proposed a decision-tree-based stepwise
regression algorithm, built on the stepwise
regression algorithm and the CART decision tree
algorithm from data mining, and applied it to stock
prediction. Taking the financial indicators in the
annual reports of A-share listed companies as the
object of analysis, the stocks are predicted and
analyzed (Shen, 2017).
The author selected the financial indicators in the
2013, 2014 and 2015 annual reports of 2007 A-share
listed companies as the analysis objects and built
the models in SPSS Modeler, with 70% of the data as
the training set and 30% as the test set. A CART
decision tree and a grade classification prediction
model were established first; the CART decision tree
model was then analyzed and improved; and finally
the stepwise regression algorithm based on the
decision tree and the model of the financial
indicators of listed companies were established
(Tao, 2023; Zhang, 2023).
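As background on how a CART classification tree chooses a split, the sketch below computes the Gini impurity and searches one numeric feature for the threshold that minimizes the weighted impurity of the two child nodes. It assumes the Gini criterion, the usual CART choice for classification; the feature values and grade labels are hypothetical and unrelated to the data of the cited thesis.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_threshold(values, labels):
    """Find the split 'value <= t' with the lowest weighted Gini impurity."""
    best = (None, gini(labels))  # (threshold, impurity); None means no split helps
    candidates = sorted(set(values))
    for low, high in zip(candidates, candidates[1:]):
        t = (low + high) / 2  # midpoint between neighboring distinct values
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

indicator = [2.0, 4.0, 6.0, 12.0, 15.0, 18.0]  # hypothetical financial indicator (%)
grade = ["low", "low", "low", "high", "high", "high"]
print(best_threshold(indicator, grade))
```

A full tree repeats this search over every feature at every node and then recurses into the two children.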
From the results, the correct rate in the test set is
91.05% and the error rate is 8.95%, indicating that the
prediction effect of the CART decision tree algorithm
is relatively good. However, there are numerous
financial indicators. If the author only uses return on
net assets and net assets per share to judge the quality
of stocks, there will be certain errors. Further
optimization is needed to improve the accuracy.
4 CONCLUSIONS
Starting from the basic theory of stocks, this paper
elaborates in detail the core principles of
classification algorithms (such as decision trees,
random forests, support vector machines, etc.) and
regression algorithms (such as linear regression,
support vector regression, etc.), as well as their
specific application scenarios in the prediction of
stock price trends. Through a review of the
literature, it also presents the academic attempts at
stock prediction based on classification algorithms
and regression algorithms.
From the analysis of the results, classification
algorithms perform outstandingly in the prediction of
stock rise and fall types. For example, random forest,
with its ensemble learning characteristics, shows a
relatively high classification accuracy rate in a
complex market environment. Regression algorithms
have significant advantages in stock price prediction.
For example, linear regression models can fit
historical data well. With the continuous development
of machine learning and deep learning technologies,
quantitative investment will further deepen the
integration of technology and applications in the
future, achieving more intelligent, adaptive and
efficient trading decisions, thereby bringing greater
returns to investors and reducing their risks.
However, the current research still has limitations:
the risk of model overfitting, the difficulty in
quantifying market emergencies, and the neglect of
macroeconomic or emotional factors. Therefore, I
believe that future research can start from the
following aspects. First, it can explore the
combination of deep learning (such as LSTM,
Transformer) and traditional machine learning
algorithms. Second, it can combine the theories of
finance and psychology to explore the mechanism by
which investors' psychological factors influence
stock price fluctuations. Third, it can introduce
more multi-dimensional data sources to enhance the
model's adaptability to complex market
environments. Finally, it can explore other feature
extraction methods, such as feature extraction based
on fundamentals and feature extraction based on news
events, to further improve the accuracy and
reliability of predictions.
REFERENCES
Deng, Y. (2019). Based on wavelet denoising and CSI 300
index selection strategy of random forest algorithm
[Master’s thesis, Huazhong University of Science and
Technology].
Li, X., & Xia, H. (2023). Stock regression prediction
research based on machine learning algorithm. Journal
of Information Science and Technology, 21(14), 227–
231.
Long, Z., & Zhang, Z. (2014). Application of fast
classification algorithm based on fuzzy kernel
hypersphere in stock prediction. Applications of
Computer Systems, 23(01), 197–201+148.
Ma, J. (2024). Regression and classification of the stock
price trend forecast research. Computer Knowledge and
Technology, 20(12), 12–14+23.
Shen, J. (2017). Stepwise regression algorithm based on
decision tree and its application in stock prediction
[Master’s thesis, Guangdong University of
Technology].
Tao, Y. (2023). Research on algorithmic trading strategies
under deep learning [Master’s thesis, North China
University of Water Resources and Electric Power].
Wang, Q. (2015). The application of improved support
vector machine technology in short-term stock price
prediction [Master’s thesis, Chongqing Jiaotong
University].
Wang, Z. (2019). Stock prediction and optimization based
on wavelet neural network and support vector machine
[Master’s thesis, Anqing Normal University].
Xiao, Z., & Wei, Z. (2018). Application of random forest in
stock trend prediction. China Management
Informatization, 21(03), 120–123.
Zhang, B. (2023). Research on improved LSTM stock price
prediction algorithm based on stock comment
sentiment analysis and feature engineering [Master’s
thesis, University of Electronic Science and
Technology].