Review of the Application of Algorithms in Stock Prediction

Ningkai Wang

College of Transportation, Tongji University, Shanghai, China

Keywords: Stock Prediction, Classification Algorithm, Regression Algorithm.

Abstract: This paper reviews the current state of research on stock prediction using algorithms. First, the basic concepts of stocks and the importance of stock prediction are introduced. Then, the application principles of classification algorithms and regression algorithms in stock prediction are summarized, and research on machine learning and deep learning algorithms for predicting stock trends is analyzed. The review finds that classification algorithms perform well in predicting the direction of stock price movements, while regression algorithms have advantages in predicting stock prices themselves. However, current research still has limitations, such as the risk of model overfitting, the difficulty of quantifying sudden market events, and the neglect of macroeconomic and sentiment factors. The paper therefore proposes future research directions, including combining deep learning algorithms with traditional machine learning algorithms and integrating theories of financial psychology. Overall, the paper reviews research progress in algorithmic stock prediction and points out directions for future work, providing a reference for investors and researchers.

1 INTRODUCTION

Stocks are ownership certificates issued by joint-stock companies. They are securities that a company issues to its shareholders in order to raise funds, and they entitle shareholders to dividends and bonuses. Each share represents a basic unit of ownership in the enterprise. Through the stock market, investors can spread their funds across different companies and industries, thereby diversifying investment risk. With economic development, stocks have gradually become an important investment channel. Stock prediction forecasts the future direction and degree of fluctuation of the stock market based on how market conditions have developed. The complexity and uncertainty of the stock market pose significant challenges for investors making decisions. Through stock prediction, investors can better grasp opportunities, reduce investment risk, optimize investment strategies, and maximize returns. How to predict stock trends accurately has therefore become a key issue that attracts a growing number of scholars.

Author ORCID: https://orcid.org/0009-0004-9511-5579

Traditional stock analysis methods include fundamental analysis, technical analysis, and quantitative analysis. These methods carry some model risk, demand a high degree of accuracy in the trading market, and are limited in their scope of application. With the continuous development of computer technology, it has become possible to predict stock trends using computer algorithms. However, because stock data are affected by various unexpected events, they contain missing and abnormal values, so appropriate models must be selected to process the data.

This article starts from the basic theory of stocks and introduces the basic principles of classification and regression prediction methods. It then reviews, through the literature, how machine learning and deep learning methods perform in stock trend prediction. Finally, it analyzes the limitations of using these two kinds of methods to analyze stock trends and outlines prospects for future research.

Wang, N.

Review of the Application of Algorithms in Stock Prediction.

DOI: 10.5220/0014324700004718

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 2nd International Conference on Engineering Management, Information Technology and Intelligence (EMITI 2025), pages 191-195

ISBN: 978-989-758-792-4

2 BASIC THEORIES

A stock is a certificate issued by a joint-stock limited company to prove the shares held by its shareholders. It indicates that the stockholder owns part of the capital of the company. Because stocks carry economic benefits and can be traded and transferred on the market, they are also a kind of negotiable security. The stock market generates a large amount of trading data every day. These data often hide a great deal of useful information that is not easily discovered (Wang, 2015).

Stocks involve the following basic concepts. The opening price is the price of the first transaction in the bidding stage; if there is no transaction, the previous day's closing price is used as the opening price. The closing price is the price of the last transaction in each day's trading. The highest price and the lowest price are, respectively, the highest and the lowest transaction prices on that day. A quote is the highest purchase price or the lowest offer made by a trader in the securities market for a certain security within a certain period of time. In other words, it represents the highest price a buyer is willing to pay and the lowest price a seller is willing to accept. The purchase price is the price at which the buyer is willing to buy a certain security, and the offer is the price at which the seller is willing to sell. By convention, a quotation lists the purchase price first and the offer price second.

Stocks are generally classified into four types: common stocks, preferred stocks, blue-chip stocks, and deferred stocks. Common stock refers to shares that enjoy ordinary rights in the operation and management, profits, and property distribution of a company. It represents a claim on the company's profits and remaining assets after all debt repayments and the income and claim rights of preferred shareholders have been satisfied. Preferred stock, in contrast to common stock, has priority over common stock in the rights to profit dividends and the distribution of remaining assets. Blue-chip stocks are the stocks of companies with excellent performance but a relatively slow growth rate. Deferred stock ranks behind common stock in the distribution of dividends or interest and of remaining assets. In general, it shares in whatever benefits remain after the distribution to common stock.

Algorithmic trading reduces the influence of investor sentiment and the reliance on accumulated long-term experience, and is widely applied in the stock market (Xiao & Wei, 2018). The algorithms for stock prediction can be roughly divided into two categories: classification algorithms and regression algorithms. In classification, changes in stock prices are usually labeled as "rising", "falling", or "unchanged". Classification algorithms can provide prediction results more quickly and are suitable for real-time decision-making. Commonly used evaluation indicators include accuracy, precision, and recall. Stock prediction usually involves time series data. Regression algorithms analyze the relationship between historical prices and other related features to predict future prices. Commonly used evaluation indicators include MSE, MAE, RMSE, and R².
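As a minimal illustration of the evaluation indicators named above, the following sketch computes MSE, MAE, RMSE, and R² for numeric price predictions, and accuracy, precision, and recall for "rising"/"falling"/"unchanged" labels. The function names and the toy data are assumptions made for this example and do not come from any of the reviewed studies.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R^2 for two equal-length numeric sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot  # undefined when y_true is constant
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse), "r2": r2}


def classification_metrics(y_true, y_pred, positive="rising"):
    """Accuracy over all classes; precision and recall for one chosen class."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical toy data, for illustration only.
prices_true = [10.0, 11.0, 12.0, 13.0]
prices_pred = [10.5, 10.5, 12.5, 12.5]
reg = regression_metrics(prices_true, prices_pred)

moves_true = ["rising", "falling", "rising", "unchanged"]
moves_pred = ["rising", "rising", "falling", "unchanged"]
cls = classification_metrics(moves_true, moves_pred)
```

The error-based indicators apply to continuous price forecasts, while the count-based indicators apply to discrete movement labels, which is why the two families of algorithms are evaluated differently.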

3 ANALYSIS OF STOCK

PREDICTION BY DIFFERENT

ALGORITHMS

3.1 Classification Algorithms

Wang carried out simulation experiments using MATLAB. Analysis of the experimental results demonstrated the feasibility of wavelet neural networks and support vector machines in stock prediction. A wavelet neural network combines wavelet theory with neural networks. It integrates the advantages of both and is an effective stock prediction method (Wang, 2019).

The author treated 168 months of data as a time series. The training set consisted of the first 106 time points, and the test set of the last 62 time points. PSO-WNN and PSO-SVM models were trained separately. Finally, the predicted values output by the trained PSO-WNN and PSO-SVM models were fitted against the test set data, and the prediction results of the two models were presented.
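The chronological split described above can be sketched as follows. This is only an illustration of splitting a time-ordered series without shuffling. The function name and the placeholder values are assumptions, and the original experiments were run in MATLAB rather than Python.

```python
def chronological_split(series, n_train):
    """Split a time-ordered sequence: the first n_train points train, the rest test."""
    if not 0 < n_train < len(series):
        raise ValueError("n_train must be between 1 and len(series) - 1")
    return series[:n_train], series[n_train:]


# Placeholder values standing in for 168 monthly observations (not real data).
monthly_values = list(range(168))
train, test = chronological_split(monthly_values, n_train=106)
```

Keeping the order intact ensures that every test point lies later in time than every training point, so the model is never evaluated on data that precedes its training period.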

According to the results, the PSO-WNN model achieved an accuracy of 59.68% on the Shanghai Composite Index with a run time of 72.3 seconds, and 56.45% on the Shenzhen Composite Index with a run time of 151.3 seconds. The PSO-SVM model achieved 58.06% on the Shanghai Composite Index in 4.54 seconds, and 64.52% on the Shenzhen Composite Index in 4.97 seconds. These figures suggest that the PSO-SVM model offers comparable or higher prediction accuracy with far shorter run times. However, the learning rate and training accuracy of the neural network still need optimization.
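As a quick arithmetic check, the reported accuracy rates can be converted back into counts of correct predictions. The sketch below assumes that each rate was computed over the 62 test points. The source does not state this explicitly, so the implied counts are an inference rather than reported results.

```python
TEST_POINTS = 62  # assumed denominator: size of the test set described in the text

# Accuracy rates (percent) as reported in the text.
reported = {
    ("PSO-WNN", "Shanghai"): 59.68,
    ("PSO-WNN", "Shenzhen"): 56.45,
    ("PSO-SVM", "Shanghai"): 58.06,
    ("PSO-SVM", "Shenzhen"): 64.52,
}


def implied_correct_count(rate_percent, n=TEST_POINTS):
    """Nearest whole number of correct predictions implied by a percentage."""
    return round(rate_percent / 100 * n)


implied = {key: implied_correct_count(rate) for key, rate in reported.items()}

# Recompute the percentage from each implied count to confirm consistency.
recomputed = {key: round(100 * count / TEST_POINTS, 2) for key, count in implied.items()}
```

Under this assumption, the gap between the two models on either index amounts to only a handful of test points, which is worth bearing in mind when comparing their accuracy.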

Deng incorporated common technical indicators, together with major international indices, the basis between futures and spot, interest rates, exchange rates, and other indicators, into the feature set for the CSI 300 Index. On this basis, a random forest model was established to predict the index. The wavelet denoising method commonly used in signal processing was borrowed to filter out short-term disturbances so that the long-term trend is reflected. The index trend after wavelet denoising was taken as one of the features of the random forest (Deng, 2019).
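To make the idea of wavelet denoising concrete, here is a minimal sketch using a one-level Haar transform with soft thresholding of the detail coefficients. The source does not specify the wavelet family, the decomposition level, or the thresholding rule, so all three choices, along with the sample values, are assumptions for illustration only.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level Haar transform of an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`, zeroing small ones."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, threshold):
    """Suppress short-term fluctuations by thresholding the detail coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


# Hypothetical index levels, for illustration only.
noisy_index = [100.0, 102.0, 101.0, 105.0, 104.0, 103.0]
smoothed = denoise(noisy_index, threshold=10.0)
unchanged = denoise(noisy_index, threshold=0.0)
```

With a large threshold, every detail coefficient is removed and each pair of neighbouring values collapses to its mean. With a threshold of zero, the original series is recovered. A practical study would more likely use a multi-level decomposition from a dedicated wavelet library.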

The author selected trading data for a total of 500 trading days from April 16, 2015, to December 18, 2018. The first 100 records formed the training set, which was used to predict the rise or fall on the 101st day based on the latest market data. The training set then rolled forward by one day, and this cycle was repeated for a total of 400 predictions.
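The rolling scheme described above (train on 100 days, predict the next day, roll forward by one day) can be sketched as below. To keep the example self-contained, a trivial majority-vote predictor stands in for the random forest, and the up/down labels are synthetic. Both are assumptions and do not reproduce the original model or data.

```python
from collections import Counter


def majority_label(train_labels):
    """Stand-in model: predict the most common label in the training window."""
    return Counter(train_labels).most_common(1)[0][0]


def walk_forward(labels, window, fit_predict=majority_label):
    """Train on `window` consecutive days, predict the next day, then roll by one day."""
    predictions = []
    for start in range(len(labels) - window):
        train_window = labels[start:start + window]
        predictions.append(fit_predict(train_window))
    actual = labels[window:]
    hits = sum(p == a for p, a in zip(predictions, actual))
    return predictions, hits / len(predictions)


# Synthetic up/down labels for 500 trading days (not real market data).
labels = ["up" if (day * 7) % 5 < 3 else "down" for day in range(500)]
predictions, accuracy = walk_forward(labels, window=100)
```

With 500 days and a 100-day window, the loop produces 400 one-step-ahead predictions, matching the count given in the text. A real experiment would replace `majority_label` with a function that fits a classifier on the window's features.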

According to the results, when backtesting on the test samples and taking the filtered index trend into account, the prediction accuracy is 57%. This indicates that a random forest model built on indicators such as international stock indices, futures, interest rates, and exchange rates has some predictive ability for the index. However, from the perspective of the financial market, the longer the time span, the greater the probability that various events occur and the greater the possibility of a change in market style. In other words, the longer the training set, the lower the model's ability to fit the in-sample data and the greater the model variance.

Long and Zhang proposed a fuzzy kernel hypersphere fast classification algorithm based on the basic idea of the SVDD algorithm. The hypersphere set is found through a merging method, and the classifier is constructed according to the principle of maximum membership degree. The method eliminates the problems of outliers and overlapping hyperspheres, and at the same time avoids complex quadratic programming. It is characterized by fast classification and high classification accuracy (Long & Zhang, 2014).

The authors used the radial basis kernel function as the experimental kernel function and adopted the preprocessed (t=7) 2006 annual report data of 200 companies listed on the Shanghai Stock Exchange as the training set. The learned hypersphere set was used to predict and classify the situations in 2007 and 2008, and the classification results were checked for accuracy against the annual report data for those two years. The average accuracy is the arithmetic mean of the accuracies of the three categories over the two years.

According to the results, the sub-hypersphere SVM method has a higher average accuracy than the classic improved SVDD algorithm, because it adopts improvement measures to correct misclassification. The algorithm is effective in optimizing the classifier and improving its performance.

3.2 Regression Algorithms

Li and Xia studied the application of linear regression, a supervised machine learning method, in quantitative trading. Based on the abu quantitative system, a specific trading environment was constructed, and the linear regression module of sklearn was used for simulation analysis. The prediction results demonstrate the feasibility of quantitative trading to a certain extent (Li & Xia, 2023).

The authors worked in Anaconda3 and constructed a model on the assumption that the trading volume and closing price of the previous two days (yesterday and the day before yesterday) are the factors influencing the stock price. The model's features are built from the signs of the price difference, the volume difference, and the product of the price difference and the volume difference. Since it is impossible to identify all the true influencing factors, some noise features are also introduced: the price product and the volume product.
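One possible reading of this feature construction is sketched below. The exact definitions used by Li and Xia are not given in the text, so the feature names, the use of the sign of each difference, and the toy data are all assumptions for illustration.

```python
def sign(x):
    """Return 1, 0 or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def build_features(close_prev2, volume_prev2, close_prev1, volume_prev1):
    """Features for one day from the two preceding days' close and volume."""
    price_diff = close_prev1 - close_prev2
    volume_diff = volume_prev1 - volume_prev2
    return {
        "price_diff_sign": sign(price_diff),
        "volume_diff_sign": sign(volume_diff),
        "diff_product_sign": sign(price_diff * volume_diff),
        "price_product": close_prev1 * close_prev2,     # noise feature
        "volume_product": volume_prev1 * volume_prev2,  # noise feature
    }


def feature_rows(closes, volumes):
    """One feature row per day t >= 2, built only from days t-2 and t-1."""
    return [
        build_features(closes[t - 2], volumes[t - 2], closes[t - 1], volumes[t - 1])
        for t in range(2, len(closes))
    ]


# Hypothetical closing prices and volumes, for illustration only.
closes = [10.0, 10.5, 10.2, 10.8]
volumes = [1000, 1200, 900, 1500]
rows = feature_rows(closes, volumes)
```

Because each row uses only information available before day t, the rows could be passed to a regression model without leaking the value being predicted.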

According to the results, in most cases the predicted fluctuation range is roughly consistent with the actual fluctuation range of the stock price. In the real market, however, the factors that influence stock price trends are complex. The authors' assumptions are rather idealized, and there is still considerable room for optimization.

Ma applied two kinds of supervised learning algorithms, regression and classification, to predict the trend of stock prices and obtained good model results. The study emphasizes the urgency of scientifically predicting trends in the stock market, provides investors with more accurate market information, and offers a useful reference for future stock market predictions (Ma, 2024).

The author analyzed the overall trend of the stock price data through K-line chart analysis, then used a linear regression model to predict stock prices and calculated indicators such as the mean squared error, and finally used two classification models, logistic regression and random forest, to predict the rise and fall of stocks and evaluate their accuracy.

According to the results, the R² value is close to 1, and random forest performs well in the classification task, with an AUC of 0.731. It is concluded that linear regression is relatively accurate in predicting stock prices, and that random forest is relatively accurate in predicting whether stock prices rise or fall.
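For readers unfamiliar with AUC, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The labels and scores are hypothetical and unrelated to Ma's data.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = price rose, 0 = price fell; scores are predicted probabilities.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.5]
auc = roc_auc(labels, scores)

# A classifier that ranks every rise above every fall reaches the maximum of 1.0.
perfect_auc = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for interpreting a reported value.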

Shen proposed a stepwise regression algorithm based on decision trees, building on the stepwise regression algorithm and the CART decision tree algorithm from data mining, and applied it to stock prediction. Taking the financial indicators in the annual reports of A-share listed companies as the object of analysis, the stocks were predicted and analyzed (Shen, 2017).

The author selected the financial indicators in the 2013, 2014, and 2015 annual reports of 2,007 A-share listed companies as the objects of analysis and built the relevant models in SPSS Modeler, with 70% of the data as the training set and 30% as the test set. A CART decision tree and a grade classification prediction model were established, the CART decision tree model was analyzed and improved, and finally the stepwise regression algorithm based on the decision tree and the model of listed companies' financial indicators were established (Tao, 2023; Zhang, 2023).

According to the results, the accuracy on the test set is 91.05% and the error rate is 8.95%, indicating that the CART decision tree algorithm predicts relatively well. However, there are numerous financial indicators. Judging the quality of stocks only by return on net assets and net assets per share will introduce certain errors, and further optimization is needed to improve accuracy.
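The core step of a CART classification tree is choosing the split that minimizes the weighted Gini impurity of the two child nodes. The sketch below illustrates this for a single numeric indicator. The indicator values, the grade labels, and the function names are hypothetical, and the settings actually used in SPSS Modeler are not described in the text.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best split 'value <= threshold'."""
    best = (None, gini(labels))  # no split unless it lowers the impurity
    ordered = sorted(set(values))
    for lo, hi in zip(ordered, ordered[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical return-on-net-assets values (%) and stock quality grades.
roe = [2.0, 4.0, 6.0, 12.0, 15.0, 18.0]
grade = ["poor", "poor", "poor", "good", "good", "good"]
threshold, impurity = best_threshold(roe, grade)
```

A full tree repeats this search over every indicator at every node, which shows why restricting the model to two indicators limits what it can learn.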

4 CONCLUSIONS

Starting from the basic theory of stocks, this paper describes the core principles of classification algorithms (such as decision trees, random forests, and support vector machines) and regression algorithms (such as linear regression and support vector regression), as well as their specific application scenarios in predicting stock price trends. Through a review of the literature, it also presents academic attempts at stock prediction based on classification algorithms and regression algorithms.

According to the reviewed results, classification algorithms perform well in predicting whether stocks rise or fall. For example, random forest, with its ensemble learning characteristics, shows relatively high classification accuracy in a complex market environment. Regression algorithms have advantages in stock price prediction. For example, linear regression models can fit historical data well. With the continuous development of machine learning and deep learning technologies, quantitative investment will further deepen the integration of technology and applications, enabling more intelligent, adaptive, and efficient trading decisions and thereby bringing higher returns and lower risks to investors.

However, current research still has limitations: the risk of model overfitting, the difficulty of quantifying sudden market events, and the neglect of macroeconomic and sentiment factors. Future research could therefore proceed in the following directions: exploring the combination of deep learning (such as LSTM and Transformer) with traditional machine learning algorithms; combining theories of finance and psychology to explore how investors' psychological factors influence stock price fluctuations; introducing more multi-dimensional data sources to enhance the model's adaptability to complex market environments; and exploring other feature extraction methods, such as those based on fundamentals and on news events, to further improve the accuracy and reliability of predictions.

REFERENCES

Deng, Y. (2019). Based on wavelet denoising and CSI 300 index selection strategy of random forest algorithm [Master’s thesis, Huazhong University of Science and Technology].

Li, X., & Xia, H. (2023). Stock regression prediction

research based on machine learning algorithm. Journal

of Information Science and Technology, 21(14), 227–

231.

Long, Z., & Zhang, Z. (2014). Application of fast

classification algorithm based on fuzzy kernel

hypersphere in stock prediction. Applications of

Computer Systems, 23(01), 197–201+148.

Ma, J. (2024). Regression and classification of the stock

price trend forecast research. Computer Knowledge and

Technology, 20(12), 12–14+23.

Shen, J. (2017). Stepwise regression algorithm based on

decision tree and its application in stock prediction

[Master’s thesis, Guangdong University of

Technology].

Tao, Y. (2023). Research on algorithmic trading strategies

under deep learning [Master’s thesis, North China

University of Water Resources and Electric Power].

Wang, Q. (2015). The application of improved support

vector machine technology in short-term stock price

prediction [Master’s thesis, Chongqing Jiaotong

University].

Wang, Z. (2019). Stock prediction and optimization based

on wavelet neural network and support vector machine

[Master’s thesis, Anqing Normal University].

Xiao, Z., & Wei, Z. (2018). Application of random forest in

stock trend prediction. China Management

Informatization, 21(03), 120–123.

Zhang, B. (2023). Research on improved LSTM stock price

prediction algorithm based on stock comment

sentiment analysis and feature engineering [Master’s

thesis, University of Electronic Science and

Technology].
