prediction and also has high efficiency. However,
optimization is still needed in terms of the learning
rate and training accuracy of the neural network.
Deng incorporated common technical indicators, as well
as major international indices, the basis between
futures and spot, interest rates, exchange rates, and
other indicators, into the feature set for the CSI 300
Index. On this basis, a random forest model was
established to predict the index. The wavelet denoising
method commonly used in signal processing was borrowed
to filter out short-term disturbances so that the
long-term trend is reflected. The index trend after
wavelet denoising was taken as one of the features of
the random forest (Deng, 2019).
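The denoising step can be illustrated with a minimal
sketch. The source does not state which wavelet,
decomposition depth, or threshold Deng used, so the Haar
wavelet, the single decomposition level, and the
function name haar_denoise below are all assumptions
made for illustration only.

```python
import math


def haar_denoise(series, threshold):
    """One-level Haar wavelet denoising with soft thresholding.

    Assumptions (not taken from the cited study): Haar wavelet,
    a single decomposition level, and a caller-chosen threshold.
    `series` must have even length.
    """
    if len(series) % 2 != 0:
        raise ValueError("series length must be even")
    s = math.sqrt(2.0)
    # Forward transform: pairwise sums (approximation) and differences (detail).
    approx = [(series[i] + series[i + 1]) / s for i in range(0, len(series), 2)]
    detail = [(series[i] - series[i + 1]) / s for i in range(0, len(series), 2)]
    # Soft-threshold the detail coefficients to suppress short-term disturbances.
    shrunk = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    # Inverse transform back to the time domain.
    out = []
    for a, d in zip(approx, shrunk):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out
```

With a threshold of zero the series is reconstructed
unchanged; with a very large threshold each pair of
adjacent values collapses to its mean. The smoothed
series could then be appended to the feature matrix as
one more column.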
The author selected 500 trading days of data between
April 16, 2015, and December 18, 2018. The first 100
observations formed the initial training set, from
which the rise or fall on the 101st day was predicted
based on the latest market data. The training set then
rolled forward by one day, and this cycle repeated
until a total of 400 predictions had been made.
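The rolling scheme described above can be sketched as a
walk-forward loop. The fit and predict callables are
placeholders for whatever classifier is used (a random
forest in the cited study); the majority-vote stand-in
and all function names here are hypothetical.

```python
def walk_forward(features, labels, fit, predict, window=100):
    """Rolling one-step-ahead evaluation.

    fit(X, y) returns a model and predict(model, x) returns a label;
    both are caller-supplied. It is assumed that features[t] holds only
    information available before the outcome of day t is known.
    """
    hits = 0
    total = 0
    for t in range(window, len(features)):
        # Train on the most recent `window` days, then predict day t.
        model = fit(features[t - window:t], labels[t - window:t])
        if predict(model, features[t]) == labels[t]:
            hits += 1
        total += 1
    return hits, total


# Toy stand-in classifier: predict the majority label seen in the window.
def fit_majority(X, y):
    return 1 if sum(y) * 2 >= len(y) else 0


def predict_majority(model, x):
    return model
```

With 500 days of data and a window of 100, the loop runs
from day 101 to day 500 and therefore makes 400
predictions, matching the count reported above.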
In backtesting on the test samples, with the filtered
index trend included as a feature, the prediction
accuracy was 57%. This indicates that the random forest
model built on indicators such as international stock
indices, futures, interest rates, and exchange rates
has some predictive ability for the index.
However, from the perspective of the financial market,
the longer the time span, the greater the probability
that various events occur and the greater the
possibility of a shift in market style. In other words,
the longer the training set, the weaker the model's
ability to fit the in-sample data and the greater the
model variance.
Long et al. proposed a fast fuzzy kernel hypersphere
classification algorithm based on the basic idea of the
SVDD algorithm. The hypersphere set is found through a
merging method, and the classifier is constructed
according to the principle of maximum membership
degree. The problem of overlap between outliers and the
hypersphere set is eliminated, and complex quadratic
programming is avoided at the same time. The algorithm
is characterized by fast classification and high
accuracy (Long & Zhang, 2014).
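The maximum-membership principle can be sketched as
follows. The source does not give the membership
function or the merging procedure, so the membership
formula, the representation of each hypersphere as a
(center, radius, label) tuple with a single input point
as center, and all function names are assumptions made
for illustration only.

```python
import math


def rbf(x, c, gamma):
    # Radial basis kernel K(x, c) = exp(-gamma * ||x - c||^2).
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, c)))


def kernel_distance(x, c, gamma):
    # Feature-space distance under the RBF kernel, using K(x, x) = K(c, c) = 1.
    return math.sqrt(max(2.0 - 2.0 * rbf(x, c, gamma), 0.0))


def membership(x, sphere, gamma):
    # Hypothetical membership: 1 at the center, 0.5 on the sphere surface,
    # and decreasing as the kernel distance grows.
    center, radius, _label = sphere
    return 1.0 / (1.0 + (kernel_distance(x, center, gamma) / radius) ** 2)


def classify(x, spheres, gamma=1.0):
    # Assign x to the label of the hypersphere with the largest membership.
    best = max(spheres, key=lambda s: membership(x, s, gamma))
    return best[2]
```

Because classification only requires evaluating one
kernel distance per hypersphere, no quadratic program
has to be solved at prediction time, which is consistent
with the speed advantage described above.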
The author used the radial basis function as the
experimental kernel and adopted the preprocessed (t=7)
2006 annual report data of 200 companies listed on the
Shanghai Stock Exchange as the training set. The
learned hypersphere set was used to predict and
classify the situations in 2007 and 2008, and the
classification results were verified against the annual
report data for those two years. The average accuracy
rate is the arithmetic mean of the accuracy rates of
the three categories over the two years.
According to the results, the sub-hypersphere SVM
method achieves a higher average accuracy than the
classic improved SVDD algorithm because it adopts
measures to correct misclassification. The algorithm
thus improves the effectiveness and performance of the
classifier.
3.2 Regression Algorithm
Li et al. studied the application of linear regression,
a supervised machine learning method, in quantitative
trading. Based on the abu quantitative system, a
specific trading environment was constructed, and the
linear regression module of sklearn was used for
simulation analysis. The prediction results
demonstrate, to a certain extent, the feasibility of
quantitative trading (Li & Xia, 2023).
The author worked in Anaconda3 and built a model on the
assumption that the trading volume and closing price of
the previous two days (yesterday and the day before
yesterday) are the factors influencing the stock price.
The model's features are the signs of the price
difference, of the volume difference, and of the
product of the two differences. Since it is impossible
to identify all the true influencing factors, some
noise features were also introduced: the price product
and the volume product.
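The feature construction described above might look
like the following sketch. The exact definitions are
not given in the source, so the reading used here (signs
of the day-over-day differences, plus raw products as
noise features) and all names are assumptions.

```python
def sign(v):
    # Returns 1, 0, or -1 according to the sign of v.
    return (v > 0) - (v < 0)


def build_features(close, volume):
    """Return one feature row per day t >= 2, built from days t-1 and t-2.

    `close` and `volume` are equal-length lists ordered by date.
    """
    rows = []
    for t in range(2, len(close)):
        dp = close[t - 1] - close[t - 2]    # price difference
        dv = volume[t - 1] - volume[t - 2]  # volume difference
        rows.append({
            "price_diff_sign": sign(dp),
            "volume_diff_sign": sign(dv),
            "diff_product_sign": sign(dp * dv),
            # Noise features, as described in the text above.
            "price_product": close[t - 1] * close[t - 2],
            "volume_product": volume[t - 1] * volume[t - 2],
        })
    return rows
```

Each row could then be paired with the price of day t
as the regression target and passed to a linear
regression model.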
According to the results, in most cases the predicted
fluctuation range is roughly consistent with the actual
fluctuation range of the stock price. However, in the
real market, the factors that influence stock price
trends are complex. The author's setup is rather
idealized, and there is still considerable room for
optimization.
Ma applied two supervised learning approaches,
regression and classification, to predict the trend of
stock prices and achieved good model results. The work
emphasizes the urgency of scientifically predicting
stock market trends, provides investors with more
accurate market information, and offers a useful
reference for future stock market predictions
(Ma, 2024).
The author analyzes the overall trend of stock
price data through K-line chart analysis, then uses a
linear regression model to predict stock prices,
calculates indicators such as mean square error, and
then uses two classification models, logistic