of Linear Regression, Random Forest, RNNs, and
LSTMs, and summarizes the key challenges and future
directions in this field.
3.1 Comparison of Machine Learning
Models
Linear Regression: Linear Regression is
straightforward and easy to implement. It performs
well when the relationships between variables are
approximately linear. However, it largely fails to
capture the complex, non-linear patterns present in
stock price movements, so its effectiveness is
strongly limited on more intricate prediction tasks.
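The contrast between linear and non-linear relationships can be made concrete with a small sketch. It fits a one-feature least-squares line in closed form and compares the R-squared score on two toy series. The series, the function names `fit_line` and `r_squared`, and the choice of R-squared as the score are illustrative assumptions, not taken from the reviewed studies.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b (closed form, one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def r_squared(xs, ys, a, b):
    """Fraction of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy series (not real market data), indexed by day offsets
# around a midpoint.
days = list(range(-10, 11))
linear_prices = [100.0 + 0.5 * d for d in days]      # steady drift
curved_prices = [100.0 + 0.2 * d * d for d in days]  # symmetric, non-linear

r2_linear = r_squared(days, linear_prices, *fit_line(days, linear_prices))
r2_curved = r_squared(days, curved_prices, *fit_line(days, curved_prices))
```

On the drifting series the line explains essentially all of the variance, while on the symmetric curved series the best line is flat and explains essentially none of it, even though the series is perfectly predictable by a non-linear rule.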
Random Forest: Random Forest achieves robust
performance by aggregating multiple decision trees,
which reduces overfitting compared to individual
decision trees. It can still overfit, however, when the
trees are grown very deep or when many irrelevant
features are included. Random Forest is also less
suitable for capturing stock trends because it does not
inherently account for temporal dependencies in time
series data.
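One common workaround, given here as an assumption rather than something the reviewed studies prescribe, is to hand the model its temporal context explicitly through lagged features. The sketch below turns a price series into rows of the previous `n_lags` prices with the next price as target. The helper name `make_lag_features` and the sample prices are hypothetical.

```python
def make_lag_features(prices, n_lags):
    """Turn a price series into (features, target) rows.

    Each feature row holds the previous n_lags prices (oldest first);
    the target is the price that follows them.
    """
    rows, targets = [], []
    for t in range(n_lags, len(prices)):
        rows.append(prices[t - n_lags:t])
        targets.append(prices[t])
    return rows, targets


closes = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9]  # hypothetical closing prices
X, y = make_lag_features(closes, n_lags=3)
```

Rows built this way could be passed to any tabular learner, for example a tree ensemble. The train and test split should then follow time order so that no future price leaks into training.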
Recurrent Neural Networks: RNNs are
designed to handle sequential data and can capture
temporal dependencies, making them applicable to
time series forecasting. However, they suffer from
vanishing and exploding gradient problems, which
make long-term dependencies difficult to learn.
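A minimal sketch of why gradients vanish or explode, assuming the simplest possible case: a scalar linear recurrence h_t = w * h_{t-1} + x_t. Backpropagating through T steps multiplies the gradient by w once per step, so it scales as w^T. The function name and the values of w are illustrative. A saturating activation such as tanh only adds further factors of at most 1.

```python
def gradient_through_time(w, steps):
    """d h_T / d h_0 for the linear recurrence h_t = w * h_{t-1} + x_t."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # each time step contributes one factor of w (chain rule)
    return grad


vanishing = gradient_through_time(0.5, 50)  # about 8.9e-16
exploding = gradient_through_time(1.5, 50)  # roughly 6.4e8
```

With a recurrent weight below 1 in magnitude, the signal from 50 steps back is effectively erased; above 1, it overwhelms the update. Either way the early part of a long price history contributes little useful learning signal.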
Long Short-Term Memory: LSTMs address many
of the limitations of RNNs by incorporating gating
mechanisms to manage long-term dependencies and
complex temporal relationships, which is why they
generally predict better than plain RNNs. They are
not perfect, however: LSTMs are computationally
intensive, demanding in resources, and require
careful tuning.
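The mechanism can be sketched with one step of a scalar LSTM cell in its standard textbook form. The parameter values below are hypothetical and chosen to pin the forget gate open and the input gate shut, so that the cell state is carried across many steps almost unchanged.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell.

    p maps each gate name ("f", "i", "o", "g") to a tuple
    (input weight, recurrent weight, bias).
    """
    def pre(name):
        w, u, b = p[name]
        return w * x + u * h_prev + b

    f = sigmoid(pre("f"))     # forget gate: how much old memory to keep
    i = sigmoid(pre("i"))     # input gate: how much new content to write
    o = sigmoid(pre("o"))     # output gate: how much memory to expose
    g = math.tanh(pre("g"))   # candidate content
    c = f * c_prev + i * g    # additive cell-state update
    h = o * math.tanh(c)
    return h, c


# Hypothetical parameters: forget gate pinned open, input gate pinned shut.
params = {"f": (0.0, 0.0, 20.0), "i": (0.0, 0.0, -20.0),
          "o": (1.0, 0.0, 0.0), "g": (1.0, 0.0, 0.0)}

h, c = 0.0, 0.8
for x in [0.3, -0.5, 0.9] * 10:  # 30 time steps of arbitrary inputs
    h, c = lstm_step(x, h, c, params)
```

After 30 steps the cell state is still very close to its initial value of 0.8, which is the behavior a plain recurrence struggles to learn. The sketch also hints at the cost: each cell carries four sets of weights (three gates plus the candidate) where a plain RNN unit carries one.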
3.2 Challenges in Stock Price Prediction
AI Model Interpretability: Many advanced models,
such as deep learning networks, have very low
interpretability: it is hard to understand how they
arrive at their predictions, so they behave like
"black boxes". This lack of transparency can reduce
trust in the results and complicate efforts to improve
or adjust the model when predictions are inaccurate.
Model Generalizability: Models trained on one
stock often struggle to generalize to other stocks
because of significant differences in data
characteristics, so most models are not applicable to
every stock. Training a separate model for each
stock is impractical due to high costs and data
collection challenges, so it is necessary to improve
the adaptability of models so that they perform well
across various stocks.
3.3 Future Directions
Expert Systems and Explainability Methods: To
make it easier to understand how a prediction is
made, it is essential to adopt approaches such as
SHapley Additive exPlanations (SHAP) and Local
Interpretable Model-agnostic Explanations (LIME).
SHAP values quantify the impact of each feature on
a prediction, while LIME provides simplified,
interpretable local approximations of complex
models, improving transparency and understanding.
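To illustrate what a Shapley-style attribution measures, the sketch below computes exact Shapley values for a tiny model by enumerating every feature ordering. It is not the SHAP library, which relies on approximations to scale beyond a handful of features. The toy model, its three features, and the all-zero baseline are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to baseline.

    A feature that is "absent" takes its baseline value. Every feature
    ordering is enumerated, so this is only feasible for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]        # switch feature j on
            value = predict(current)
            phi[j] += value - prev   # marginal contribution of feature j
            prev = value
    return [p / len(orderings) for p in phi]


# Hypothetical model over (previous return, volume change, sentiment).
def toy_model(features):
    prev_return, volume_change, sentiment = features
    return 0.6 * prev_return + 0.1 * volume_change + 0.3 * prev_return * sentiment


x = [0.02, 0.50, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The attributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is split evenly between the two features that jointly produce it.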
Transfer Learning and Domain Adaptation:
Several techniques offer promising and practical
remedies for the poor generalizability of most
models. Transfer learning leverages knowledge
from one domain to enhance performance in a related
domain (Weiss, 2016; Zhuang, 2020), while domain
adaptation adjusts models to perform well across
different data distributions. Taking full advantage of
these techniques can save much of the effort otherwise
spent on retraining models and facilitate more
effective predictions across diverse stocks.
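A minimal sketch of the transfer-learning idea, assuming a deliberately simple setting: a one-feature linear model is pretrained on a hypothetical "source" stock and then fine-tuned for a few gradient steps on a related "target" stock. The data, step counts, and learning rate are illustrative assumptions. Real applications would fine-tune far larger models.

```python
def mse(w, b, xs, ys):
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(w, b, xs, ys, steps, lr=0.1):
    """Run plain gradient descent on the MSE of y = w*x + b from (w, b)."""
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


xs = [0.0, 0.5, 1.0, 1.5, 2.0]
source_ys = [2.0 * x + 1.0 for x in xs]  # hypothetical "source" stock
target_ys = [2.2 * x + 0.8 for x in xs]  # related "target" stock

# Pretrain on the source, then fine-tune briefly on the target.
w_src, b_src = gradient_descent(0.0, 0.0, xs, source_ys, steps=2000)
w_ft, b_ft = gradient_descent(w_src, b_src, xs, target_ys, steps=20)

# Same small training budget on the target, but starting from scratch.
w_cold, b_cold = gradient_descent(0.0, 0.0, xs, target_ys, steps=20)

loss_finetuned = mse(w_ft, b_ft, xs, target_ys)
loss_scratch = mse(w_cold, b_cold, xs, target_ys)
```

Because the source and target relationships are close, the fine-tuned model reaches a lower target error than the model trained from scratch under the same budget. The benefit depends on that similarity: an unrelated source would offer little or no head start.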
In summary, while traditional models like Linear
Regression and Random Forest have their uses,
advanced models such as RNNs and LSTMs are
better suited to handle the complexities of stock price
prediction. Addressing challenges related to model
interpretability and generalizability through
innovative methods will be crucial for advancing the
field and achieving more accurate predictions.
4 CONCLUSIONS
This review summarizes the application and progress
of AI and machine learning in stock price
prediction and mentions many practical application
cases. It also introduces the machine learning
workflow and different models, such as linear
regression and random forest, and their application
in predicting stock prices. In this paper, the advantages and
disadvantages of various model algorithms have been
listed and compared. Different models all have some