The Application of Machine Learning to Algorithmic Trading in

Financial Markets

Kehan Feng

School of Business, McMaster University, Hamilton, Canada

Keywords: Algorithmic Trading; Artificial Intelligence; Predictive Modelling; Market Forecasting.

Abstract: The surge in global digitalization has propelled stock market forecasting into a new era of advanced

technology, transforming traditional trading models. This paper explores the use of Artificial Intelligence (AI)

in algorithmic trading, highlighting its potential to optimize trading strategies, improve forecasting accuracy,

and manage risk. Focusing on AI techniques such as machine learning, deep learning and reinforcement

learning, this study examines how these methods can improve market forecasting by analysing both

structured and unstructured data. AI-driven trading systems, while promising, face

significant challenges, including model interpretability and applicability in volatile markets. To address these

challenges, this paper discusses the importance of integrating interpretable AI tools, as well as the potential

of emerging technologies such as transfer learning and federated learning. These innovations aim to

improve model transparency, adaptability, and privacy, paving the way for more robust and reliable AI

applications in financial markets.

1 INTRODUCTION

The rise of the global digitalization wave has ushered

stock market forecasting into a new era of high

technology, revolutionizing traditional trading

models. The World Bank reported in 2018 that

global stock market capitalization had exceeded

$68.654 trillion (World Bank, 2021). As market

capitalization continues to grow, stock trading has

become a focal point for many financial investors

seeking to optimize their portfolios and maximize

returns. Advanced trading models now enable

researchers to leverage non-traditional text data from

social platforms to predict market trends and

behavior. For example, Frank and Sanati used a

comprehensive set of news about S&P 500

corporations to study how the stock market absorbs

shocks (Frank and Sanati, 2018). By applying sophisticated

machine learning techniques such as text data

analytics and integration methods, the accuracy of

market predictions has improved significantly,

providing investors with deeper insights into potential

market movements.

https://orcid.org/0009-0004-2837-3519

Despite these advances, stock market analysis and

forecasting remain one of the most challenging

research topics in finance due to the inherent

dynamics, instability, and complexity of market data.

Stock Market Prediction (SMP) is nonlinear, dynamic,

stochastic and unreliable in nature (Tan et al., 2007).

Moreover, traditional algorithmic trading

systems rely heavily on structured market data such

as stock prices and trading volumes, often ignoring

unstructured data that can have a significant impact

on the market. These characteristics require

researchers to constantly innovate and develop new

methods to adapt to the changing market

environment.

Algorithmic trading is a key area of focus in the

evolution of this technology, which relies on complex

mathematical models and high-performance

computer programs to execute trade orders in

milliseconds, thereby capturing fleeting market

opportunities. This approach has shown great

potential to improve trading efficiency, reduce costs

and optimize portfolios. However, the practical

application of algorithmic trading poses unresolved

challenges, particularly in terms of the accuracy and

reliability of Artificial Intelligence (AI) algorithms

when processing market data, identifying trading

Feng, K.

The Application of Machine Learning to Algorithmic Trading in Financial Markets.

DOI: 10.5220/0013264200004568

In Proceedings of the 1st International Conference on E-commerce and Artificial Intelligence (ECAI 2024), pages 407-411

ISBN: 978-989-758-726-9

patterns and managing risk. Traditional algorithmic

trading systems rely heavily on structured market data

such as stock prices and trading volumes. While

effective, this approach tends to ignore unstructured

data such as news sentiment, social media activity,

and other external information that can have a

significant impact on market dynamics. The omission

of unstructured data limits the effectiveness of

existing forecasting models, especially given the

inherent volatility and complexity of financial

markets. As these markets evolve and attract more

attention, there is a growing need for systematic

approaches that integrate a variety of data sources,

including traditional and unconventional inputs.

Research has extensively explored high-

frequency trading and quantitative investing, using

artificial intelligence algorithms to accelerate trading

and optimize portfolio management. Despite this,

most existing algorithmic trading systems continue to

rely heavily on structured market data, with limited

consolidation of unstructured data that can provide

valuable insights. This gap in integrating different

data sources is a key area for future research, as

merging unstructured data can improve the

robustness and accuracy of predictive models. In

addition, existing research highlights the importance

of applying AI to multi-asset class trading. By

examining how to optimize portfolios across different

markets and asset classes, researchers aim to diversify

risk and improve overall returns. Expansion into

multi-asset class trading involves developing

algorithms that can manage complex relationships

and correlations between various assets, a task that

requires sophisticated AI and machine learning

models.

This study aims to explore the application of AI

in algorithmic trading, especially how AI technology

can be used to optimize trading strategies, improve

prediction accuracy and reduce risks.

2 METHOD

For this study, the literature was collected

through various search engines, digital libraries, and

databases, including Google Scholar, ResearchGate,

and Scopus, among others. In the process of literature

collection, keywords and phrases such as "stock

market forecasting method", "quantitative

investment" and "structured market data and

unstructured data" were used to obtain relevant

research results. Through this literature, existing

predictive models and algorithmic trading strategies

can be identified, and their strengths and weaknesses

can be evaluated.

2.1 Introduction of Machine Learning

Workflow

The machine learning workflow shown in Figure 1

involves several critical stages, each essential for

developing effective predictive models. It begins

with Data Collection, where structured and

unstructured data relevant to the problem is gathered

from various sources. This data is then subjected

to Data Pre-processing, which includes cleaning,

normalization, and feature engineering to ensure it is

suitable for model input. The next step is Model

Selection and Development, where appropriate

algorithms are chosen based on the problem's nature

and data characteristics. This is followed by Model

Building, where the selected algorithms are

implemented and designed for training. Model

Training involves feeding the prepared data into the

model, allowing it to learn and optimize its

parameters. Finally, Model Testing evaluates the

model’s performance using a separate dataset,

ensuring its accuracy and generalization ability

before deployment.

Figure 1: The typical machine learning workflow (Photo/

Picture credit: Original).
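The stages described above can be sketched end to end in a few lines. This is a minimal illustration only, not the workflow of any cited study: the price series is synthetic, the "model" is a one-variable least-squares line, and all function names (`make_prices`, `to_supervised`, `fit_line`, `mse`) are hypothetical names chosen for this example.

```python
import random
import statistics

def make_prices(n, seed=0):
    """Data collection stand-in: a synthetic random-walk price series."""
    rng = random.Random(seed)
    prices = [100.0]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0, 0.01)))
    return prices

def to_supervised(prices):
    """Pre-processing: turn prices into (previous return, next return) pairs."""
    rets = [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]
    return list(zip(rets[:-1], rets[1:]))

def fit_line(pairs):
    """Model training: ordinary least squares for y = a * x + b."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    var_x = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in pairs) / var_x
    return a, my - a * mx

def mse(pairs, a, b):
    """Model testing: mean squared error on held-out pairs."""
    return statistics.fmean([(a * x + b - y) ** 2 for x, y in pairs])

pairs = to_supervised(make_prices(500))
split = int(len(pairs) * 0.8)           # chronological split, no shuffling
train, test = pairs[:split], pairs[split:]
a, b = fit_line(train)                  # parameters are learned on training data only
test_mse = mse(test, a, b)              # performance is measured on unseen data
print(f"slope={a:.4f} intercept={b:.6f} test MSE={test_mse:.8f}")
```

The split is chronological rather than random so that the test set always lies in the "future" of the training set, which mirrors how a trading model would be used in practice.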

2.2 Quantitative Investment

2.2.1 PPO Algorithm

Proximal Policy Optimization (PPO) is a

reinforcement learning algorithm that optimizes

trading strategies by directly adjusting them to

maximize expected returns while maintaining

stability through restricted updates. It is particularly

useful in algorithmic trading and portfolio

management, where strategies can be dynamically

adjusted according to market conditions. The

clipping mechanism of PPO ensures the

controllability of strategy changes and reduces the

risk of instability during training, making it an

effective tool for developing risk-aware and

profit-optimizing trading agents. The choice of PPO

in this study follows three steps. First, a model-free

approach is adopted, because it learns under the

assumption that the transition probabilities are

unknown. Second, among model-free algorithms,

policy optimization is selected because it performs

better than Q-learning for continuous actions and

high-dimensional data (Brockman et al., 2016; Duan

et al., 2016). Finally, among policy optimization

algorithms, PPO is selected because it outperforms

the alternatives in terms of ease of implementation,

simplicity and performance (Schulman et al., 2017).
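The clipping mechanism can be illustrated with the clipped surrogate objective from Schulman et al. (2017). The sketch below is a plain-Python illustration rather than a training implementation: the function name `clipped_surrogate` is hypothetical, and the clipping range `eps=0.2` is an assumed, commonly used value rather than one reported in this paper.

```python
def clipped_surrogate(ratios, advantages, eps=0.2):
    """Average PPO clipped surrogate objective over a batch.

    ratios[i]     = pi_new(a_i | s_i) / pi_old(a_i | s_i)
    advantages[i] = estimated advantage of action a_i in state s_i
    """
    total = 0.0
    for r, adv in zip(ratios, advantages):
        clipped_r = max(1.0 - eps, min(r, 1.0 + eps))
        # Taking the minimum gives a pessimistic bound on the objective.
        total += min(r * adv, clipped_r * adv)
    return total / len(ratios)

# Positive advantage: pushing the ratio beyond 1 + eps earns no extra objective.
print(clipped_surrogate([1.5], [2.0]))
# Negative advantage: pushing the ratio below 1 - eps earns no extra objective.
print(clipped_surrogate([0.5], [-1.0]))
# Ratio inside the clipping range: the objective is the unclipped r * adv.
print(clipped_surrogate([1.1], [1.0]))
```

Because the objective stops improving once the probability ratio leaves the interval [1 - eps, 1 + eps], a single update has no incentive to move the trading policy far from the previous one, which is the source of the training stability described above.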

2.2.2 Risk Management and Hedging

In quantitative investing, risk management and

hedging strategies are central to ensuring portfolio

stability and optimizing returns. Quantitative

investing relies on mathematical models and

algorithms to identify, assess and control risk through

a variety of tools. Among them, volatility

management and value-at-risk (VaR) are commonly

used risk management methodologies. VaR is widely

used by most trading organizations to track the risk of

their market portfolios and to help supply chain

managers quantify the potential risk of "what-if"

scenarios (Saita, 2007). Volatility management helps

investors reduce risk exposure when market volatility

rises, or increase return potential when volatility falls,

by measuring how much asset prices or portfolio

returns move. Historical volatility provides a picture

of past market volatility, while implied volatility

reflects the market's expectation of future uncertainty.

VaR, on the other hand, estimates the maximum

potential loss over a given time period at a given

confidence level, providing a basis for investors to set stop-loss

points or adjust their investment strategies.
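As a rough illustration of the two measures, the sketch below computes annualised historical volatility and a one-day VaR by historical simulation. It is a simplified example under stated assumptions: the return series is invented, 252 trading periods per year is assumed, and the VaR uses the empirical quantile without interpolation. The function names are hypothetical.

```python
import math
import statistics

def historical_volatility(returns, periods_per_year=252):
    """Annualised sample standard deviation of periodic returns."""
    return statistics.stdev(returns) * math.sqrt(periods_per_year)

def historical_var(returns, confidence=0.95):
    """Historical-simulation VaR, returned as a positive loss.

    Picks the empirical loss quantile at the given confidence level,
    with no interpolation between observations.
    """
    losses = sorted(-r for r in returns)             # losses in ascending order
    index = math.ceil(confidence * len(losses)) - 1
    return losses[index]

# Invented daily returns, for illustration only.
daily_returns = [0.01, -0.02, 0.005, -0.01, 0.015, -0.03, 0.02, 0.0, -0.005, 0.01,
                 0.012, -0.018, 0.003, -0.007, 0.009, -0.025, 0.017, 0.001, -0.004, 0.006]

print(f"annualised volatility: {historical_volatility(daily_returns):.2%}")
print(f"1-day 95% VaR: {historical_var(daily_returns):.2%}")
```

With only 20 observations, the 95% VaR is the second-largest loss in the sample, since exactly one observation (5% of the sample) is worse. Production systems typically use much longer histories or parametric and Monte Carlo variants.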

Hedging strategies also play an important role in

quantitative investing. Through derivatives such as

futures and options, investors can effectively hedge

the risk of a particular asset or the market. For

example, market-neutral strategies reduce the impact

of market volatility on investment portfolios by

simultaneously going long and short the underlying

assets; statistical arbitrage utilizes historical

relationships between assets to conduct hedging

transactions.
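A market-neutral position can be sketched as a long stock position hedged by a short position in an index instrument sized by beta. The numbers below (beta of 1.2, a 5% market fall, a 1% stock-specific return) are hypothetical, and the function names are chosen for this example only.

```python
def hedge_notional(long_notional, beta_long, beta_hedge=1.0):
    """Short notional in the hedge instrument that offsets the long position's beta."""
    return long_notional * beta_long / beta_hedge

def portfolio_pnl(long_notional, short_notional, long_return, hedge_return):
    """P&L of a long position combined with a short hedge."""
    return long_notional * long_return - short_notional * hedge_return

long_notional = 1_000_000.0
beta = 1.2                       # hypothetical beta of the stock to the index
short = hedge_notional(long_notional, beta)

market_move = -0.05              # hypothetical scenario: the index falls 5%
alpha = 0.01                     # hypothetical stock-specific return
stock_return = beta * market_move + alpha

unhedged = portfolio_pnl(long_notional, 0.0, stock_return, market_move)
hedged = portfolio_pnl(long_notional, short, stock_return, market_move)
print(f"unhedged P&L: {unhedged:,.0f}")
print(f"hedged P&L:   {hedged:,.0f}")
```

In this simplified one-factor setting the market component cancels, so the hedged P&L reduces to the stock-specific return times the long notional. Real hedges are imperfect because beta is estimated and changes over time.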

The dynamic risk-adjustment capability of

quantitative investing is one of its significant

advantages. By automating the adjustment of risk

exposures, quantitative investment strategies can

respond in a timely manner to changes in market

conditions, reducing risk exposure or capturing more

return opportunities. These methods not only make

investment decision-making more systematic, but also

strengthen the resilience of investment portfolios in

complex market environments, ensuring that

investors effectively control risks while pursuing

returns.
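One simple way to automate the adjustment of risk exposure is volatility targeting: scale the position so that its expected volatility matches a fixed target. The sketch below is an assumption-laden illustration, not a method taken from the cited literature. The 10% target, the 1.5x leverage cap, the use of recent sample volatility as the forecast, and the function name `target_exposure` are all choices made for this example.

```python
import math
import statistics

def target_exposure(recent_returns, target_vol=0.10, max_leverage=1.5,
                    periods_per_year=252):
    """Fraction of capital to allocate so that forecast volatility matches the target.

    The forecast is simply the annualised sample volatility of recent returns.
    """
    realised = statistics.stdev(recent_returns) * math.sqrt(periods_per_year)
    if realised == 0.0:
        return max_leverage
    return min(max_leverage, target_vol / realised)

# Invented daily returns for a calm and a stressed market regime.
calm = [0.002, -0.001, 0.003, -0.002, 0.001, 0.000, -0.003, 0.002]
stressed = [0.02, -0.03, 0.025, -0.015, 0.03, -0.02, -0.035, 0.01]

print(f"exposure in calm market:     {target_exposure(calm):.2f}")
print(f"exposure in stressed market: {target_exposure(stressed):.2f}")
```

Exposure is cut when volatility rises and raised, up to the cap, when volatility falls, which matches the volatility-management behaviour described in this section.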

2.3 Stock Price Prediction

2.3.1 Deep Neural Network-Based

Prediction

The method utilizes deep learning techniques to

extract complex patterns from large amounts of

historical data to predict future stock prices. Deep

neural networks are constructed through multiple

layers of neurons with strong nonlinear mapping

capabilities. Commonly used network architectures

include fully connected networks, convolutional

neural networks (CNN), and recurrent neural

networks (RNNs), of which long short-term memory

networks (LSTMs) and gated recurrent units (GRUs)

are particularly suited for processing time series data.

The process of stock price prediction usually includes

data preprocessing, model training and prediction

evaluation. Data preprocessing includes

normalization, standardization and feature extraction

to improve model training. The model training stage

uses a large amount of historical data to adjust the

model parameters and minimize the prediction error

by selecting appropriate loss functions and

optimization algorithms. The trained model can be

used to predict future stock prices, and the results

need to be post-processed (e.g., inverse

normalization) to obtain the actual stock prices.
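The normalization and inverse-normalization steps can be sketched with min-max scaling, one common choice among several. The prices and the "model output" below are invented for illustration, and the function names are hypothetical.

```python
def fit_min_max(values):
    """Return (min, max) computed on the training data only."""
    return min(values), max(values)

def normalize(values, lo, hi):
    """Map values into [0, 1] relative to the fitted range."""
    return [(v - lo) / (hi - lo) for v in values]

def denormalize(scaled, lo, hi):
    """Inverse normalization: map model outputs back to price units."""
    return [s * (hi - lo) + lo for s in scaled]

train_prices = [101.0, 103.5, 102.0, 105.0, 104.0]
lo, hi = fit_min_max(train_prices)
scaled = normalize(train_prices, lo, hi)
print(scaled)

model_output = [0.5, 1.1]        # hypothetical predictions in normalized units
print(denormalize(model_output, lo, hi))
```

The scaling parameters are fitted on the training data alone and reused for later data, so that no information from the test period leaks into training. A normalized prediction above 1 simply maps to a price above the training maximum.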

DNN-based stock price prediction offers

significant advantages, including capturing complex

nonlinear patterns and automatic feature learning

without the need for manually designed features.

However, these approaches also face challenges, such

as the need for large amounts of data, the risk of

overfitting, poor model interpretability, and high

computational resource requirements. The “black

box” nature of deep neural networks makes their

decision-making process difficult to interpret, which

may pose difficulties in financial decision-making.

Therefore, although DNNs perform well in stock

price prediction, these challenges need to be

addressed to improve the accuracy and stability of

predictions.

2.3.2 LSTM-Based Prediction

Stock price prediction based on Long Short-Term

Memory Networks utilizes the LSTM model in deep

learning to analyse and predict the future movements

of stock prices. LSTM is a special type of recurrent

neural network (RNN) designed to process and learn

long-term dependencies in time-series data, which

mitigates the vanishing-gradient problem faced by

traditional RNNs on long sequences. LSTM can

effectively capture

complex patterns in stock price data, including

seasonal and long-term trends, which makes it

particularly suitable for time series forecasting. Its

memory mechanism allows the network to maintain

and update information over a longer period of time,

thus enhancing the prediction of stock price

movements. By stacking multiple LSTM layers, the

model can learn deeper features in the data to further

improve prediction accuracy.
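The memory mechanism can be made concrete with a single time step of a one-unit LSTM cell, written in plain Python. This is the standard textbook formulation with forget, input and output gates. The weights are arbitrary, untrained values chosen only so that the example runs, and the names `lstm_step` and `weights` are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One time step of a single-unit LSTM cell with scalar input.

    w maps each gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        wx, wh, b = w[gate]
        return wx * x + wh * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    g = math.tanh(pre("candidate"))   # candidate value to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    c = f * c_prev + i * g            # updated long-term memory
    h = o * math.tanh(c)              # updated hidden state (the cell's output)
    return h, c

# Arbitrary, untrained weights chosen only to make the example run.
weights = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.6, -0.2, 0.0),
    "candidate": (1.0, 0.3, 0.0),
    "output": (0.4, 0.2, 0.0),
}

h, c = 0.0, 0.0
for scaled_return in [0.2, -0.1, 0.4, 0.0, -0.3]:
    h, c = lstm_step(scaled_return, h, c, weights)
    print(f"h={h:+.4f} c={c:+.4f}")
```

The cell state `c` is updated additively, so when the forget gate is close to 1 and the input gate is close to 0, information is carried across many time steps almost unchanged. This is what allows the network to retain long-term dependencies.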

Applying LSTM to stock price prediction usually

involves the following steps. First, the data

preparation stage requires collecting and cleaning

historical stock price data, including processing

missing values and normalization. Next, the LSTM

model is constructed, the network structure is

designed, and appropriate hyperparameters are

selected. When training the model, the network

weights are optimized on historical data to minimize

the prediction error. The trained model needs to be

evaluated to verify its prediction performance using

metrics such as Mean Square Error (MSE) and to

check the generalization ability of the model through

cross-validation. Finally, the model is deployed for

real-time forecasting with continuous monitoring and

retraining to adapt to market changes.
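The evaluation step can be sketched with MSE and a walk-forward form of cross-validation, in which every training window precedes its test window. This is one reasonable way to cross-validate time series, assumed here for illustration. The price list is invented, and a naive persistence forecast stands in for the LSTM so that the example stays self-contained.

```python
def mean_squared_error(actual, predicted):
    """MSE between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices) with training data always before test data."""
    fold_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        yield range(0, train_end), range(train_end, train_end + fold_size)

# Invented prices, for illustration only.
prices = [100.0, 101.0, 102.5, 101.5, 103.0, 104.0,
          103.5, 105.0, 106.0, 105.5, 107.0, 108.0]

fold_errors = []
for train_idx, test_idx in walk_forward_splits(len(prices), n_folds=3, min_train=6):
    # A real model would be fitted on prices[train_idx] here.
    # Placeholder model: one-step-ahead persistence (predict the previous price).
    actual = [prices[t] for t in test_idx]
    predicted = [prices[t - 1] for t in test_idx]
    fold_errors.append(mean_squared_error(actual, predicted))

print(fold_errors)  # [1.25, 0.625, 1.625]
```

Comparing a trained model's fold errors against this persistence baseline is a quick check on whether the model adds any predictive value.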

3 DISCUSSIONS

In the current research on stock market forecasting

and algorithmic trading, AI technologies still face

many challenges and limitations despite

demonstrating their potential. First, model

interpretability is a notable issue. Complex AI

models, especially deep learning networks such as

LSTM, are often viewed as "black boxes," making it

difficult for investors and regulators to understand

their decision-making process. This lack of

transparency not only reduces trust in AI-driven

strategies but may also pose challenges in terms of

regulatory compliance. In addition, AI models excel

in laboratory environments, but uncertainty remains

about their applicability in real, highly volatile

financial markets. Models that rely on historical data

for training often perform poorly in the face of

unprecedented market events or volatility, leading to

a lack of model generalization capabilities, thus

limiting their practical application value. Moreover,

relying on historical data to train AI

models may lead to overfitting, in which case the

model performs very well on past data but fails to

generalize to new, unseen scenarios. This limitation

hinders the practical application of AI in dynamic

market conditions.

Quantitative Investment (QI) has demonstrated

unique advantages in relying on mathematical models

and algorithms to make investment decisions in a

data-driven manner. However, these models also face

multiple limitations and challenges. Firstly, model

overfitting is a major issue, especially when relying

too much on historical data during training, leading to

unsatisfactory performance in real markets. In

addition, the effectiveness of quantitative investing

relies on the quality of the data, and any errors or

noise in the data may trigger wrong investment

decisions. The changing dynamics of financial

markets also pose a challenge to quantitative models,

as sudden market events or policy changes may

invalidate models based on past data.

Another key challenge is the transparency and

explainability of models, especially when

complex machine learning algorithms are used, which

can make it difficult for investors and regulators to

understand the model's decision-making process.

The popularity of algorithmic trading could also lead

to increased market volatility and even trigger

phenomena such as flash crashes. The high demand

for technological infrastructure, on the other hand,

means that building and maintaining these models

requires powerful computing power and high costs,

which can be challenging for small investment

organizations or individual investors. In addition, as

regulators increase their focus on algorithmic trading,

compliance issues may limit the use of certain

strategies or increase the complexity of

implementation. Finally, quantitative models may

perform well at small scales, but market impact and

slippage issues may erode their returns when

operating with large-scale capital.

In the future development of AI technology,

several emerging approaches are expected to

significantly reduce the limitations of current

algorithmic trading systems. First of all, explainable

AI tools such as expert systems, SHapley Additive

exPlanations (SHAP) and Local Interpretable

Model-agnostic Explanations (LIME) will play an

important role in improving model transparency and

explainability. By shedding light on the basis on

which model decisions are made, these tools can help

not only bolster investor and regulator trust in AI-

driven strategies, but also help identify potential

model biases and risks (Linardatos et al., 2020). This

will lead to the evolution of AI models from "black

boxes" to more explainable and transparent systems, allowing

them to be more widely used in practical financial

decisions.
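The idea behind SHAP can be illustrated by computing exact Shapley values for a very small model by brute force. The sketch below does not use the SHAP library or its API. It only shows the underlying attribution principle, with a hypothetical three-feature signal model and an all-zero baseline chosen for this example.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, instance, baseline):
    """Exact Shapley attribution of model(instance) - model(baseline) to each feature.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2**n_features, so this is only feasible for tiny models.
    """
    n = len(instance)

    def value(coalition):
        x = [instance[j] if j in coalition else baseline[j] for j in range(n)]
        return model(x)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = value(set(subset) | {i})
                without_i = value(set(subset))
                total += weight * (with_i - without_i)
        phi.append(total)
    return phi

# Hypothetical signal model; features are [momentum, sentiment, volatility].
def toy_model(x):
    momentum, sentiment, volatility = x
    return 0.5 * momentum + 0.3 * sentiment - 0.4 * volatility + 0.2 * momentum * sentiment

instance = [1.0, 2.0, 0.5]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print([round(p, 4) for p in phi])  # approximately [0.7, 0.8, -0.2]
```

The attributions sum to the difference between the model output for the instance and for the baseline, and the interaction term is split equally between the two features involved. This additive breakdown is what lets an investor or regulator see which inputs drove a given trading signal.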

In addition, transfer learning and domain

adaptation techniques can enhance the adaptability of

AI models (Ma et al., 2024; Weiss et al., 2016). These

approaches accelerate model deployment in new

markets by reusing knowledge from existing models

in new domains and reducing reliance on large-scale

data and computing resources. This is particularly

critical for dealing with dynamic changes and

unpredictability in financial markets, as models can

quickly adapt to new market characteristics,

improving their robustness and broad applicability in

practice. These future directions not

only provide potential solutions to the current

challenges of AI technology, but also lay the

foundation for innovation in algorithmic trading

systems.

4 CONCLUSIONS

This paper explores the use of AI in algorithmic

trading, revealing its potential to optimize trading

strategies, improve forecast accuracy, and reduce

risk. However, despite the demonstrated power of AI

technology in financial markets, its widespread use

still faces many challenges, such as model

interpretability and applicability in volatile markets. Through an in-depth

analysis of these challenges, this study emphasizes

the need to develop more explainable and transparent

AI tools, such as SHAP and LIME, to enhance the

confidence of investors and regulators. These

approaches not only extend the application scope of

AI models, but also improve their adaptability and

security in complex financial environments. In the

future, with the further development and application

of these technologies, AI is expected to play a more

important role in algorithmic trading and drive the

digital transformation of financial markets.

REFERENCES

Brockman, G., Cheung, V., Pettersson, L., Schneider, J.,

Schulman, J., Tang, J., et al. 2016. OpenAI gym. arXiv,

arXiv:1606.01540.

Duan, Y., Chen, X., Houthooft, R., Schulman, J., & Abbeel,

P. 2016. Benchmarking deep reinforcement learning for

continuous control. In Proceedings of the International

Conference on Machine Learning (pp. 1329-1338).

Frank, M. Z., & Sanati, A. 2018. How Does the Stock

Market Absorb Shocks. Journal of Financial

Economics, 136-153.

Linardatos, P., Papastefanopoulos, V., & Kotsiantis, S.

2020. Explainable AI: A review of machine learning

interpretability methods. Entropy, 23(1), 18.

Ma, Y., Chen, S., Ermon, S., & Lobell, D. B. 2024. Transfer

learning in environmental remote sensing. Remote

Sensing of Environment, 301, 113924.

Saita, F. 2007. Value at Risk and Bank Capital Management

(pp. 1-5). Boston: Academic Press.

Schulman, J., Wolski, F., Dhariwal, P., Radford, A., &

Klimov, O. 2017. Proximal policy optimization

algorithms. arXiv, arXiv:1707.06347.

Tan, T. Z., Quek, C., & Ng, G. S. 2007. Biological Brain-

Inspired Genetic Complementary Learning for Stock

Market and Bank Failure Prediction. Computational

Intelligence, 23, 236-261.

Weiss, K., Khoshgoftaar, T. M., & Wang, D. 2016. A

survey of transfer learning. Journal of Big Data, 3, 1-40.

World Bank. 2021. Market Capitalization of Listed

Domestic Companies (Current US$) Data. Retrieved

May 19, 2021, from https://data.worldbank.org/

indicator/CM.MKT.LCAP.CD
