
pervised learning setting to classify whether the stock
will significantly increase, decrease, or remain stable
on the next day.
The main contributions of this study are as fol-
lows: (1) we leverage a large language model (Meta’s
Llama 3.1) to preprocess tweet content before apply-
ing emotion analysis, which improves the quality of
extracted emotional features; (2) we propose a novel
framework for predicting significant stock price movements
that integrates emotion features derived from
tweet data with daily stock price data; and (3) we
systematically compare the predictive performance of a
baseline model that uses only historical stock prices
with three models that combine stock price data and
emotion features from three different emotion analysis
methods: the DistilRoBERTa-based Emotion Classi-
fication Model, the NRC Emotion Intensity Lexicon,
and the NRC Emotion Label Lexicon. Each emo-
tion analysis method is evaluated with and without
LLaMA-enhanced emotion analysis.
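As a rough illustration of how daily emotion features might be joined with daily prices and paired with a three-class next-day label, consider the minimal sketch below. It is not the implementation used in this study: the 2% threshold, the use of a per-day mean to aggregate tweet-level emotion scores, and all function and field names (`label_movement`, `daily_emotion_means`, `build_dataset`) are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical threshold: a next-day return beyond +/-2% counts as "significant".
THRESHOLD = 0.02


def label_movement(close_today, close_next, threshold=THRESHOLD):
    """Return 'increase', 'decrease', or 'stable' for the next-day return."""
    ret = (close_next - close_today) / close_today
    if ret > threshold:
        return "increase"
    if ret < -threshold:
        return "decrease"
    return "stable"


def daily_emotion_means(tweets):
    """Average per-tweet emotion scores by date (assumed aggregation).

    tweets: iterable of (date, {emotion: score}) pairs.
    """
    by_day = defaultdict(lambda: defaultdict(list))
    for date, scores in tweets:
        for emotion, score in scores.items():
            by_day[date][emotion].append(score)
    return {d: {e: mean(v) for e, v in emo.items()} for d, emo in by_day.items()}


def build_dataset(prices, tweets, threshold=THRESHOLD):
    """Join daily closes with daily emotion means and attach next-day labels.

    prices: list of (date, close) pairs sorted by date.
    The last day is dropped because it has no next-day close to label.
    """
    emotions = daily_emotion_means(tweets)
    rows = []
    for (date, close), (_, next_close) in zip(prices, prices[1:]):
        rows.append({
            "date": date,
            "close": close,
            "emotions": emotions.get(date, {}),  # empty if no tweets that day
            "label": label_movement(close, next_close, threshold),
        })
    return rows
```

Each resulting row pairs one trading day's closing price and aggregated emotion scores with the class of the following day's movement, which is the shape of input a supervised classifier would need.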
The remainder of this paper is organized as fol-
lows. Section 2 reviews related work on stock predic-
tion and the use of sentiment and emotion analysis
in financial contexts. Section 3 describes our pro-
posed methodology, including LLaMA-based tweet
preprocessing, emotion analysis on tweet data, and
the construction of the final dataset to predict signif-
icant stock price movements. Section 4 presents the
experimental results and evaluates the effectiveness of
three emotion analysis methods, both with and with-
out LLaMA-enhanced emotion analysis, in improv-
ing the accuracy of predicting significant stock price
movements. Finally, Section 5 concludes the paper
and discusses possible directions for future work.
2 RELATED WORK
Stock market prediction remains a highly active area
of research in both academia and industry. Over the
decades, numerous predictive models have been pro-
posed to address the inherent complexity and non-
linearity of financial time series.
In the 1990s, artificial neural networks (ANNs)
were the most widely used models for stock market
prediction (Atsalakis and Valavanis, 2009). For in-
stance, Kimoto et al. (Kimoto et al., 1990) imple-
mented a modular neural network model using his-
torical stock prices, technical indicators, and macroe-
conomic variables as input features to predict one-
month-ahead movements of the Tokyo Stock Price Index
(TOPIX). In the following decade, support vector
machines (SVMs) (Huang et al., 2005) and support
vector regression (SVR) (Huang and Tsai, 2009)
emerged as popular alternatives, offering improved
generalization and robustness by leveraging the structural
risk minimization principle. Specifically, Huang
et al. (Huang et al., 2005) applied an SVM
to predict the directional movement
of the NIKKEI 225 Index using macroeconomic data
and demonstrated that it outperformed
ANNs in classification accuracy. In
the past decade, deep learning techniques have gained
increasing attention in financial forecasting. Among
them, Long Short-Term Memory (LSTM) networks
and their variants have been widely adopted for stock
market prediction (Jiang, 2021). A particularly rep-
resentative work is that of Nelson et al. (Nelson et al.,
2017), who applied an LSTM-based model to predict
the price movements of Brazilian stocks and demonstrated
that it achieved significantly higher accuracy
than four traditional machine learning models.
Recent studies have increasingly incorporated ex-
ternal textual sources such as financial news, social
media, and web searches. A common method to
process this unstructured data is sentiment analysis,
which determines whether the content reflects a pos-
itive or negative outlook on the market (Balaji et al.,
2017). In 2011, Bollen et al. (Bollen et al., 2011)
used sentiment analysis to extract mood from Twitter
data and integrated these features with daily DJIA
closing values to predict the movement of the Dow
Jones Industrial Average (DJIA). Similarly, Nguyen
et al. (Nguyen et al., 2015) proposed a hybrid model
that integrates sentiment features extracted from fi-
nancial message boards with lagged stock prices to
predict daily stock movement.
In recent years, emotion analysis has been widely
applied to domains such as healthcare, education,
marketing, and finance, with a primary focus on ana-
lyzing text from online social media, review systems,
and conversational agents (Kusal et al., 2022).
For example, Mackey et al. (Mackey
et al., 2021) proposed a fake news detection frame-
work that combines emotional features extracted from
news articles with BERT embeddings. Their model
incorporates discrete emotion vectors such as anger,
trust, and joy, as well as continuous emotion di-
mensions including valence and arousal, to improve
the classification of misinformation into categories
such as satire, hoax, propaganda, and clickbait. In
the context of stock market prediction, the applica-
tion of text-based emotion analysis remains relatively
limited. Among the few existing studies, Chun et
al. (Chun et al., 2022) proposed an emotion-based
stock prediction system that integrates multiple emo-
tional categories including joy, interest, surprise, fear,