making it a useful tool for companies for decision
support.
2 PROBLEM STATEMENT
However, our ever-increasing reliance on social
media as a transparent source of real-time public
opinion is pushing today’s sentiment analysis and
trend detection systems to cope with the vast volume,
velocity, and linguistic heterogeneity of social media
data. Currently employed methods suffer from
several limitations: they take static data as input, they
are not multilingual, they do not scale, and they lack
the context needed to provide accurate and
timely insight. The lack of an end-to-end real-time
platform that bridges state-of-the-art natural language
processing with big data infrastructure leaves a void
in leveraging social media to its full potential for
predictive analytics. This research aims to fill that gap
by creating a real-time, multilingual, and
scalable sentiment mining system that can generate
actionable insights and predict current trends on the
fly on dynamic social platforms.
3 LITERATURE SURVEY
Recent years have seen growing interest in the use of
social media as a source of information on public
sentiment and event forecasting, and a rise in
research that integrates natural language processing
(NLP) and big data technologies. Albladi et al.
(2025) proposed TWSSenti, a combined approach that
uses transformer-based models to perform topic-wise
sentiment classification; however, it does not support
multilingual or informal language. Nurlanuly
(2025) presented a sentiment analysis system
based on traditional machine learning methods,
but the system supported only static datasets and is
therefore not applicable in real time. Camacho-Collados
et al. (2022) designed TweetNLP, which provides
state-of-the-art functionality for social media text
processing, but its complexity limits deployment in
large-scale applications.
A number of industry trend analysis reports
(including Clark, 2024; Hootsuite, 2025;
Talkwalker, 2025; and Varga, 2025) emphasized that
real-time sentiment analysis plays an increasingly
critical role in market and social intelligence.
However, such reports are generally not
backed by evidence or by details of how the analysis
would be implemented. ResearchGate publications (2025)
discuss concepts for integrating AI and NLP in
public opinion analysis but do not provide
detailed evaluations. ScienceDirect (2025), on the
other hand, presents a number of more practical
papers on emotion recognition, rapid
sentiment-based impact measurement, and the prospects
of current NLP methods (ScienceDirect, 2025a;
2025b; 2025c).
Wiley (2021) studied Twitter trend analysis
through big data analytics, where features are mainly
hashtag-based and therefore lack semantic context.
Springer (2024) compared sentiment analysis models
across platforms and found that, because of linguistic
and domain differences, the models did not achieve the
same accuracy. The significance of big
data infrastructure is further emphasized by
ResearchGate (2025), which criticizes traditional NLP
systems for being poorly integrated with mainstream big
data platforms such as Apache Spark or Hadoop.
Practitioner points of view on sentiment mining
problems related to sarcasm identification,
multilingual detection, and noise elimination were also
aggregated from LinkedIn (2025) and AI Multiple
(2025). Yet these findings need empirical
support. A study on sentiment assessment
(ScienceDirect, 2025d) found that existing
lexicons still dominate the benchmarks for
quality of performance. ResearchGate (2025c)
analyzed US market sentiment trends, but its findings
were not transferable. The Journal of Computer Science
Applications (2025) investigated sentiment mapping
for community engagement, whereas CEPR (2025)
investigated the use of Twitter sentiment for financial
forecasting.
Business Insider (2024) and Project Pro (2025) also
drew attention to the growing influence of social
sentiment on real-world events, but offered no
frameworks. ITM Conferences (2025) also discussed
the future of sentiment analysis and stressed the need
for scalable and adaptive solutions.
Combined with industry feedback, these papers
and observations point out the deficiencies of existing
technology and call for an entity-based, real-time,
multilingual and scalable sentiment mining
framework, which can leverage deep NLP and big
data processing techniques to predict trends and
extract social insights effectively.
4 METHODOLOGY
In this work, we present a real-time, multilingual
system designed for sentiment analysis and event
prediction based on breaking news, which combines