Fake News Detector in Social Media Using AI
K. E. Eswari, Harshini S, Kaaviyadharshini M and Kavinnagul S
Department of Computer Science and Engineering, Nandha Engineering College Erode,
Tamil Nadu, India
Keywords: Fake News, Block chain, and Natural Language Processing (NLP).
Abstract: Natural language processing (NLP), blockchain technology, and reinforcement learning (RL) are all
employed within this new method for detecting fake news. It begins by gathering a large set of news articles
and accompanying data, which is then cleaned and tokenized through the use of natural language processing
(NLP) methods. A key feature set that is extracted, including word counts and readability scores, is used to
train an RL agent. The agent is provided with the skills to distinguish authentic and false news through a
reward- punishment framework. In the post-training process, the RL agent makes decisions based on these
features in determining whether novel articles are valid or fake ones. While blockchain technology's
operation is described, more details must be provided. This approach works to prevent false and misleading
content dissemination in digitalnews.The explosive expansion of online social networks within the last few
years has promoted the spread offake news for political and commercial purposes. The consumers of these
sites are easily influenced by fake news using deceitful language, which impacts offline societygreatly.
Detection offake news quicklyis one of the key objectives in enhancing the accuracy of information on
online social networks. The algorithms, methods, and rules for detection of false news stories, authors, and
entities in web-scale social networks are investigated in this work, in addition to measuring their efficacy.
The sheer magnitude of web-scale data complicates detection, estimation, and correction offtake news,
particularly considering the increasing importance of correct information, particularly social media. In this
paper, we introduce a method of identifying false information andtalkabout howtoapplyit on Facebook, one
of the most used social networking websites.Rather than relying on typical news websites that typically
involve source verification, most smartphone users prefer to read news on social networking sites.
Authenticating news and articles posted on social networking platforms such as Facebook, Twitter,
WhatsApp, and other microblogs is challenging, though. If rumors are used as factual news, it is not good
for society. Emphasis on verifying and sharing authentic, verified news is the need of the time in countries
like India where false news may spread easily. In this study, a machine learning andnatural language
processing-based model and methodology for detecting fake news are introduced. The technique employs a
Support Vector Machine (SVM) tocompilenews reports and evaluate their validity. Compared to other
models, the performance of this proposed model with 93.6% accuracy demonstrates how efficiently it can
identify false news.
1 INTRODUCTION
Un supervised model-based on fake information
detection is an important novel approach to
combating epidemic spreading of misinformation in
today's cyber-age digital world (Augenstein, T et al.,
2020). However, with the rise of social media and
other online platforms, the issue of the spread of
false or misleading information has risen to
prominence since it could be harmful for public
debate, democracy and public safety. Unsupervised
fake news detection methods rely on inherent
pattern and features of textual data so as to be able to
distinguish between generated news and real news
without the support of pre-label training data (P.
Rosso et al., 2017). These models attempt to
automatically detect fake news and even dangerous
content through techniques such as natural language
processing, clustering, and anomaly detection. This
provides a proactive and scalable mechanism to
address the epidemic of misinformation.
788
Eswari, K. E., S., H., M., K. and S., K.
Fake News Detector in Social Media Using AI.
DOI: 10.5220/0013905800004919
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies (ICRDICCT‘25 2025) - Volume 3, pages
788-793
ISBN: 978-989-758-777-1
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.
1.1 Natural Language Processing
(NLP)
A branch of software engineering referred to as
"natural language processing"(NLP)is intended to
provide computers with the capacity to understand
written and verbal languagein a manner that
simulates human capability. NLP combines
computational etymological rule-based human
language presentation with facts, artificial
intelligence, and sophisticated learning models. With
these developments, computers are now able to fully
"comprehend" human language as messages or
auditory data, including the expectations and
perspectives of the writer or speaker. Computer
programmers use natural language processing (NLP)
to reply to verbal requests, translate text from one
language to another, and summarize huge amounts
of information quickly and even continuously.
Natural language processing (NLP) is utilized by
virtual assistants, customer service chatbots, speech-
to-message transcription applications, voice-
controlled GPS navigation and other shopping
conveniences. But NLP also plays a key role in
massive commercial initiatives that enhance
important corporate operations, enhance employee
efficiency, and automate operations.
1.2 Block Chain Technology
Blockchain technology can be employed to safely
store information, making it extremely hard, if not
impossible, for fraud, hacking, or system alteration
to occur. A block chain is simply a computer
network that spreads and replicates a digital record
of transactions across the network. Each new
transaction is replicated into the ledger of each
member, and there are a number of transactions in
every block of thechain. Several
usersmanagethisdistributeddatabasesystem, referred
to as Distributed Ledger Technology (DLT). A
constantly expanding set of immutable,
cryptographically locked transactional records
shared by all the users of a network is referred to as
a blockchain (T. Cignarella et al., 2020). Authorized
individuals are able to trace any transactional event
back to its historical moment due to the time stamps
and pointers to earlier transactions in every record.
One specific application of the general concept of
networked ledgers is blockchain.
1.3 Fake News
The spread off fake news has appeared as a serious
and alarming issue in the age of digital information
and social media. Deliberately in corrector
misleading information reported in the guise of
genuine news with the aim to deceive or manipulate
popular thinking is refer red to as fake news.
Unsubstantiated or obviously erroneous stories
spread quickly because information on the internet
can be circulated with such ease, resulting in the
creation of false news (B. Riedel et al., 2018). This
ultimately results in miscommunication, strife, and
the possibility of influencing people's opinions and
choices. Theubiquityofmisinformation is
aseriousrisktoinformationaccuracyinthetechnological
era, raising essential questions regarding media
literacy, ethical journalism, and the implications of
disinformation on society S. Ochoa et al., With
information being more accessible than ever, this
world-wide problem has implications for public
health, politics, and other elements of society.
2 RELATED WORK
With the enormous increase of communication
technologies and intelligent devices, the data traffic
also increased dramatically and huge volume of data
has been generated per second from different
applications, users and devices. For this reason, the
need for methods to address changes in data over
time or concept drifts has been growing. El Stream
Ahmad Abbasi et al proposed the El Stream, a novel
technique that combined real and synthetic data, and
both ensemble and nonensemble classifier in both
on-line and off-line stages to identify concept drifts
J. Vijay et al., (2021). While processing, in decision-
making, El Stream uses a majority vote to choose
the best classifier. Experimental results indicate that
our ensemble learning approach performs well better
than conventional machine learning algorithms and
state-of-the-art approaches in both simulated and
real datasets in terms of higher accuracy. In the past
decade, there has been a tremendous amount of
interest in big data, due to potential benefits that can
be gained from valuable insights and advantages,
including cost saving, quicker decision-making and
innovations in varied applications. However, the
analysis of such data is hard if it is delivered to the
analyst as a continuous stream (D. Mouratidis,
2021).
The amplified utilization of social networking sites
has taken fake news from zero to sixty in the modern
era. It is important to validate information available
on social network websites, including valid sources,
however with online news, there is no way to
Fake News Detector in Social Media Using AI
789
confirm source veracity. In this paper, we introduce
the FNU-Bi CNN model for data pre-processing
using the basic features from NLTK like stop words
and stemming. We also use batch normalization,
dense layers, LSTM, and WORDNET Lemmatizes
to calculate TF-IDF and select the features. The data
sets are trained by Bi-LSTM with ARIMA and
CNN then classification performed by using several
Machine Learning Algorithms. This method
constructs an ensemble model which captures news
article, author and title representations from text data
to derive credibility scores. We benchmark two
data classifiers including SVM, DT, RF, KNN,
Naive Bayes and K-NN in an effort to maximize the
prediction accuracy. Chang Li et al. Put forth the
suggestion that we argument provide rich evidence
form an if old views (Y. Wang et al., 2020). Yet, it is
hard to comprehend the positions within the
discussions because modeling both textual content
and user interactions is required. Current methods
typically dis regard the connection between various
issues of argumentation and favor a general
categorization strategy. In this paper, we consider
the issue as a collaborative representation learning
problem in which we embed authors and text based
on their interactions. We evaluate our model on the
Internet Argumentation Corpus and compare various
structural information embedding methods.
Experiment results show that our model performs
superior to competitive models. Social media
platforms have become increasingly important
powerful forces on political debates,
allowinguserstoexpresstheirvoicesandinteractwithco
ntrasting opinions. This leads to examination of
public opinion, political rhetoric, and argument
forms, calling for extensive research to find out how
argumentation dynamic works and writers interact
with whattheywrite. U mar Mohammed A bacha et
al. broke new ground in researching report grouping,
an elementary task in computer programming and
database administration Chokshi and R. Mathew’s,
(2021). It is a process of classifying papers in to
some classes, a basic process fin formation
classification because the number of reports
continues to rise with the rise in personal computers
and technology. Classification of such papersbased
on their content is essential. Text classification is
widely used to classify text into different categories
and involves a number of steps, each category
having a proper method enhancing the performance
in processing. Effective content-based classification
is essential for data experts and researchers and is an
important role in handling and sorting through
massive datasets (C. Dulhanty et al., 2019). Aparna
Kumari et al. introduced a newfeature selection
technique employed with a real dataset. This
methodology develops attribute subsets based on
two factors: (1) selecting discriminantattributes with
high classifying abilityand distinct from one another,
and (2) ensuring that the attributes in the sub set
complement each other by correctly classifying
distinct classes. The process uses confusion matrix
data to consider each attribute independently. It is
necessary to choose attributes with high
discrimination power, especially in the case of large
datasets, like brain MRI scans, where feature
selection significantly impacts classification
performance. As data get sparser when the number
of features rises, more training data are required to
effectively describe high-dimensional datasets,
leading to the" curse of dimensionality."
2.1 Previous Research
Individuals today use social media for consumption
and spreading of news to a larger extent, which is
the primary reason for the spread of both genuine
and fake news throughout the nation. Spread of fake
news on platforms like Twitter is a significant
danger to society. One of the major challenges to an
effective identification of false news on platforms
like Twitter is sophistication in distinguishing
between accurate and false content. Scientists have
managed this by focusing on methods of fake news
detection. Thestudy will utilize the FNC-1 dataset,
which has four features for identifying fakenews.
Wewillutilizebigdatatechnology (Spark) and
machine learning to compare and analyze the latest
techniques for detecting fake news. The approach
involvesemployingadecentralizedSparkclustertodeve
lop a stacked ensemble model.
3 PROPOSED METHODOLOGY
The engineering that the solution to fake news relies
on is a mixture of blockchain, reinforcement
learning (RL), and natural language processing
(NLP). The workflow collects a vast volume of
news articles and also metadata, such as the author,
date and source. In the pre- processing step, the
collected data are tokenized and cleaned by NLP
techniques. Sentence length, readability and word
frequency constitute, in turn, features derived from
the processed text. These features are used as
training data to the RL agent, which learns about the
patterns that separate real news and false news.
When trained, the agent can then check whether it is
ICRDICCT‘25 2025 - INTERNATIONAL CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION,
COMMUNICATION, AND COMPUTING TECHNOLOGIES
790
true or fake for a new news story on the extracted
features.
3.1 News organization
Technologies such as blockchain and NLP can help
to curb misinformation and fake news, including
fake media content. One of those tactics is to dissect
news articles (headline, lead, body, conclusion). By
examining those structures more closely, signs of
fake media could be detected. NLP is a computer
artificial intelligence, originally trying to study the
speech of the computers with human language, can
be used to analyze the content of news stories for the
presence of FF. For example, NLP approaches can
enable the investigation of the use of words in the
news and the discovery of inconsistencies that tell us
lies are being told.
3.2 Data Authentication
These blockchain and natural language processing-
based methodologies of fake media detection can
also be further modified by incorporating the data-
authentication systems. For the detection of fake
media the trustworthiness and genuineness of the
content under consideration need to be verified. the
computerized mark is one such best approach to give
information validation by guaranteeing the source
of a news article. These digital signatures, used to
authenticate the information, are generated by
cryptographic algorithms. The unforgeability of the
digital signature is ensured by including it in the
news report and recording it on a blockchain, where
anyone can easily check it.
Table1: Comparison table.
Algorithm accuracy precision Recall F1 score
NLP 88.66 86.78 89.18 87.67
RL 95.78 93.86 92.39 94.56
BlockChain 94.63 94.89 93.33 92.68
Additionally, inconsistencies in the data can be
detected via machine learning techniques. For
example, linguistic differences between the title of a
fake news piece and its body can be exposed
through teaching these algorithms. If so, these
inconsistencies may be annotated as suspicious
media. Table 1 shows the comparison table.
3.3 Proof-Of-Authority (POA)
A Proof of Authority (PoA) system involves another
set of trusted validators to validate the transactions
on the blockchain. The validators are often
companies or persons of high reputation for honesty
and honor. They verify that news reports are
authentic before they are added to the blockchain.
The PoA enables the creation of the first proven-
resistant and efficient detection of fake content. "
The validators are hesitant to violate the protocol's
integrity, depending on their trust and reputation,
and are probably not cooperating with other players
to form a cheating coalition. Natural language
processing can be used while evaluation of the
language the news story is constructed from and if it
is likely to have fake media. After the data is
formatted, the resulting data set is transmitted to
validators for validation. If news report is confirmed
by the validators to be true, the news report is added
to the blockchain, otherwise the news report is
rejected.
3.4 Fake Media
NLP can be exploited to explore the text of the
articles, identifying possible cases of manipulated
media. For example, NLP may identify linguistic
disparities, such as a mismatch in tone between the
headline and the body of the article. It can read the
sentiment of the article to know if there's any bias,
misinformation or fake news. The use of natural
language processing (NLP) solutions and the Cloud.
The FCTs can perform real-time provide a practical
method to detect false media by leveraging NLP
solutions and cloud computation capabilities. Fake
media is a term used to define article, images or
videos created with the intent to deceive or deceive
people secure and tamper-proof authentication and
storage for news stories. And any news story could
be tied to a single digital signature on the
blockchain, which we can use to trace its properness.
Using a blockchain and therefore an auditable and
tamper-evident way of securing the integrity of
processed data.
4 RESULT AND DISCUSSION
Certain metrics, such as F1 score, precision, and
recall, canbe applied to quantify how effective the
Fake News Detector in Social Media Using AI
791
proposed method isin detecting fake news. Recall
determines the ratio of true positives to all
positiveactuals, whereasprecision determines the
ratio of true positives to all predicted positive values.
F1 score, as the mean of precision and recall
inversely weighted bythe measure, determines the
general performance whereby higher values indicate
better accuracy.
A labeled dataset of real and false news articles
can be usedto evaluate the performance of the
proposed system by comparing its predictions. The
accuracy, recall, and F1 measure of the system can
then be calculated based on its predictions. The
performance of the system can also be evaluated by
comparing it with other existing state-of-the-art fake
news detection methods. The overall performance of
the suggested system in identifying false news is
determined by several factors, including the purity
of the dataset, the effectiveness of the utilized NLP
methods in preprocessing data, thearchitecture of
theRL agent, and theaccuracyof the utilized
blockchain technology in providing security for data.
Ongoing testing and evaluation are important to
examine the performance of the system and identify
what needs to be improved. Figure 1 shows the
blockchain.
Figure 1: Comparison of accuracy, precision, recall, and
F1 score for NLP, RL, and Blockchain.
5 CONCLUSION AND FUTURE
SCOPE
To sum up, recognizing news as fake has an
important role in the current society with the
potential to produce a wide range of effects. The
offered solution based on blockchain, reinforcement
learning, and natural language processing is a
possible way to resolve this problem. By pre-
processing and feature extraction for news text with
NLP methods, the RL agent can be trained to learn
the patterns to distinguish between real news and
fake news. In the future, the method might be used
for anything that's communicated in text and could
become a powerful tool for halting the spread of
fake news and facilitating the spread of real news.
There are a number of directions that future work
could take in order to improve the proposed false
news detection technique. One possible
improvement is feature extraction, where more
features can further tune the RL agent to better
differentiate between real and fake news. Moreover,
the adoption of deep learning models combined with
other NLP advanced techniques for enhancing the
overall performance of the system could provide a
great potential. Figure 2 shows the flowchart of the
proposed system architecture.
Figure 2: Flowchart of the proposed system architecture.
REFERENCES
In a report published in 2020 on arXiv:1606.05464,
Augenstein,T. Rocktaschel, A. Vlachos,and K.
Bontcheva studied posture identification using
bidirectional conditional encoding. AtIberEval2017,
M.Taule,M.A.Martí,F.M.Rangel,
P. Rosso, C. Bosco, and V. Patti gave a summary of the
gender and stance recognition job in tweets about
Catalan independence. The 2017 2nd Workshop on
Evaluating Human Language Technologies for Iberian
Languages (CEUR-WS), volume 1881, included this
paper as one ofits proceedings.
Multilingual stance detection in political conversations on
social media was the subject of research by M. Lai,
T. Cignarella, D. I. Hernández Farias, C. Bosco, V.Patti,
and P. Rosso. Their study was published in September
2020 under the publication number 101075 in the
journal Computational Speech and Language.
ICRDICCT‘25 2025 - INTERNATIONAL CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION,
COMMUNICATION, AND COMPUTING TECHNOLOGIES
792
A baseline strategy was put forth by B. Riedel,
Augenstein,GeorgiosSpithourakis,and S. Riedelfor the
Fake News Challenge posture identification challenge.
Their study was published in May 2018 and offered a
straightforward but efficient method.
Fake Chain: A blockchain architecture to ensure trust in
social medianetworks was published byS. Ochoa, G.D.
Mello, L. A. Silva, A. J. Gomes, A.M.R. Fernandes,
and V.R.Q.LeichhardtinProc.Int.Conf.Qual,Inf.
"A Dynamic Technique for Identifying theFalse News
Using Random Forest Classifierand NLP," J. Vijay,
H.Basha, and J. A. Nehru, Computational Methods
and Data Engineering, 2021, Springer, pp. 331-341.
In a “Deep learning for fake news identification,” a
matched textualin put structure, Written by D.
Mouratidis and published in February 2021 in
Computation, vol. 9, no. 2, p. 20.
Weak supervision for false news detection via
reinforcement learning was published by Y. Wang, W.
Yang, F. Ma, J. Xu, B. Zhong, Q. Deng, and J. Gao in
Proc.AAAIConf.ArtificialIntelligence,vol.34,2020,pp.
516-532.
Chokshi and R. Mathew’s study, “Deep learning and
natural language processing for fake news detection:
research.” SSRN Electronic Journal, January 2021.
[Online]. Available with abstract id=3769884 at
papers.ssrn.com/sol3/papers.cfm
Inresearchreleasedin2019, C.Dulhanty,J.L.Deglint, I.b.
Daya, and A. Wong investigated automatic
misinformation assessment using deep bidirectional
transformer language models for stance identification.
Fake News Detector in Social Media Using AI
793