Fake News Detection in Social Networks using Machine Learning:
A Review
Sonali Raturi, Amit Kumar Mishra and Srabanti Maji
School of Computing, DIT, Dehradun, Uttarakhand, India
Keywords: Fake News, Machine Learning (ML), Support Vector Machine (SVM), Naïve Bayes, Social Media
Abstract: Fake news is spreading rapidly. It is low-quality news generated to target someone, often for
financial or political gain. Millions of tweets can be generated in very little time, and many of them may be
false. People start believing fake news when there is not enough information available to check whether a
piece of information or a tweet is true or false, and they also tend to believe information that they hear
frequently, even when it is false. Fake news has existed since the days of traditional media, but social media
makes it far easier to share or comment on such false information. As false news grows, filtering it manually
becomes impossible. Computational approaches are therefore used to recognize fake news with Machine
Learning algorithms such as SVM and Naïve Bayes. This review describes the techniques used to detect hoax
news and discusses the methods used in existing models along with their reported accuracy.
1 INTRODUCTION
We live in a society where people depend heavily
on social media, and many are more likely to look
up and get news from social media than from
traditional sources such as newspapers. Fake news is
poor-quality news containing false information that
is intentionally created. Its rapid spread can have
severely harmful effects on society and on
individuals. Because fake news is written to mislead
readers into believing intentionally fabricated
information, it is hard to detect from the content of
a report alone. Hence, we need to include additional
information, such as users' social engagement on
social media, to help reach a conclusion.
Social media delivers news in a timely fashion and
at lower cost to consumers than traditional news
media such as newspapers. It also makes it easy to
share or comment on news, and the news being shared
on social media could be fake.
News articles are increasingly produced online
because releasing news through social media is
cheaper and faster. They are produced for different
purposes, including political and financial gain.
Fake data, and even fake cures, are spreading over
social media. The question is how to clearly
distinguish real news from misinformation and
opinion. Fake news can be identified by comparing
various properties and theories across both
traditional media and social media. In this review,
we first define the difficulties of fake news
detection and summarize the techniques used to
detect fake news. Next, we describe the datasets and
the evaluation of models used by existing methods.
The definition of fake news has two main
features: authenticity and intent. First, fake news
contains false information that can be verified as
such. Second, fake news is created with dishonest
intent to mislead consumers.
Fake news may be spread for several reasons. It
could be a rumor that does not originate from any
news event and is spread only for political or
financial gain, or it could be misinformation that is
created unintentionally. Fake news may also be
produced for fun or to target a specific person.
Recently, fake news has been shifting from
traditional media to social media and online news.
Raturi, S., Mishra, A. and Maji, S.
Fake News Detection in Social Networks using Machine Learning: A Review.
DOI: 10.5220/0010564800003161
In Proceedings of the 3rd International Conference on Advanced Computing and Software Engineering (ICACSE 2021), pages 177-181
ISBN: 978-989-758-544-9
Copyright © 2022 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
2 LITERATURE REVIEW
Two psychological factors make users naturally
vulnerable to fake news:
Naïve Realism: users believe that their own view of
reality is the only accurate one, and that those
whose viewpoints differ are biased (Ward, 2013).
Confirmation Bias: users prefer to receive
information that confirms their existing views
(Nickerson, 1998).
Malicious accounts can be created online. A major
reason is that creating an account on social media
costs very little, so it is inexpensive to create bots.
A social bot is a social media account managed by a
computer algorithm so that it can produce content
and interact with people or other bots automatically
(Ferrara, 2016). Social bots become malicious
entities when they are designed with the specific
purpose of doing harm, such as spreading or
manipulating false news on social media. People
start believing false news on account of the
following factors:
Social credibility: users judge a source of fake
news as credible if others judge the same source as
credible. They do so when there is not enough
information available to decide whether the source
is truthful.
Frequency heuristic: users naturally start
favoring information that they hear time and again,
even if it is fake news.
2.1 Techniques Used in Fake News Detection
Naïve Bayes: In ML, naïve Bayes (Yuslee, 2021) is a
probabilistic classifier based on applying Bayes'
theorem with naïve independence assumptions
between the features. Bayes' theorem states:
P(S|T) = P(T|S) P(S) / P(T)
where P(S|T) is the probability of S given that T has
occurred, P(T|S) is the probability of T given that S
has occurred, P(S) is the probability of S occurring,
and P(T) is the probability of T occurring.
The above equation can be written as:
Posterior = (prior x likelihood) / evidence
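As a minimal sketch of how this rule can be turned into a text classifier, the following Python code trains a multinomial naïve Bayes model on tokenized headlines. The headlines, the labels, the function names, and the use of Laplace (add-one) smoothing are assumptions made for illustration; they are not taken from any of the reviewed papers.

```python
import math
from collections import Counter

def train_naive_bayes(docs, labels):
    """Estimate class priors P(S) and per-class word counts from tokenized documents."""
    class_counts = Counter(labels)
    word_counts = {c: Counter() for c in class_counts}
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    priors = {c: n / len(labels) for c, n in class_counts.items()}
    return priors, word_counts, vocab

def predict(tokens, priors, word_counts, vocab):
    """Return the class with the highest log posterior.

    The evidence P(T) is the same for every class, so it is dropped.
    """
    scores = {}
    for c, prior in priors.items():
        total = sum(word_counts[c].values())
        score = math.log(prior)
        for w in tokens:
            if w in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing avoids zero likelihoods.
                score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        scores[c] = score
    return max(scores, key=scores.get)

# Hypothetical toy headlines, invented for illustration only.
docs = [
    "miracle cure doctors hate".split(),
    "shocking secret cure revealed".split(),
    "council approves city budget".split(),
    "city council schedules budget vote".split(),
]
labels = ["fake", "fake", "real", "real"]

model = train_naive_bayes(docs, labels)
label_a = predict("secret miracle cure".split(), *model)
label_b = predict("budget vote".split(), *model)
print(label_a, label_b)
```

Working in log space keeps the product of many small likelihoods from underflowing, which matters once documents are longer than a headline.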
Support Vector Machine (SVM) (Fung, 2002) is a
supervised Machine Learning algorithm that can be
used for both classification and regression
problems. It transforms the data and then finds an
optimal boundary between the possible outputs
based on that transformation. Performing this
transformation implicitly through a kernel function
is called the kernel trick.
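The kernel trick can be illustrated numerically. The sketch below assumes a degree-2 polynomial kernel on 2-D inputs, chosen only because its explicit feature map is short to write down; the function names are hypothetical. It checks that the kernel value computed in the original space equals the dot product in the transformed space.

```python
import math

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, y):
    """Degree-2 polynomial kernel k(x, y) = (x . y)^2, computed in the input space."""
    return dot(x, y) ** 2

def feature_map(x):
    """Explicit feature map for 2-D inputs that corresponds to poly_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, y = (1.0, 2.0), (3.0, -1.0)
implicit = poly_kernel(x, y)                      # no transformation needed
explicit = dot(feature_map(x), feature_map(y))    # transform first, then dot
print(implicit, explicit)
```

Both routes give the same number up to floating-point rounding, but the kernel route never builds the transformed vectors, which is what makes high-dimensional transformations affordable.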
Regression is a supervised Machine Learning
technique (Mahir, 2019). It predicts output values
from the input values of the data fed into the
system. The algorithm builds a model from the
features of the training data.
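As a small illustration of building a model from training data, the sketch below fits a straight line by ordinary least squares. This is only one simple form of regression, picked here as an assumption for illustration; the reviewed papers that mention regression typically use logistic regression for classification. The data points and the function name are invented.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 4.0 + intercept  # predict the output for an unseen input
print(slope, intercept, prediction)
```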
Stochastic Gradient Descent (SGD) is a very common
and popular algorithm used in many Machine
Learning tasks, and it forms the basis for training
neural networks (NN) (Helmstetter, 2018). The
gradient in SGD is the slope of a surface. Hence,
gradient descent means moving down the slope to
reach the lowest point on that surface (Zhang, 2020).
Random forest is a supervised algorithm. There is
a direct relationship between the number of trees in
the forest and the results it can achieve: in simple
words, the larger the number of trees, the more
precise the result tends to be (Stahl, 2018), although
the gains diminish as more trees are added.
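To make the description of SGD in this section concrete, the following sketch fits a line by taking one small step down the slope of the squared error for a single randomly ordered example at a time. The learning rate, epoch count, seed, and data are assumptions chosen so that this toy problem converges; they are not recommended settings.

```python
import random

def sgd_linear(xs, ys, lr=0.1, epochs=500, seed=0):
    """Fit y = w * x + b with stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    data = list(zip(xs, ys))
    for _ in range(epochs):
        rng.shuffle(data)  # "stochastic": visit examples in random order
        for x, y in data:
            error = (w * x + b) - y
            # Gradient of error**2 with respect to w is 2 * error * x,
            # and with respect to b is 2 * error; step against it.
            w -= lr * 2 * error * x
            b -= lr * 2 * error
    return w, b

# Hypothetical noiseless data on the line y = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2 * x + 1 for x in xs]

w, b = sgd_linear(xs, ys)
print(round(w, 3), round(b, 3))
```

Each update uses a single example, so the path to the minimum is noisy, but every step is cheap, which is why the method scales to large training sets.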
2.2 Types of Data Present in Social Media
As discussed by Parikh (2018), three types of data
are available in social media posts. The first is text
data (possibly multilingual), whose analysis focuses
on the origin of the text, both systematically and
semantically. This data is analyzed using
computational linguistics, and since many posts are
produced as text, much work has been done on it.
The second is multimedia, in which multiple forms
of media, such as audio, images, graphics, and
video, are combined in a post. This is an attractive
data type that draws the attention of viewers. The
third is hyperlinks.
Table 1: Comparative performance measurements of various fake news detection techniques

| Serial no. | Paper Studied | Approach | Result | Gap |
|---|---|---|---|---|
| 1 | “Fake News Detection System using Article Abstraction” (2019) | Natural Language Processing, article abstraction, sentence matching, deep learning | Proposed a BiMPM (Bidirectional Multi-Perspective Matching) model using article abstraction and entity set matching, reaching 0.663 AUC | Plan to propose a different technique that uses entity set matching and article abstraction along with the BiMPM model |
| 2 | “Automatic Online Fake News Detection Combining Content and Social Signals” (2018) | Social-based and content-based methods | Proposed a false news detection method and ran it in a Facebook Messenger chatbot with 81.7% accuracy | Plan to propose a new method to train the bot in different languages in order to extend it to various countries |
| 3 | “Detecting Fake News using Machine Learning and Deep Learning Algorithms” (2019) | RNN, SVM, Naïve Bayes, Logistic Regression | Proposed a model to verify news extracted from Twitter, which is helpful for fake news recognition, with accuracy 0.94 | In future, named entities could be extracted from the news body or headline and their relationships examined using a knowledge base |
| 4 | “Weakly Supervised Learning for Fake News Detection on Twitter” (2018) | Weakly supervised classification | Proposed a weakly supervised method that automatically collects large-scale datasets, with 0.9 F1 score | In future, the main challenge this method faced, gathering a training dataset of suitable size, could be resolved |
| 5 | “FAKEDETECTOR: Effective Fake News Detection with Deep Diffusive Neural Network” (2018) | Data mining, text mining, diffusive network | Proposed an automatic false news credibility inference model, named FAKEDETECTOR, with 0.63 accuracy score | In future, experiments can be done on a live false news dataset |
| 6 | “Fake News Detection in Social Media” (2018) | SVM, semantic analysis, Naïve Bayes classifier | Proposed a three-part method | The proposed method is yet to be tested; this was not done in the paper due to limited knowledge and time |
| 7 | “Fake Data Analysis and Detection Using Ensembled Hybrid Algorithm” (2019) | Classification, decision tree, Natural Language Processing, random forest, Naïve Bayes, SVM, KNN | Proposed a hybrid approach to false news detection with 94% accuracy | In future work, the algorithm will be compared with a deep NN and test results drawn; this can save time in training the deep NN |
| 8 | “Hoax Web Detection for News in Bahasa Using Support Vector Machine” (2019) | Text mining, Support Vector Machine | Proposed a model that aims to distinguish fake from real news; the system works on the Indonesian language with an accuracy of 85% | In future, this work can be extended to other languages |
| 9 | “An Integrated Approach for Malicious Tweets Detection using NLP” (2017) | Machine Learning, statistical Natural Language Processing | Proposed a method based on two aspects: identifying spam tweets without knowing the previous background of the user, and language analysis for detecting spam on Twitter, with 93% accuracy | In future, the method can focus on user accounts, as it currently focuses mainly on analyzing tweets |
| 10 | “Fake News Detection Using Sentiment Analysis” (2019) | Random forest, Naïve Bayes | Proposed a fake news detection method that integrates sentiment to improve accuracy (0.88 AUC) | In future work, the dataset can include images and videos in addition to text |
| 11 | “A Sensitive Stylistic Approach to Identify Fake News on Social Networking” (2020) | One-Class SVM | Proposed a method to find false news in text extracted from social media, with an accuracy of 86% | In future work, accuracy and precision could be increased |
| 12 | “Not Everything You Read Is True! Fake News Detection using Machine learning Algorithms” (2020) | NLP, K-nearest neighbor, ML | Built a fake news detector that classifies text or news headlines as hoax or non-hoax with 71% accuracy | The detector can be built using different algorithms |
3 RESULT AND DISCUSSION
The literature review section covered the different
techniques that have been used to build fake news
detection models, such as Naïve Bayes (Yuslee,
2021) and SVM (Fung, 2002). Bhutani (2019)
compared the accuracy of different Machine
Learning algorithms. First, they tested a Naïve
Bayes model on each feature vector: it gave 73%
accuracy on the count vector and 75% on the n-gram
vector, as well as on the character-level and
word-level TF-IDF vectors. Then a regression model
was run, giving 76% on count features and 74% on
word-level features. Third, SVM was run and gave
an accuracy of 74% on all the features.
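The feature vectors named above can be sketched in a few lines of Python. The two short documents and the function names below are hypothetical, and the TF-IDF weighting shown (term frequency times the natural log of the inverse document frequency) is one common variant, assumed here because the reviewed paper's exact formula is not given in this text.

```python
import math
from collections import Counter

def word_ngrams(tokens, n):
    """Word-level n-grams, e.g. bigrams for n = 2."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def char_ngrams(text, n):
    """Character-level n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def count_vector(tokens, vocab):
    """Raw term counts in vocabulary order."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]

def tfidf_vector(tokens, vocab, docs):
    """TF-IDF weights in vocabulary order (one common variant, see lead-in)."""
    counts = Counter(tokens)
    vector = []
    for term in vocab:
        tf = counts[term] / len(tokens)
        df = sum(1 for doc in docs if term in doc)
        idf = math.log(len(docs) / df) if df else 0.0
        vector.append(tf * idf)
    return vector

# Hypothetical two-document corpus, invented for illustration.
docs = ["fake cure found".split(), "cure approved today".split()]
vocab = sorted({term for doc in docs for term in doc})

counts = count_vector(docs[0], vocab)
bigrams = word_ngrams(docs[0], 2)
char_bigrams = char_ngrams("fake", 2)
tfidf = tfidf_vector(docs[0], vocab, docs)
print(vocab, counts, bigrams, char_bigrams, tfidf)
```

Note that the word "cure" appears in both documents, so its TF-IDF weight is zero: the weighting deliberately suppresses terms that do not help tell documents apart.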
Figure 1 (De Oliveira, 2020) shows the overall
accuracy comparison of Deep Learning and
Machine Learning algorithms, including SVM,
Logistic Regression, Naïve Bayes, RNN, and LSTM.
Figure 1. Comparison of Deep Learning and Machine
learning algorithms based on accuracy
Figure 2 shows the accuracy of the different methods
covered in the literature review in Section 2. The
random forest technique (Reddy, 2019) reaches 94%
accuracy in a hybrid approach to false news
detection. Article abstraction (Kim, 2019) gives an
accuracy of 66.30% with the proposed BiMPM
(Bidirectional Multi-Perspective Matching) model.
Della Vedova (2018) proposed a false news
detection method combining content-based and
social-based methods that gives an accuracy of
81.70%. SVM (Rahmat, 2019) reaches 85% when
used in a hoax web detection system. Statistical NLP
(Gharge, 2017) reaches 93% in an integrated
approach for malicious tweet detection. KNN
(Tiwari, 2020) gives an accuracy of 71% in a fake
news detector.
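The reviewed studies report different measures (accuracy in most cases, but also F1 score and AUC in Table 1), so the figures are not always directly comparable. As a sketch of how accuracy, precision, recall, and F1 relate to one another, the code below computes them from a list of true and predicted labels. The labels are invented for illustration and do not reproduce any result from the reviewed papers.

```python
def confusion_counts(y_true, y_pred, positive="fake"):
    """Count true/false positives and negatives for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive="fake"):
    """Return accuracy, precision, recall, and F1 for a binary labelling."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1

# Hypothetical labels: 3 fake and 5 real items, with 2 mistakes.
y_true = ["fake", "fake", "fake", "real", "real", "real", "real", "real"]
y_pred = ["fake", "fake", "real", "fake", "real", "real", "real", "real"]

accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

In this toy case accuracy is higher than F1 because most items are real and are classified correctly, which shows why accuracy alone can flatter a detector on imbalanced data.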
Figure 2. Overall accuracy of different techniques.
[Bar chart. Vertical axis: Accuracy, 0% to 100%. Bars: Random Forest 94%, Article Abstraction 66.30%, Content-Based & Social-Based 81.70%, SVM 85%, Statistical NLP 93%, KNN 71%.]
4 CONCLUSION
In this manuscript, we summarized various Machine
Learning techniques used in detecting false news
and the types of data seen in social media posts,
i.e., text, multimedia, and hyperlinks. There has been
notable success in detecting false news and fake
posts with various Machine Learning approaches.
However, the dynamic nature of hoax news on social
media makes the classification of false news
difficult. False news now takes many forms, from
satirical articles to fully fabricated news. Lack of
trust in the media and the spread of false news are
causing problems with great effect on our society.
The main strength of Machine Learning is its
potential to automate repetitive tasks and
consequently increase productivity. A lot of
research work is under way to apply Machine
Learning methods such as Naïve Bayes, SVM,
random forest, and KNN to this problem.
REFERENCES
Bhutani, B., Rastogi, N., Sehgal, P., & Purwar, A. (2019,
August). Fake news detection using sentiment
analysis. In twelfth international conference on
contemporary computing (IC3) (pp. 1-5). IEEE.
De Oliveira, N. R., Medeiros, D. S., & Mattos, D. M.
(2020). A sensitive stylistic approach to identify fake
news on social networking. IEEE Signal Processing
Letters, 27, 1250-1254.
Della Vedova, M. L., Tacchini, E., Moret, S., Ballarin, G.,
DiPierro, M., & de Alfaro, L. (2018). Automatic
online fake news detection combining content and
social signals. In 22nd Conference of Open
Innovations Association (FRUCT) (pp. 272-279).
IEEE.
Ferrara, E., Varol, O., Davis, C., Menczer, F., &
Flammini, A. (2016). The rise of social
bots. Communications of the ACM, 59(7), 96-104.
Fung, G., Mangasarian, O. L., & Shavlik, J. W. (2002).
Knowledge-based support vector machine classifiers.
In NIPS (pp. 521-528).
Gharge, S., & Chavan, M. (2017). An integrated approach
for malicious tweets detection using NLP.
In International Conference on Inventive
Communication and Computational Technologies
(ICICCT) (pp. 435-438). IEEE.
Helmstetter, S., & Paulheim, H. (2018). Weakly
supervised learning for fake news detection on
Twitter. In IEEE/ACM International Conference on
Advances in Social Networks Analysis and Mining
(ASONAM) (pp. 274-277). IEEE.
Kim, K. H., & Jeong, C. S. (2019). Fake news detection
system using article abstraction. In 16th International
Joint Conference on Computer Science and Software
Engineering (JCSSE) (pp. 209-212). IEEE.
Mahir, E. M., Akhter, S., & Huq, M. R. (2019). Detecting
fake news using machine learning and deep learning
algorithms. In 7th International Conference on Smart
Computing & Communications (ICSCC) (pp. 1-5).
IEEE.
Nickerson, R. S. (1998). Confirmation bias: A ubiquitous
phenomenon in many guises. Review of general
psychology, 2(2), 175-220.
Parikh, S. B., & Atrey, P. K. (2018). Media-rich fake news
detection: A survey. In IEEE conference on
multimedia information processing and retrieval
(MIPR) (pp. 436-441). IEEE.
Rahmat, M. A., & Areni, I. S. (2019). Hoax web
detection for news in Bahasa using support vector
machine. In International Conference on Information
and Communications Technology (ICOIACT) (pp.
332-336). IEEE.
Reddy, P. B. P., Reddy, M. P. K., Reddy, G. V. M., &
Mehata, K. M. (2019, March). Fake data analysis and
detection using ensembled hybrid algorithm. In 2019
3rd International Conference on Computing
Methodologies and Communication (ICCMC) (pp.
890-897). IEEE.
Stahl, K. (2018). Fake news detection in social
media. California State University Stanislaus, 6, 4-15.
Tiwari, V., Lennon, R. G., & Dowling, T. (2020). Not
Everything You Read Is True! Fake News Detection
using Machine learning Algorithms. In 31st Irish
Signals and Systems Conference (ISSC) (pp. 1-4).
IEEE.
Ward, A. (2013). Naive realism in everyday life:
Implications for social conflict and
misunderstanding. Values and Knowledge, 103.
Yuslee, N. S., & Abdullah, N. A. S. (2021). Fake News
Detection using Naive Bayes. In IEEE 11th
International Conference on System Engineering and
Technology (ICSET) (pp. 112-117). IEEE.
Zhang, J., Dong, B., & Yu, P. S. (2020). Fakedetector:
Effective fake news detection with deep diffusive
neural network. In IEEE 36th International
Conference on Data Engineering (ICDE) (pp. 1826-
1829). IEEE.