citations, shedding light on both scientific and non-
scientific influences. Generally, these inquiries
underscore the intricate nature of citation behavior
and its crucial role in evaluating scholarly output.
Mammola, Piano, Doretto, Caprio, and Chamberlain
(2022) emphasize that while scholarly
content should be the primary basis for citing, other
elements such as the length of the paper, the number
of authors, their collaborative networks, and
individual characteristics can also influence citation
behaviors.
Prabha (1983) suggests that more than two-thirds of
references in academic papers are unnecessary,
highlighting the prevalence of questionable
citations. Wilhite and Fong (2012) and Wren and
Georgescu (2022) examine various aspects of
reference list manipulation, identifying practices
such as coercive citation and unusual referencing
patterns as departures from established standards.
Traditional methods for identifying citation
manipulation involve experts carefully examining
citation patterns in scholarly articles. This process
entails assessing the relevance and context of
citations, detecting potential biases or
inconsistencies, and exploring the relationships
between cited and citing works. While manual review
can provide valuable insights by leveraging the
expertise of subject matter specialists, it is labor-
intensive and challenging to implement on a large
scale. With the increasing volume of academic
publications, the shortcomings of manual detection
methods have become increasingly evident. As a
result, automated approaches have been developed to
improve efficiency and consistency in identifying
citation manipulation.
Several studies highlight the utility of network
analysis in detecting citation manipulation. Ding
(2011) explores the connection between
collaboration and citation patterns, while Liu, Bai,
Wang, Tuarob, and Xia (2024) introduce
ACTION, a framework for identifying anomalous
citations in heterogeneous networks.
Isfandyari-Moghaddam, Saberi, Tahmasebi-Limoni,
Mohammadian, and Naderbeigi (2023) examine
co-authorship networks among leading research
nations.
Avros, Haim, Madar, Ravve, and Volkovich (2023)
and Avros, Keshet, Kitai, Vexler, and Volkovich
(2023) have investigated the automated detection of
manipulated citations in academic papers using
advanced graph-based techniques. These studies
construct robust frameworks that scrutinize the
structural and contextual relationships of citation
networks by employing self-learning graph
transformers, perturbation methods, and graph
embeddings.
The current paper addresses the challenge of
assessing the reliability and consistency of citations
within a citation network. Following the general
approach of the works cited above, the aim
is to investigate the stability of ideal ("genie")
references under network distortions. This core
problem can be reframed in the context of anomaly
detection using an Encoder-Decoder model.
Specifically, the methodology leverages the model's
ability to learn the underlying structure of normal
(i.e., consistent and reliable) citation patterns.
Trained solely on these normal citation
examples, the model learns a compressed latent
representation that facilitates an accurate
reconstruction of such citations. While the model
succeeds at reconstructing normal citation data with
minimal error, it struggles with anomalous citations
that are unreliable or inconsistent and thus deviate
from the learned patterns. Critically, the difference
between the original citation data and its
reconstructed version, the reconstruction error, serves
as the primary metric for identifying these anomalous
citations.
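The reconstruction-error principle described above can be illustrated with a minimal sketch. It makes several assumptions that are not taken from this study: a linear encoder-decoder (equivalent to PCA) stands in for the Encoder-Decoder model, the citation features are synthetic, and the 99th-percentile threshold is an arbitrary illustrative cut-off. The function names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X_normal, latent_dim):
    """Fit a linear encoder-decoder (PCA) on normal examples only."""
    mean = X_normal.mean(axis=0)
    # Rows of vt are principal directions; keep the top latent_dim of them.
    _, _, vt = np.linalg.svd(X_normal - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def reconstruction_error(X, mean, components):
    """Squared error between each row and its encode-decode reconstruction."""
    latent = (X - mean) @ components.T   # encode into the latent space
    X_hat = latent @ components + mean   # decode back to feature space
    return np.sum((X - X_hat) ** 2, axis=1)

rng = np.random.default_rng(0)

# Synthetic "normal" citation features: 8-dim points near a 2-dim subspace.
basis = rng.normal(size=(2, 8))
X_normal = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 8))

# Synthetic anomalies: points that do not follow that structure.
X_anomalous = 3.0 * rng.normal(size=(20, 8))

mean, components = fit_linear_autoencoder(X_normal, latent_dim=2)
train_err = reconstruction_error(X_normal, mean, components)

# Assumed cut-off: the 99th percentile of errors on the normal training data.
threshold = np.quantile(train_err, 0.99)
anomalous_err = reconstruction_error(X_anomalous, mean, components)
flags = anomalous_err > threshold
```

In this toy setting the model is fitted on normal examples alone, so points that deviate from the learned structure receive a large reconstruction error and are flagged. A nonlinear model would follow the same train, reconstruct, and threshold steps.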
The process presented in this study is inspired by
the work of Jin, Xu, Cheng, Liu, and Wu (2022),
which addresses the limitations of
traditional link prediction methods by proposing a
novel approach utilizing Generative Adversarial
Networks (GANs). The suggested method organizes
the network into hierarchical layers, preserving local
and global structural features. A GAN is employed to
iteratively learn low-dimensional vector
representations of vertices at each layer, using these
representations to initialize the previous layer.
In our study, we utilize a modified version of this
method. We randomly remove a fixed fraction of
citations (edges) from the network over multiple
trials. The described GAN-based approach is then
employed to predict the missing citations, and the
predictions are compared with the omitted ones. The
reconstruction rate calculated over the trials
indicates the reliability of the corresponding edges.
Thus, successful predictions indicate the likely
importance of a citation, while failed predictions
suggest potential irrelevance or inclusion for
non-scholarly reasons.
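The removal-and-prediction trials can be sketched as follows. This is only an illustration under stated assumptions: a simple common-neighbours score stands in for the GAN-based predictor, the network is treated as undirected, and the toy graph, the function name `reconstruction_rates`, and its default parameters are hypothetical rather than taken from this study.

```python
import random
from itertools import combinations

def reconstruction_rates(edges, fraction=0.2, trials=200, seed=0):
    """Per edge: share of the trials removing it in which it was predicted back."""
    rng = random.Random(seed)
    edges = sorted({tuple(sorted(e)) for e in edges})
    nodes = sorted({n for e in edges for n in e})
    k = max(1, round(fraction * len(edges)))  # edges removed per trial
    removed_n = dict.fromkeys(edges, 0)
    recovered_n = dict.fromkeys(edges, 0)
    for _ in range(trials):
        removed = set(rng.sample(edges, k))
        # Build the distorted network without the removed edges.
        adj = {n: set() for n in nodes}
        for u, v in edges:
            if (u, v) not in removed:
                adj[u].add(v)
                adj[v].add(u)
        # Stand-in predictor: score absent pairs by their shared neighbours.
        scored = []
        for u, v in combinations(nodes, 2):
            if v not in adj[u]:
                score = len(adj[u] & adj[v])
                if score > 0:
                    scored.append((score, (u, v)))
        scored.sort(key=lambda item: item[0], reverse=True)
        predicted = {pair for _, pair in scored[:k]}
        for e in removed:
            removed_n[e] += 1
            recovered_n[e] += int(e in predicted)
    return {e: recovered_n[e] / removed_n[e] for e in edges if removed_n[e]}

# Toy network: two tightly knit groups joined by one weakly supported edge.
clique_a = list(combinations(range(0, 5), 2))
clique_b = list(combinations(range(5, 10), 2))
bridge = (0, 5)
rates = reconstruction_rates(clique_a + clique_b + [bridge])
```

In this toy network, edges inside the dense groups are supported by the surrounding structure and are usually recovered, while the bridging edge has no shared neighbours and is never predicted back. Under the interpretation above, such an edge would be the one to examine further.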
The subsequent sections of the paper are
dedicated to presenting the necessary background
concepts, describing the proposed model, and
reporting numerical results. At this stage, we aim to
validate the proposed model using just a single
dataset, with plans to extend the study and evaluate