Hybrid Graph Neural Network and Capsule Network Model for
Lung Disease Diagnosis
Radha J., Santhosh T. K., Sivashankar C. M. and Vignesh K.
Department of Computer Science and Engineering, Nandha Engineering College, Erode, Tamil Nadu, India
Keywords: Graph Neural Networks, Capsule Networks, Lung Disease Diagnosis.
Abstract: Despite advances in technology, the diagnosis of lung diseases such as pneumonia, tuberculosis, and lung cancer remains a significant global health concern. This paper introduces a hybrid approach that combines a Graph Neural Network (GNN) with a Capsule Network (CapsNet) to address the challenge of integrating structured and unstructured data. The GNN component operates on a disease-symptom graph to model complex relationships in structured patient data and to improve the understanding and prediction of disease progression. In parallel, CapsNet processes the unstructured image data, capturing hierarchical spatial features that improve the model's interpretability and performance. Integrating the two components improves the classification of lung diseases and yields a more precise and comprehensive diagnostic model. The proposed system is evaluated on two publicly available datasets: the LIDC-IDRI lung CT scan dataset and the NIH ChestX-ray14 dataset. In our experiments, the hybrid GNN + CapsNet model reaches 94.2% accuracy in multi-class lung disease classification, compared with 87.9% for a CNN (ResNet-50) and 90.5% for a Transformer-based model. This paper highlights the novel integration of graph-based learning for structured data and capsule networks for image analysis, which outperforms the baseline models considered.
1 INTRODUCTION
Lung diseases such as pneumonia, tuberculosis and lung cancer remain major global health challenges with high rates of morbidity and mortality. Effective treatment and improved patient outcomes depend on early and accurate diagnosis. Diagnosis has historically relied mainly on clinical expertise and medical imaging such as X-rays and CT scans, but these methods are often limited by their inability to integrate different data sources effectively. Convolutional Neural Networks (CNNs) and other deep learning techniques have since enabled automated systems that analyse medical images and significantly improve diagnostic accuracy. Even so, existing CNN-based models rely mostly on unstructured data (e.g., images) and are not easily integrated with structured data such as Electronic Health Records (EHRs), which contain important patient information such as symptom records, medical history, and demographics. This inability to process multimodal data limits how well such models perform in clinical settings. Interest is therefore growing in models that can combine structured and unstructured data to classify diseases more completely and robustly.
This paper proposes a hybrid deep learning model that uses both a Graph Neural Network (GNN) and a Capsule Network (CapsNet) to handle structured as well as unstructured data. The GNN models the correlations between disease symptoms and patient history from EHRs, and CapsNet captures the spatial hierarchies in medical images (X-rays, CT scans). Combining these two components improves both diagnostic accuracy and interpretability, which is essential for clinical acceptance. We show that this hybrid model classifies lung diseases more accurately than conventional CNNs and Transformer-based models.
J, R., T K, S., C M, S. and K, V.
Hybrid Graph Neural Network and Capsule Network Model for Lung Disease Diagnosis.
DOI: 10.5220/0013880700004919
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies (ICRDICCT‘25 2025) - Volume 2, pages 238-243
ISBN: 978-989-758-777-1
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.
2 RELATED WORKS
Lung disease diagnosis has been greatly improved by
the use of convolutional neural networks (CNNs)
through deep learning. The NIH ChestX-ray14
dataset was used to detect pneumonia using CNN,
with an accuracy reported to be comparable to or greater than that of radiologists, according to one widely cited study. Tang et al. (2021) introduced a deep feature fusion approach that merges multiple CNN architectures to improve the accuracy of lung disease classification. Their model extracts features from multiple CNN layers to improve robustness and to generalize across a diverse range of medical imaging datasets. By capturing more spatial and contextual information, this approach outperformed traditional CNN models.
Graph Neural Networks (GNNs) have become effective tools for disease prediction because they can integrate multimodal medical data. Zhang et al. (2022) demonstrated that a GNN-based framework incorporating structured and unstructured data improves the accuracy of disease classification. They build graph representations from patient records, medical imaging and clinical notes, capturing complex relationships that traditional deep learning models miss. The study found that GNNs generalize better and are easier to interpret than traditional methods for predicting diseases. Through their ability to model dependencies among different medical features, GNNs offer a robust solution for managing diverse healthcare data. That study highlights the importance of GNNs in medical AI and motivates their integration with advanced models such as Capsule Networks (CapsNet) to improve diagnostic accuracy and decision-making in clinical settings.
Multimodal learning has become a prominent area of interest in medical diagnosis, as it allows diverse data sources, such as clinical records and medical images, to be integrated to improve diagnostic precision. Xu et al. (2022) proposed a deep learning model that uses both clinical and imaging data for diagnosis and achieves better accuracy than unimodal methods. The authors emphasized the importance of combining structured patient information with radiological features to capture complex disease patterns accurately. In the same vein, recent AI-based healthcare systems have explored Graph Neural Networks (GNNs) and Capsule Networks (CapsNet) for feature extraction and hierarchical representation learning. Researchers are using multimodal learning techniques to build diagnostic models that are both stronger and easier to interpret. We develop a hybrid GNN-CapsNet approach that combines patient history with imaging data to classify lung diseases and improve clinical relevance.
Artificial intelligence (AI) and natural language processing (NLP) applied to electronic health records (EHRs) have improved clinical decision-making and made the diagnosis of respiratory diseases more accurate. Zhang et al. (2022) presented a classification model based on GNN analysis of multimodal medical data. Recent research has shown that NLP can effectively identify essential clinical features from unstructured EHRs, thereby aiding disease classification. LungDiag, one of the leading works in this field, is reported to reach an F1 score of 0.711 for top-1 diagnosis and is also evaluated on top-3 diagnoses, exceeding the diagnostic performance of human experts and of AI models such as ChatGPT 4.0. AI-powered diagnostic tools can reduce misdiagnosis and improve healthcare efficiency.
3 METHODOLOGY
3.1 Graph Neural Network (GNN) for
Structured Data Analysis
Graph Neural Networks (GNNs) are used to analyse structured patient data, including EHRs and other clinical information. The data are represented as a disease-symptom graph: nodes represent diseases, symptoms and risk factors, and edges represent their relationships and dependencies. Using this interrelated data, the GNN models the intricate links between symptoms and disease development, providing a more precise predictive model. In contrast to traditional machine learning models, which treat patient data as independent variables, the GNN exploits graph connectivity to improve diagnostic accuracy. The system can use relational data to identify critical risk patterns, which supports earlier diagnosis and personalized treatment recommendations for lung diseases.
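As a rough illustration of the disease-symptom graph described above, the following sketch builds a tiny graph and applies one round of neighbourhood averaging, the basic operation behind many GNN layers. The paper does not specify its GNN architecture, so the node names, edges, feature values, and the helper functions `build_adjacency` and `propagate` are all hypothetical.

```python
# Minimal sketch of one message-passing step on a toy disease-symptom graph.
# All nodes, edges, and feature values below are hypothetical illustrations.

nodes = ["pneumonia", "tuberculosis", "cough", "fever", "smoking"]

# Undirected edges linking diseases to symptoms and risk factors.
edges = [
    ("pneumonia", "cough"),
    ("pneumonia", "fever"),
    ("tuberculosis", "cough"),
    ("tuberculosis", "smoking"),
]

# One scalar feature per node (e.g. 1.0 if observed for a given patient).
features = {
    "pneumonia": 0.0,
    "tuberculosis": 0.0,
    "cough": 1.0,
    "fever": 1.0,
    "smoking": 0.0,
}


def build_adjacency(nodes, edges):
    """Return a dict mapping each node to the set of its neighbours."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def propagate(features, adj):
    """Average each node's feature with its neighbours' (self-loop included)."""
    updated = {}
    for node, neighbours in adj.items():
        group = [features[node]] + [features[n] for n in neighbours]
        updated[node] = sum(group) / len(group)
    return updated


adjacency = build_adjacency(nodes, edges)
hidden = propagate(features, adjacency)
```

In this toy case the observed cough and fever raise the pneumonia node's value above the tuberculosis node's. A trained GNN layer would additionally apply learned weight matrices and a nonlinearity to the aggregated features.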
3.2 Capsule Network (CapsNet) for
Unstructured Image Data
A Capsule Network (CapsNet) is used to analyze chest X-ray and CT scan images, capturing hierarchical spatial relationships and structural patterns of lung abnormalities. Unlike traditional Convolutional Neural Networks (CNNs), which often lose spatial hierarchies due to max-pooling operations, CapsNet preserves spatial information through dynamic routing. Through this mechanism, crucial diagnostic characteristics such as lesion shape and size are taken into account, which helps to minimize misclassification. In addition, CapsNet is robust to changes in image orientation and resolution as well as to noise, making it useful for medical image analysis. Incorporating CapsNet into the model improves diagnostic accuracy and yields more precise and reliable classification of lung diseases.
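The dynamic routing mentioned above can be sketched with the squash nonlinearity and routing-by-agreement loop commonly used in capsule networks. This is a minimal pure-Python sketch, not the authors' implementation: the prediction vectors in `u_hat`, the capsule counts, and the choice of three iterations are assumptions made for illustration.

```python
import math


def squash(vec):
    """Shrink a vector's length into [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in vec)
    if sq_norm == 0.0:
        return [0.0 for _ in vec]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in vec]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """Route predictions u_hat[i][j] from lower capsule i to upper capsule j.

    Returns one output vector per upper capsule.
    """
    n_lower, n_upper = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    logits = [[0.0] * n_upper for _ in range(n_lower)]
    outputs = []
    for _ in range(iterations):
        # Coupling coefficients: each lower capsule distributes its vote.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_upper):
            s_j = [
                sum(coupling[i][j] * u_hat[i][j][d] for i in range(n_lower))
                for d in range(dim)
            ]
            outputs.append(squash(s_j))
        # Raise the logit where a prediction agrees with the capsule output.
        for i in range(n_lower):
            for j in range(n_upper):
                logits[i][j] += sum(
                    a * b for a, b in zip(u_hat[i][j], outputs[j])
                )
    return outputs


# Hypothetical predictions: 2 lower capsules, 2 upper capsules, 2-D vectors.
u_hat = [
    [[1.0, 0.0], [0.1, 0.0]],
    [[1.0, 0.0], [0.0, 0.1]],
]
v = dynamic_routing(u_hat)
lengths = [math.sqrt(sum(x * x for x in cap)) for cap in v]
```

The length of each output vector can be read as the confidence that the corresponding entity is present. Here both lower capsules agree on the first upper capsule, so its output vector ends up longer than the second one.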
3.3 Fusion Layer for Multimodal
Integration
The fusion layer consolidates the outputs of the two branches: the Graph Neural Network (GNN), which processes structured patient data, and CapsNet, which analyzes unstructured image data. This layer learns a joint feature representation by combining clinical information, such as symptoms and patient history, with visual features extracted from medical images. By combining both modalities, the model improves diagnostic decision-making and contributes to a better understanding of lung diseases. Figure 1 shows the fusion layer diagram. A key benefit of this design is that the system can correlate radiological findings with symptom-based insights, which reduces diagnostic uncertainty and improves classification accuracy.
Figure 1: Fusion Layer Diagram.
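The paper does not state how the fusion layer combines the two branches. One common choice, assumed here purely for illustration, is to concatenate the two embeddings and pass the joint vector through a linear layer followed by a softmax. The embedding values, weights, bias, and class list below are hypothetical.

```python
import math


def fuse(gnn_embedding, caps_embedding):
    """Concatenate the structured-data and image embeddings into one vector."""
    return list(gnn_embedding) + list(caps_embedding)


def linear(vector, weights, bias):
    """Compute weights @ vector + bias for a list-of-rows weight matrix."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def softmax(scores):
    """Turn raw class scores into probabilities that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical branch outputs (not values from the paper).
gnn_embedding = [0.8, 0.1]          # from the structured-data branch
caps_embedding = [0.2, 0.9, 0.4]    # from the image branch
joint = fuse(gnn_embedding, caps_embedding)

# Hypothetical classifier parameters; in practice these would be learned.
classes = ["pneumonia", "tuberculosis", "lung cancer"]
weights = [
    [1.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0],
]
bias = [0.0, 0.0, 0.0]

probabilities = softmax(linear(joint, weights, bias))
predicted = classes[probabilities.index(max(probabilities))]
```

Concatenation keeps both modalities visible to the classifier, so a class score can depend on a symptom feature and an image feature at the same time. Other fusion schemes, such as attention-based weighting, would replace `fuse` without changing the rest of the sketch.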
3.4 Model Training and Evaluation
The proposed hybrid model is trained on two publicly available datasets. The first is LIDC-IDRI, a comprehensive lung CT scan dataset containing detailed annotations. The second is NIH ChestX-ray14, comprising 112,120 chest X-ray images labeled with 14 types of lung disease. Training on both structured patient data and unstructured imaging data enables the model to improve its classification performance. For evaluation, the key performance metrics are accuracy, precision, recall, and F1-score, together with time tracking. The hybrid approach is then compared against CNN and Transformer-based models. Experimental findings show that the proposed method performs multi-class classification of lung diseases more accurately than these traditional deep learning architectures. Moreover, the outcomes confirm the efficacy of multimodal integration as a diagnostic tool, with improved accuracy and robustness.
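The evaluation metrics named above can all be derived from a confusion matrix. The sketch below shows one way to compute them for a multi-class problem; the label vectors are hypothetical and the function names are chosen for illustration only.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_metrics(matrix):
    """Return a (precision, recall, f1) tuple for each class."""
    n = len(matrix)
    results = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp
        fn = sum(matrix[c]) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        results.append((precision, recall, f1))
    return results


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


# Hypothetical labels for three classes (0, 1, 2).
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 0, 1, 1, 1, 2, 2, 0]

cm = confusion_matrix(y_true, y_pred, 3)
metrics = per_class_metrics(cm)
acc = accuracy(cm)
```

Averaging the per-class tuples gives macro-averaged precision, recall, and F1. The paper does not say which averaging scheme produced its overall figures.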
4 EXPERIMENTAL RESULTS OF
HYBRID GRAPH NEURAL
NETWORK AND CAPSULE
NETWORK MODEL FOR
LUNG DISEASE DIAGNOSIS
The proposed GNN + CapsNet model was tested on the LIDC-IDRI and NIH ChestX-ray14 datasets. It outperformed conventional deep learning architectures such as CNNs and Transformers.
Table 1: Performance Comparison of Deep Learning Models.
Model                    Accuracy (%)  Precision (%)  Recall (%)  F1-Score (%)
CNN (ResNet-50)          87.9          85.4           86.7        86.0
Transformer-based        90.5          88.3           89.6        88.9
Proposed GNN + CapsNet   94.2          92.8           93.5        93.1
The proposed hybrid model is highly effective at multi-class lung disease classification, with an overall accuracy of 94.2%. The high precision (92.8%) indicates that positive predictions are dependable, while the recall of 93.5% shows that the model identifies most true cases of lung disease. The F1-score of 93.1% indicates that the model strikes a balance between precision and recall. To further validate its robustness and generalization capability, 10-fold cross-validation was performed; consistent results were obtained across folds, suggesting that the model generalizes across different subsets of the data.
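For readers unfamiliar with the 10-fold procedure mentioned above, the following sketch shows how sample indices can be split into ten train/test partitions. It is a generic illustration; the function name `k_fold_indices`, the sample count, and the random seed are assumptions, and the paper does not describe how its folds were built.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        start += size
        yield train, test


# Hypothetical dataset of 25 samples split into 10 folds.
folds = list(k_fold_indices(n_samples=25, k=10))
```

Each sample appears in exactly one test fold, and the metric of interest is averaged over the ten runs. With medical images it is usually advisable to keep all scans of one patient in the same fold so that no patient appears in both training and test data.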
The proposed hybrid GNN-CapsNet model was compared with more conventional deep learning architectures, namely a CNN (ResNet-50) and a Transformer-based model, on the same datasets, as shown in Table 1. The hybrid approach achieved the highest performance on every metric, surpassing both baselines in accuracy (94.2%), precision (92.8%), recall (93.5%) and F1-score (93.1%). We attribute this to the Graph Neural Network (GNN), which efficiently integrates structured data from Electronic Health Records (EHRs), and to the Capsule Network, which captures spatial hierarchies in lung images and thereby reduces misclassification rates. This combination improves both diagnostic accuracy and interpretability, making the model a valuable addition to the diagnostic toolkit.
Table 2: Disease-Wise Performance Metrics.
Disease              Precision (%)  Recall (%)
Pneumonia            94.1           95.3
Tuberculosis         92.8           94.5
Lung Cancer          96.5           97.2
COPD                 90.2           91.8
Pulmonary Fibrosis   91.3           92.6
The proposed hybrid GNN-CapsNet model was evaluated across multiple lung disease categories and demonstrated high classification performance. As summarized in Table 2, precision and recall are above 90% for every disease tested. Lung cancer is detected with a precision of 96.5% and a recall of 97.2%; an F1-score of 98.8% is reported for this class, whereas the harmonic mean of the tabulated precision and recall is about 96.8%. These results emphasize the model's ability to detect malignant cases with high reliability, making it a valuable resource for clinicians. Figure 2 shows the disease-wise classification performance.
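Table 2 lists precision and recall only. Since the F1-score is the harmonic mean of the two, a per-disease F1 can be derived directly from the tabulated values, as in the short sketch below. The derived numbers are computed here and are not reported in the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (%) exactly as listed in Table 2.
table2 = {
    "Pneumonia": (94.1, 95.3),
    "Tuberculosis": (92.8, 94.5),
    "Lung Cancer": (96.5, 97.2),
    "COPD": (90.2, 91.8),
    "Pulmonary Fibrosis": (91.3, 92.6),
}

# Derived F1 per disease, rounded to one decimal like the table entries.
derived_f1 = {
    disease: round(f1_score(p, r), 1) for disease, (p, r) in table2.items()
}

for disease, value in derived_f1.items():
    print(f"{disease}: F1 = {value}%")
```

Because a harmonic mean always lies between its two inputs, each derived F1 falls between the precision and recall of the same row.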
Figure 3 illustrates the trade-off between precision and recall at different decision thresholds. A larger area under the Precision-Recall curve indicates that the proposed hybrid GNN + CapsNet model strikes an effective balance between reducing false positives and maintaining high sensitivity. This supports reliable detection of lung disease, reducing misclassification and improving diagnostic accuracy in real-world clinical settings.
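To make the threshold dependence concrete, the sketch below computes precision and recall at every distinct score threshold for a binary problem. The labels and scores are hypothetical and the function name is an assumption; for the multi-class setting of the paper, such a curve would typically be computed per disease in a one-vs-rest fashion.

```python
def precision_recall_points(y_true, scores):
    """Return (threshold, precision, recall) for each distinct score."""
    total_pos = sum(y_true)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A sample is predicted positive when its score reaches the threshold.
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, t in zip(predicted, y_true) if p and t == 1)
        fp = sum(1 for p, t in zip(predicted, y_true) if p and t == 0)
        precision = tp / (tp + fp)
        recall = tp / total_pos
        points.append((threshold, precision, recall))
    return points


# Hypothetical ground truth (1 = diseased) and model scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

curve = precision_recall_points(y_true, scores)
```

Lowering the threshold can only keep or increase recall, while precision tends to fall as more negatives are admitted. That is the trade-off the curve visualizes.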
Figure 2: Disease-Wise Classification Performance.
Figure 3: Precision Recall-Curve.
Figure 4: Receiver Operating Characteristic Curve.
The Receiver Operating Characteristic (ROC) curve in Figure 4 demonstrates the model's ability to differentiate positive and negative cases across different thresholds. The high Area Under the Curve (AUC) shows that the model can accurately classify lung diseases: a high AUC corresponds to a low false positive rate combined with a high true positive rate, which makes the model reliable for clinical diagnosis. The ROC curve shows a well-balanced trade-off between sensitivity and specificity, which highlights the model's ability to handle imbalanced datasets. This is particularly useful in medical applications, where reducing misdiagnoses protects patient safety and improves treatment results.
The proposed hybrid GNN-CapsNet model also offers enhanced interpretability and explainability, which are critical in medical diagnosis. The Graph Neural Network (GNN) component facilitates effective feature selection by analyzing structured Electronic Health Records (EHRs), boosting the model's decision-making capabilities. Meanwhile, the Capsule Network (CapsNet) preserves spatial hierarchies in lung imaging, which helps to reduce misclassification errors by capturing intricate spatial relationships within medical scans. To further enhance transparency, SHAP and Grad-CAM visualizations were used to highlight disease-relevant regions in lung scans, helping clinical professionals to interpret the model's predictions. As shown in Figures 3 and 4, the Precision-Recall curve and ROC curve validate the model's high sensitivity and specificity, reinforcing its effectiveness in accurate lung disease detection.
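As a companion to the ROC discussion, the following sketch computes ROC points and the area under the curve with the trapezoidal rule for a binary problem. The labels, scores, and function names are hypothetical; the paper does not report a numeric AUC, and none is implied here.

```python
def roc_points(y_true, scores):
    """Return ROC points (FPR, TPR) from (0, 0) up to (1, 1)."""
    total_pos = sum(y_true)
    total_neg = len(y_true) - total_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(
            1 for s, t in zip(scores, y_true) if s >= threshold and t == 1
        )
        fp = sum(
            1 for s, t in zip(scores, y_true) if s >= threshold and t == 0
        )
        points.append((fp / total_neg, tp / total_pos))
    return points


def auc(points):
    """Area under a piecewise-linear curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ground truth (1 = diseased) and model scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

roc = roc_points(y_true, scores)
area = auc(roc)
```

The AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. In this toy data, 8 of the 9 positive-negative pairs are ordered correctly.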
5 CONCLUSIONS
The proposed hybrid model demonstrates a significant improvement in the accuracy and interpretability of lung disease diagnosis. The model employs a Graph Neural Network (GNN) and a Capsule Network (CapsNet) to analyze structured data and unstructured image data, respectively, and captures intricate relationships among symptoms, risk factors, disease outbreaks, diagnostic procedures, diagnoses and complications. Integrating the multimodal data at the fusion layer enhances overall predictive capability. Compared to conventional CNN and Transformer-based architectures, the hybrid approach achieves better accuracy, precision, recall, and F1-score. In addition to its high accuracy, the model's interpretability makes it a valuable clinical decision-making tool, allowing medical professionals to better understand disease progression and to differentiate between candidate diagnoses.
6 FUTURE WORK
Future studies will aim to deploy the model in real-time clinical settings and to develop an easy-to-use interface for linking it with electronic health records. In addition, self-supervised learning methods will be used for feature extraction, especially when medical data are unlabeled. Efforts will be made to improve model generalization through cross-hospital validation and by addressing class imbalance. The model will also be adapted to different imaging modalities and extended through transfer learning to detect a wider range of diseases. The goal of these developments is to improve the model's robustness so that it can be used in practical medical settings.
REFERENCES
"IEEE Computational Intelligence Society," in IEEE
Transactions on Emerging Topics in Computational
Intelligence, vol. 6, no. 2, pp. C3-C3, April 2022, doi:
10.1109/TETCI.2022.3157778.
A. Zafar, S. Muneeb, M. Amir, A. Jamil and A. A. Hameed,
"A Multi-modal Approach to Lung Tumor Detection
using Deep Learning," 2023 IEEE International
Conference on Artificial Intelligence, Blockchain, and
Internet of Things (AIBThings), Mount Pleasant, MI,
USA, 2023, pp. 1-6, doi:
10.1109/AIBThings58340.2023.10291022.
A. Ashraf, N. M. Nawi, T. Shahzad, M. Aamir, M. A. Khan
and K. Ouahada, "Dimension Reduction Using Dual-
Featured Auto-Encoder for the Histological
Classification of Human Lungs Tissues," in IEEE
Access, vol. 12, pp. 104165-104176, 2024, doi:
10.1109/ACCESS.2024.3434592.
A. Gurram and P. Ramadass, "Magnetic Resonance Image
based Lung Disorder Detection Models in Deep
Learning-A Comprehensive Survey," 2024 5th
International Conference on Smart Electronics and
Communication (ICOSEC), Trichy, India, 2024, pp.
1323-1329, doi: 10.1109/ICOSEC61587.2024.10722364.
H. Dao, J. Mazel and K. Fukuda, "CNAME Cloaking-
Based Tracking on the Web: Characterization,
Detection, and Protection," in IEEE Transactions on
Network and Service Management, vol. 18, no. 3, pp.
3873-3888, Sept. 2021, doi: 10.1109/TNSM.2021.3072874.
J. Zhang, Y. Lei, Y. Wang, C. Zhou and V. S. Sheng,
"Hierarchical Graph Capsule Networks for Molecular
Function Classification with Disentangled
Representations," in IEEE/ACM Transactions on
Computational Biology and Bioinformatics, vol. 21, no.
4, pp. 1072-1082, July-Aug. 2024, doi:
10.1109/TCBB.2022.3233354.
M. Lauridsen, L. L. Sanchez, D. Laselva and J. Kaikkonen,
"Study of Paging Enhancements for UE Energy Saving
in 5G New Radio," 2021 IEEE 93rd Vehicular
Technology Conference (VTC2021-Spring), Helsinki,
Finland, 2021, pp. 1-6, doi: 10.1109/VTC2021-
Spring51267.2021.9448765.
M. Irtaza, A. Ali, M. Gulzar and A. Wali, "Multi-Label
Classification of Lung Diseases Using Deep Learning,"
in IEEE Access, vol. 12, pp. 124062-124080, 2024, doi:
10.1109/ACCESS.2024.3454537.
N. F. Noaman, B. M. Kanber, A. A. Smadi, L. Jiao and M.
K. Alsmadi, "Advancing Oncology Diagnostics: AI-
Enabled Early Detection of Lung Cancer Through
Hybrid Histological Image Analysis," in IEEE Access,
vol. 12, pp. 64396-64415, 2024, doi:
10.1109/ACCESS.2024.3397040.
T. Grace Shalini, G. Susan Shiny, R. Saranya, P. Suresh
Babu, R. Kavitha and A. Atheeswaran, "Enhancing
Lung Disease Identification with Multimodal Data
Fusion and Deep Learning CNN Approach," 2024 5th
International Conference on Smart Electronics and
Communication (ICOSEC), Trichy, India, 2024, pp.
535-541, doi: 10.1109/ICOSEC61587.2024.10722054.
Y. Wu, J. Ma, X. Huang, S. H. Ling and S. Weidong Su,
"DeepMMSA: A Novel Multimodal Deep Learning
Method for Non-small Cell Lung Cancer Survival
Analysis," 2021 IEEE International Conference on
Systems, Man, and Cybernetics (SMC), Melbourne,
Australia, 2021, pp. 1468-1472, doi:
10.1109/SMC52423.2021.9658891.
Y. H. Bhosale and K. S. Patnaik, "Graph and Capsule
Convolutional Neural Network Based Classification of
Lung Cancer, Pneumonia, COVID-19 using Lung CT
and Ultrasound Radiography Imaging," 2022 8th
International Conference on Signal Processing and
Communication (ICSC), Noida, India, 2022, pp. 381-
387, doi: 10.1109/ICSC56524.2022.10009568.
Z. Tariq, S. K. Shah and Y. Lee, "Multimodal Lung Disease
Classification using Deep Convolutional Neural
Network," 2020 IEEE International Conference on
Bioinformatics and Biomedicine (BIBM), Seoul, Korea
(South), 2020, pp. 2530-2537, doi:
10.1109/BIBM49941.2020.9313208.