Automated COPD Diagnosis from CT Scans: A Hybrid Deep
Learning and Machine Learning Approach with Explainable AI
A. Hema¹, C. H. Hussaian Basha², S. Senthilkumar³, B. S. Gopika⁴, R. Muthaiyan⁵ and R. Ramanan⁶
¹ Department of Master of Computer Applications, E.G.S. Pillay Engineering College, Nagapattinam, 611002, Tamil Nadu, India
² Department of Electrical and Electronics Engineering, SR University, Hanumakonda 506371, Telangana, India
³ Department of Electronics and Communication Engineering, E.G.S. Pillay Engineering College, Nagapattinam, 611002, Tamil Nadu, India
⁴ Department of Electrical and Electronics Engineering, Dhanalakshmi Srinivasan College of Engineering, NH-47, Palakkad Main Road, Navakkarai Post, Coimbatore – 641105, Tamil Nadu, India
⁵ Department of Electronics and Communication Engineering, University College of Engineering, Thirukkuvalai, 610204, Tamil Nadu, India
⁶ Department of Electrical and Electronics Engineering, E.G.S. Pillay Engineering College, Nagapattinam, 611002, Tamil Nadu, India
Keywords: Chronic Obstructive Pulmonary Disease, Computed Tomography, Convolutional Neural Networks, Early
Detection.
Abstract: Chronic obstructive pulmonary disease (COPD) is a widespread and debilitating lung disease that requires precise and timely diagnosis to guide care. This research introduces a method that combines classical machine learning with deep-learning-based feature extraction to detect COPD and classify its severity from computed tomography (CT) scans. We employ CNNs pretrained on large amounts of medical data to extract deep features that capture indicative structural changes in the lung, including emphysema, airway wall thickening, and other morphological deformations. These features are then passed to a Support Vector Machine (SVM) for COPD severity classification. Gradient-weighted Class Activation Mapping (Grad-CAM) and SHapley Additive exPlanations (SHAP) are used to explain the model's predictions, with the aim of increasing transparency, interpretability, and confidence in AI-driven diagnostics. The proposed methodology is validated on the LIDC-IDRI (Lung Image Database Consortium and Image Database Resource Initiative) dataset for emphysema severity and airway abnormalities. The comparison shows that this hybrid method is more accurate and robust than CNN or traditional ML methods alone. The results highlight the importance of explainable and efficient AI in medical imaging for early COPD detection, monitoring of drug efficacy, and severity assessment.
1 INTRODUCTION
Image processing has played a major role in research in recent years (L. Ramachandran, et al.; S. Senthilkumar, et al.). Chronic obstructive pulmonary disease (COPD) is a major global health issue that contributes substantially to morbidity and mortality and places a heavy and growing burden on health systems. This progressive respiratory condition is associated with persistent symptoms and airflow obstruction and is often caused by prolonged exposure to harmful substances such as cigarette smoke and environmental pollutants. Delayed diagnosis increases the risk of irreversible lung damage and of losing more than half of lung function. Computed Tomography (CT) imaging plays an important role in diagnosing and monitoring COPD because it provides detailed visualization of lung structures. CT scans are an effective means of analysing key indicators of COPD such as emphysema, airway wall thickening and hyperinflation. Unfortunately, current diagnostic methods are based mainly on manual
Hema, A., Basha, C. H. H., Senthilkumar, S., Gopika, B. S., Muthaiyan, R. and Ramanan, R.
Automated COPD Diagnosis from CT Scans: A Hybrid Deep Learning and Machine Learning Approach with Explainable AI.
DOI: 10.5220/0013910500004919
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies (ICRDICCT‘25 2025) - Volume 4, pages
208-217
ISBN: 978-989-758-777-1
Proceedings Copyright © 2026 by SCITEPRESS Science and Technology Publications, Lda.
interpretation by radiologists, which is a time-consuming, variable, and observer-biased process. These challenges point to the need for an automated, accurate, and efficient diagnostic tool to help clinicians analyze complex medical images. Medical imaging has been transformed by advances in image processing and machine learning (ML), which now make it possible to detect and classify diseases without manual intervention by physicians. Deep learning methods based on Convolutional Neural Networks (CNNs) have been shown to identify complex patterns in medical images very well. However, CNN-based methods lack interpretability and are prone to overfitting, which limits their application in specialized areas such as COPD diagnosis. To tackle these limitations, this study presents a hybrid approach that combines deep feature extraction with classical ML algorithms to improve the accuracy and reliability of COPD classification. High-level structural features are extracted from CT scans using pre-trained CNNs and are used to detect lung abnormalities related to COPD. Traditional ML classifiers such as SVMs are then applied to these extracted features, which suits domain-specific tasks where the dataset is small. This integration combines the advantages of deep learning and classical ML to yield an effective framework for COPD detection and severity assessment.
In the medical field, interpretability is essential
since medical personnel need to have faith in AI-
driven judgements and guarantee their ethical use.
Explainable AI (XAI) methods are integrated into
the diagnostic procedure to accomplish this purpose. By
highlighting lung regions that have a major impact
on predictions, these techniques help physicians
assess AI-generated results and comprehend the
reasoning behind automated diagnoses. The LIDC-
IDRI dataset, a publicly accessible collection of
high-resolution CT scans, is used to assess the
suggested methodology. By concentrating on
important characteristics including the degree of
emphysema and changes in the structure of the
airways, this study shows that hybrid machine
learning approaches can provide precise,
comprehensible, and effective diagnostic solutions
for COPD diagnosis and staging.
2 LITERATURE REVIEW
In (Manoharan, S., 2020), an improved graph cut segmentation algorithm is proposed for lung cancer detection from CT images. Compared with conventional techniques such as watershed and basic graph cut, the method segments soft tissues and weak edges more accurately. Its merits are lower energy consumption and higher accuracy in detecting nodules; its limitations are high memory usage and the need for further optimization of the energy function. The algorithm benefits clinical application by assisting early and precise lung cancer detection.
The study in (Immanuel D, J. and Leo E, S.A., 2024) proposes a Gradient Descent Optimization (GDO) model for predicting cardiovascular disease (CVD) based on machine learning. It uses data from the UCI repository and compares SVM, KNN, NB, ANN, RF and GDO; the proposed GDO reaches an accuracy of 99.62%. Its advantages are high sensitivity (99.65%) and specificity (98.54%) and good performance in early CVD diagnosis. Its limitations include a small dataset and the need for further feature fusion before broad application.
Kumar, S., et al. present a multimodal diagnostic approach that uses CT scan images and lung sound (cough) data. The study applies ML and DL techniques such as CNNs and reaches an accuracy of 97.5% for early COPD detection. Its merits are high accuracy, the integration of multiple data modalities, and robustness to noise in the diagnostic data. Its limitations are the need for large datasets for effective model training and open questions about scalability and real-world deployment.
Deng, X., et al. present a novel framework based on the Auto-Metric Graph Neural Network (AMGNN). Radiomics and 3D CNN features of CT images are combined to predict COPD stages with 89.7% accuracy. The framework achieves higher precision (90.9%) and AUC (95.8%) than traditional methods such as PRM biomarkers, but it is limited by its computational resource requirements and the difficulty of integrating multi-phase CT data. The technique is a significant improvement in detecting and managing COPD stage.
Research (Bozkurt, F., 2022) introduces the
HANDEFU framework. The system is innovative in
the combination of handcrafted, deep, and fusion
based feature extraction techniques. The LBP+SVM
model for COVID-19 diagnosis was shown to reach an accuracy of up to 99.36%. The merits of the framework are its flexibility, dynamic structure and high precision; its limitations include the computational complexity and execution time of the deep and fusion-based methods. The approach offers a scalable solution for early and reliable detection of COVID-19 from medical images. Si, et al. present an end-to-end deep learning method for diagnosing pancreatic tumors from abdominal CT images. Its merits are rapid processing (18.6 seconds per patient), high diagnostic accuracy for specific tumors (e.g., 100% for IPMN) and decisions that can be interpreted using saliency maps. It, however, does not handle imbalanced data well and cannot diagnose normal cases. The method has clinical potential for efficient preoperative decision making.
Gan, W., et al. present a hybrid CNN that combines 2D and 3D CNNs to segment lung tumors. By combining the ability of the 3D CNN to capture volumetric tumor context with the edge-detail extraction of the 2D CNN, the hybrid model reaches a Dice score of 0.72 and outperforms the standalone CNNs. Its strengths include reduced boundary blurring and better segmentation accuracy; its drawbacks are computational complexity and sensitivity to false positives. The model can potentially be used in lung tumor diagnosis and treatment planning. Hossain, M.B., et al. introduce a fine-tuned ResNet50 model with two additional fully connected layers. Using transfer learning from weights pre-trained on different datasets, it achieves a validation accuracy of 99.17% for classifying COVID-19 cases. Its merits are high precision, sensitivity, and adaptability to medical imaging. Its limitations are the computational resource demand and the domain specificity of the dataset. Overall, the method provides a promising framework for rapid screening of COVID-19 in clinical environments.
Sedghighadikolaei, K., et al. investigate integrating Privacy-Enhancing Technologies (PETs) into the deep radiomics pipeline. Using PETs, the study protects data privacy during both model training and inference. Its merits include robust privacy protection and suitability for multi-institutional collaboration, while its shortcomings are the computational overhead and the difficulty of applying PETs to multivariate medical images. The framework addresses security in medical data analysis from both an efficiency and a privacy point of view for real-world applications.
Wang, S., et al. present a model that uses Auto-Metric Graph Neural Networks (AMGNN) together with radiomics and CNN features learned from inspiratory and expiratory low-dose CT images. The approach achieves an accuracy of 94.4% and an AUC of 96.5% without manual parameter settings, making it fully automated. Its merits are effective feature selection with Lasso and the integration of meta-learning strategies for prediction. Its limitations are high computational requirements and dependence on curated datasets. The model has significant potential for clinical use in preemptive COPD management.
3 PROPOSED METHODOLOGY
Figure 1: Architecture of proposed methodology.
Interpretability plays a crucial role in the medical
domain, as healthcare professionals must trust AI-
driven decisions and ensure ethical deployment. To
achieve this, explainable AI (XAI) techniques are
incorporated into the diagnostic process. These
methods highlight lung regions that significantly
influence predictions, enabling doctors to validate
AI-generated outcomes and understand the rationale
behind automated diagnoses. The proposed approach
is evaluated using the LIDC-IDRI dataset, a publicly
available collection of high-resolution CT scans.
This study demonstrates that hybrid machine
learning techniques can deliver accurate,
interpretable, and efficient diagnostic solutions for
COPD detection and staging by emphasizing key
features such as emphysema severity and airway
structural changes. Figure 1 shows the architecture
of the proposed system.
3.1 Image Preprocessing
Image preprocessing is essential to prepare raw CT
images for feature extraction and classification. This
step ensures that the input images are of uniform
quality and relevant regions of interest are isolated.
3.1.1 Lung Segmentation
Segmentation isolates lung regions from the CT
image to focus on areas affected by COPD. The U-
Net architecture is employed for segmentation due to
its effectiveness in biomedical imaging. The U-Net
uses an encoder-decoder structure with skip
connections to maintain spatial information. The
segmentation process is mathematically represented
as:
$S(x) = \mathrm{Softmax}\left(f_{\theta}(x)\right)$ (1)
where:
$S(x)$ represents the segmented lung region,
$x$ represents the input CT image,
$f_{\theta}(x)$ represents the U-Net segmentation function.
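As a minimal sketch of how Eq. (1) turns network outputs into a mask, the snippet below applies a pixel-wise softmax to per-pixel class scores and keeps the pixels whose most probable class is "lung". The logit values, the two-class layout, and the names `softmax` and `segment` are illustrative assumptions; a real U-Net would supply the logits.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def segment(logit_map, lung_class=1):
    """Turn per-pixel class logits (H x W x C nested lists) into a binary lung mask."""
    mask = []
    for row in logit_map:
        mask_row = []
        for pixel_logits in row:
            probs = softmax(pixel_logits)
            # A pixel is assigned to the class with the highest probability.
            predicted = max(range(len(probs)), key=probs.__getitem__)
            mask_row.append(1 if predicted == lung_class else 0)
        mask.append(mask_row)
    return mask

# Hypothetical 2 x 2 logit map with two classes: [background, lung].
logits = [[[2.0, 0.5], [0.1, 3.0]],
          [[-1.0, 1.0], [4.0, 0.0]]]
lung_mask = segment(logits)
```

Multiplying the input slice by `lung_mask` would then keep only the lung pixels for the later stages.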
3.1.2 Noise Removal
To enhance the quality of CT images, Gaussian
filtering is applied to reduce noise while preserving
edges. The Gaussian filter is defined as:
$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$ (2)
where:
$x, y$ indicate the pixel coordinates,
$\sigma$ indicates the standard deviation of the Gaussian kernel.
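The following sketch illustrates Gaussian filtering as defined in Eq. (2) using only the Python standard library. The kernel size of 3, the border clamping, and the function names are assumptions made for illustration; the paper does not state these settings.

```python
import math

def gaussian_kernel(size, sigma):
    """Build a normalized size x size Gaussian kernel from Eq. (2); size must be odd."""
    half = size // 2
    kernel = [[math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)
               for x in range(-half, half + 1)]
              for y in range(-half, half + 1)]
    total = sum(sum(row) for row in kernel)
    # Renormalize so the discrete weights sum to 1 and overall brightness is preserved.
    return [[v / total for v in row] for row in kernel]

def gaussian_filter(image, size=3, sigma=1.0):
    """Convolve a 2D image (nested lists) with the kernel, clamping coordinates at the borders."""
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dy in range(-half, half + 1):
                for dx in range(-half, half + 1):
                    rr = min(max(r + dy, 0), h - 1)
                    cc = min(max(c + dx, 0), w - 1)
                    acc += kernel[dy + half][dx + half] * image[rr][cc]
            out[r][c] = acc
    return out

# Hypothetical 3 x 3 patch with a single noisy bright pixel in the centre.
noisy = [[10, 10, 10], [10, 100, 10], [10, 10, 10]]
smoothed = gaussian_filter(noisy)
```

A larger `sigma` spreads the weights more evenly and smooths more strongly, at the cost of blurring fine structures.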
3.1.3 Intensity Normalization
Pixel intensities are normalized to a standard range
(e.g., [0, 1]) to ensure consistency across all input
images. Normalization is expressed as:
$I_{\mathrm{norm}} = \frac{I - I_{\min}}{I_{\max} - I_{\min}}$ (3)
where:
$I$ indicates the original pixel intensity,
$I_{\min}$ and $I_{\max}$ indicate the minimum and maximum pixel intensities in the image.
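A short sketch of the min-max normalization in Eq. (3) is given below. The sample intensities and the handling of a constant image are assumptions added for illustration.

```python
def min_max_normalize(image):
    """Rescale pixel intensities of a 2D image (nested lists) to [0, 1] following Eq. (3)."""
    flat = [v for row in image for v in row]
    i_min, i_max = min(flat), max(flat)
    if i_max == i_min:
        # A constant image has no dynamic range; map it to zeros to avoid dividing by zero.
        return [[0.0 for _ in row] for row in image]
    scale = i_max - i_min
    return [[(v - i_min) / scale for v in row] for row in image]

# Hypothetical 2 x 2 slice in Hounsfield units (air is about -1000 HU).
ct_slice = [[-1000, -500], [0, 400]]
normalized = min_max_normalize(ct_slice)
```

Because the minimum and maximum are taken per image, the same tissue can map to different values in different scans; clipping to a fixed intensity window before normalizing is a common alternative.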
3.2 Deep Feature Extraction
After preprocessing, deep features are extracted
from the segmented lung regions using a pre-trained
CNN. This step leverages the power of deep learning
to capture complex patterns associated with COPD,
such as emphysema, airway thickening, and
hyperinflation.
3.2.1 Pre-Trained CNN Selection
Pre-trained CNNs such as ResNet, InceptionV3, or
EfficientNet are used. These models are fine-tuned
to extract domain-specific features from the
segmented lung regions. The deep feature extraction
process is represented as:
$F = f_{\mathrm{CNN}}(I_{\mathrm{seg}})$ (4)
where:
$F$ indicates the extracted feature vector,
$f_{\mathrm{CNN}}$ indicates the pre-trained CNN feature extraction function,
$I_{\mathrm{seg}}$ indicates the segmented lung region.
3.2.2 Feature Maps
CNNs extract multiple feature maps from different
layers, capturing spatial and structural details of the
lung. Each feature map represents specific
characteristics, such as texture, edges, or abnormal
patterns. The output feature maps are mathematically
expressed as:
$M_{i,j}^{k} = f_{\mathrm{CNN}}^{k}(I_{\mathrm{seg}})$ (5)
where:
$M_{i,j}^{k}$ indicates the feature map value at position $(i, j)$ in layer $k$,
$f_{\mathrm{CNN}}^{k}$ indicates the CNN operation at layer $k$.
3.2.3 Feature Vector Construction
The feature maps are flattened into a one-dimensional feature vector for classification. If the feature maps have dimensions $H \times W \times D$ (height, width, depth), the resulting vector has size $H \cdot W \cdot D$.
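The flattening step can be sketched as follows. The dimensions H = 2, W = 3, D = 4 and the helper name are hypothetical and chosen only to make the size of the result easy to check.

```python
def flatten_feature_maps(feature_maps):
    """Flatten an H x W x D nested list into one vector of length H * W * D."""
    return [value for row in feature_maps for cell in row for value in cell]

# Hypothetical feature maps with H = 2, W = 3, D = 4, filled with running indices.
H, W, D = 2, 3, 4
maps = [[[float(h * W * D + w * D + d) for d in range(D)]
         for w in range(W)]
        for h in range(H)]
vector = flatten_feature_maps(maps)
print(len(vector))  # H * W * D
```

For a sense of scale, a ResNet50 given a 224 x 224 input produces 7 x 7 x 2048 maps in its last convolutional stage, which would flatten to 100,352 values; global average pooling over the spatial axes instead yields a 2,048-dimensional vector.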
3.3 Feature Classification
The extracted features are classified into COPD
severity levels using classical ML algorithms. This
hybrid approach combines the strengths of CNN-
based feature extraction and classical ML for
efficient classification.
3.3.1 Support Vector Machines (SVMs)
SVMs are used to classify the features into
categories, such as normal, mild, moderate, severe,
or very severe COPD. SVMs are well-suited for
high-dimensional data and small datasets.
The SVM classifier finds the optimal hyperplane
that separates data points from different classes. The
decision function is given by:
$f(x) = w^{T}\phi(x) + b$ (6)
where:
$w$ indicates the weight vector,
$\phi(x)$ indicates the feature transformation function,
$b$ indicates the bias term.
The SVM optimization problem minimizes the
objective function:
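$\min_{w,b} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i} \max\left(0,\, 1 - y_{i}\left(w^{T}\phi(x_{i}) + b\right)\right)$ (7)
where:
$C$ indicates the regularization parameter,
$y_i$ indicates the true label of the i-th sample,
$x_i$ indicates the feature vector of the i-th sample.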
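To make the objective concrete, the sketch below evaluates the standard soft-margin form, assumed here to be 0.5 * ||w||^2 plus C times the summed hinge losses, for a fixed weight vector. The feature transformation is taken as the identity (a linear kernel), and the sample points, labels, and weights are hypothetical.

```python
def decision_function(w, b, x):
    """Linear decision value w.x + b, i.e. Eq. (6) with phi taken as the identity map."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def svm_objective(w, b, samples, labels, C=1.0):
    """Soft-margin objective: 0.5 * ||w||^2 + C * sum of hinge losses."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    hinge = sum(max(0.0, 1.0 - y * decision_function(w, b, x))
                for x, y in zip(samples, labels))
    return regularizer + C * hinge

# Hypothetical 2D features with labels in {-1, +1}.
X = [[2.0, 1.0], [-1.0, -2.0], [0.2, 0.1]]
y = [1, -1, 1]
w, b = [1.0, 0.0], 0.0
value = svm_objective(w, b, X, y, C=1.0)
```

Only the third sample lies inside the margin here, so it alone contributes to the hinge term. Raising `C` penalizes such margin violations more heavily relative to the regularizer.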
3.3.2 Multi-Class Classification
For multi-class classification (e.g., multiple COPD
stages), one-vs-rest or one-vs-one SVM approaches
are used. Each classifier is trained to separate one
class from the rest, and the final decision is based on
the highest confidence score.
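The one-vs-rest decision rule can be sketched as below: each class has its own binary classifier, and the class with the highest decision value wins. The linear classifiers, their weights, and the three class names are hypothetical placeholders rather than trained models.

```python
def one_vs_rest_predict(classifiers, x):
    """Pick the class whose binary classifier gives the highest decision value."""
    scores = {label: sum(wi * xi for wi, xi in zip(w, x)) + b
              for label, (w, b) in classifiers.items()}
    return max(scores, key=scores.get), scores

# Hypothetical linear classifiers (weights, bias), one per severity class.
classifiers = {
    "mild": ([1.0, -1.0], 0.0),
    "moderate": ([0.5, 0.5], -0.2),
    "severe": ([-1.0, 1.0], 0.1),
}
predicted, scores = one_vs_rest_predict(classifiers, [0.2, 0.9])
```

One-vs-one would instead train a classifier for every pair of classes and decide by voting, which needs more classifiers but trains each on a smaller subset.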
3.4 Explainable AI Integration
To ensure interpretability, the proposed
methodology incorporates explainable AI techniques
that highlight the lung regions contributing most to
the model’s predictions.
3.4.1 Grad-CAM (Gradient-weighted Class
Activation Mapping)
Grad-CAM generates heatmaps that visualize the
regions of interest in the input image. It computes
the importance weights for feature maps based on
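the gradients of the output class score $y^c$ with
respect to the feature map $A^k$:
$\alpha_{k}^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A_{i,j}^{k}}$ (8)
where:
$\alpha_{k}^{c}$ indicates the importance weight for feature map $k$,
$Z$ indicates the total number of pixels in the feature map,
$A_{i,j}^{k}$ indicates the activation at position $(i, j)$ in feature map $k$.
The heatmap is generated as:
$L_{\mathrm{Grad\text{-}CAM}}^{c} = \mathrm{ReLU}\left(\sum_{k} \alpha_{k}^{c} A^{k}\right)$ (9)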
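The two Grad-CAM steps in Eqs. (8) and (9) can be sketched on small arrays as follows. The activations and gradients are hypothetical numbers; in practice both would come from a forward and backward pass through the CNN, which is not shown here.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap from K feature maps and their gradients.

    activations[k][i][j] is the activation of map k at (i, j);
    gradients[k][i][j] is the gradient of the class score with respect to it.
    """
    K = len(activations)
    H, W = len(activations[0]), len(activations[0][0])
    Z = H * W
    # Eq. (8): average the gradients over all pixels to get one weight per feature map.
    weights = [sum(sum(row) for row in gradients[k]) / Z for k in range(K)]
    # Eq. (9): weighted sum of the feature maps followed by ReLU.
    heatmap = [[max(0.0, sum(weights[k] * activations[k][i][j] for k in range(K)))
                for j in range(W)]
               for i in range(H)]
    return weights, heatmap

# Hypothetical 2 x 2 activations and gradients for K = 2 feature maps.
A = [[[1.0, 0.0], [0.0, 2.0]],
     [[0.0, 3.0], [1.0, 0.0]]]
dY_dA = [[[0.4, 0.4], [0.4, 0.4]],
         [[-0.2, -0.2], [-0.2, -0.2]]]
weights, heatmap = grad_cam(A, dY_dA)
```

The ReLU keeps only locations that push the class score up, so the second map, whose weight is negative, contributes nothing positive to the heatmap. The coarse heatmap is usually upsampled to the input resolution before being overlaid on the CT slice.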
3.4.2 SHAP (SHapley Additive
exPlanations)
SHAP assigns importance values to each feature,
quantifying its contribution to the model’s
prediction. The SHAP value $\phi_i$ for feature $i$ is
computed as:
$\phi_{i} = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|!\,(|N| - |S| - 1)!}{|N|!} \left[f(S \cup \{i\}) - f(S)\right]$ (10)
where:
$S$ indicates a subset of features excluding $i$,
$N$ indicates the total set of features,
$f(S)$ indicates the model output for feature subset $S$.
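For a very small feature set, the Shapley value in Eq. (10) can be computed exactly by enumerating every subset, as in the sketch below. The three feature names and the toy scoring function are hypothetical stand-ins for a real model; practical SHAP tools rely on approximations because exact enumeration grows exponentially with the number of features.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value_fn):
    """Exact Shapley values per Eq. (10); cost grows as 2^|N|, so only for tiny feature sets."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for size in range(n):
            # Weight |S|! (|N| - |S| - 1)! / |N|! for subsets of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                total += weight * (value_fn(s | {i}) - value_fn(s))
        phi[i] = total
    return phi

def toy_model(subset):
    """Hypothetical model output for a subset of present features."""
    score = 0.0
    if "emphysema" in subset:
        score += 0.5
    if "airway" in subset:
        score += 0.3
    if "emphysema" in subset and "airway" in subset:
        score += 0.1  # interaction term, shared equally by the two features
    return score

phi = shapley_values(["emphysema", "airway", "texture"], toy_model)
```

The values sum to the difference between the model output with all features and with none, which is the additivity property that gives SHAP its name.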
3.5 Data Preparation
The proposed methodology is applied to the LIDC-IDRI (Lung Image Database Consortium and Image Database Resource Initiative) dataset. The dataset contains high-resolution CT scans with annotations of lung abnormalities, including emphysema and airway thickening. Data preparation involves:
1. Splitting the dataset into training, validation, and test sets.
2. Augmenting the training data using transformations (e.g., rotation, flipping) to improve model generalization.
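The two preparation steps can be sketched as below. The 70/15/15 split ratio, the fixed seed, and the scan identifiers are assumptions for illustration, since the paper does not state them; the flip and rotation operate on tiny nested lists standing in for image slices.

```python
import random

def split_dataset(items, train=0.7, val=0.15, seed=0):
    """Shuffle and split items into training, validation, and test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def horizontal_flip(image):
    """Mirror a 2D image (nested lists) left to right."""
    return [row[::-1] for row in image]

def rotate_90(image):
    """Rotate a 2D image 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

scan_ids = list(range(100))  # hypothetical scan identifiers
train_ids, val_ids, test_ids = split_dataset(scan_ids)
augmented = [horizontal_flip([[1, 2], [3, 4]]), rotate_90([[1, 2], [3, 4]])]
```

Splitting by scan or patient identifier, rather than by individual slice, helps keep slices from the same person out of both the training and test sets. Augmentation is applied to the training portion only.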
4 EXPERIMENTAL ANALYSIS
4.1 Dataset Description
The proposed methodology is demonstrated on the LIDC-IDRI (Lung Image Database Consortium and Image Database Resource Initiative) dataset. The high-resolution CT scans in this dataset have been annotated by radiologists for lung abnormalities such as emphysema and airway changes indicative of COPD. The key characteristics of the dataset are:
Number of Images: Over 1,000 CT scans.
Resolution: High-resolution DICOM images with varying slice thickness.
Annotations: Labels for lung abnormalities (e.g., nodules, emphysema).
Figure 2: Sample images for LIDC-IDRI dataset.
To ensure balanced class representation, the dataset
is preprocessed to include a proportional number of
samples from each COPD severity stage: mild,
moderate, severe, and very severe.
Figure 2 shows sample images from the LIDC-IDRI dataset used for training and validating the proposed COPD diagnosis model. These are high-resolution CT images of normal lung tissue, lung emphysema, and airway abnormalities. As the figure shows, the changes in lung texture and density that characterize the clinical manifestations can be subtle, which hampers identifying and classifying COPD severity from visual characteristics alone. The dataset thus covers a diversity of lung pathologies, which supports robust model generalization.
4.2 Data Preprocessing
Lung Segmentation: U-Net is used to segment the lung regions and remove irrelevant background.
Normalization: Pixel values are normalized to the range [0, 1].
Data Augmentation: The training set is augmented with random rotations, flips and brightness changes.
4.3 Implementation Details
Deep Feature Extraction: A pre-trained ResNet50 model is fine-tuned on the LIDC-IDRI dataset. A 2,048-dimensional feature vector is extracted from the penultimate layer of the network (the layer immediately before the output layer).
Classifier: The classifier is implemented using support vector machines (SVMs) with a radial basis function (RBF) kernel.
Explainability: Grad-CAM and SHAP are used to interpret model predictions.
4.4 Experimental Setup
Hardware: The experiments were run on an NVIDIA RTX 3090 GPU, 128 GB RAM, and an AMD Ryzen 9 CPU.
Software: Common Python libraries such as TensorFlow, Keras, Scikit-learn, and PyTorch are used for implementation.
4.5 Cross-Validation
A k-fold cross-validation strategy is used to avoid a misleading evaluation:
k = 5: The dataset is split into 5 folds; in each iteration, 4 folds are used for training and 1 is used for validation.
Performance metrics are reported as averages across the folds.
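The fold rotation and averaging described above can be sketched with plain index lists. The sample count of 10 and the stand-in evaluator are hypothetical; a real run would train the CNN and SVM pipeline on each training split and score it on the matching validation split.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, val_indices) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, val
        start += size

def cross_validate(n_samples, evaluate, k=5):
    """Average a metric returned by evaluate(train, val) across the k folds."""
    scores = [evaluate(train, val) for train, val in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

# Hypothetical evaluator: returns the validation fraction as a stand-in for a real metric.
mean_score = cross_validate(10, lambda train, val: len(val) / (len(train) + len(val)))
folds = list(k_fold_indices(10, k=5))
```

Each sample appears in exactly one validation fold, so every scan contributes once to the averaged metric. Shuffling or stratifying by severity class before forming the folds is a common refinement.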
4.6 Evaluation Metrics
To assess the performance of the proposed
methodology, the following evaluation metrics are
used:
4.6.1 Accuracy
Accuracy measures the overall correctness of the
model by evaluating the proportion of correctly
classified samples:
$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (11)
where:
TP indicates true positives,
TN indicates true negatives,
FP indicates false positives,
FN indicates false negatives.
Table 1: Accuracy Comparison of Different COPD Classification Methods.
| Methods | Accuracy values (%) |
| --- | --- |
| Handcrafted features + classical ML [5] | 85.2 |
| End to End DL [6] | 89.3 |
| 3D CNN [7] | 91.7 |
| Transfer learning [8] | 93.1 |
| Radiomics [9] | 92.5 |
| Proposed methodology | 95.4 |
Figure 3 and Table 1 compare the accuracy of the different methodologies used for COPD classification. Accuracy is the proportion of correctly classified cases relative to the total number of cases analyzed. The results show that the proposed hybrid approach, which integrates deep feature extraction with classical machine learning, achieves the highest accuracy (95.4%) compared with standalone CNNs, handcrafted feature-based classifiers, and transfer learning models (85.2% to 93.1%). This improvement indicates the benefit of integrating deep learning and classical ML techniques for more reliable COPD diagnosis.
Figure 3: Accuracy analysis.
4.6.2 Precision
Precision quantifies the ability of the model to avoid
false positives and is defined as:
$\mathrm{Precision} = \frac{TP}{TP + FP}$ (12)
A high precision value indicates that the model is
reliable in making positive predictions.
Table 2: Precision Analysis for COPD Classification Models.
| Methods | Precision values (%) |
| --- | --- |
| Handcrafted features + classical ML [5] | 82.5 |
| End to End DL [6] | 87.1 |
| 3D CNN [7] | 89.8 |
| Transfer learning [8] | 91.5 |
| Radiomics [9] | 90.9 |
| Proposed methodology | 94.2 |
Figure 4: Precision analysis.
Figure 4 and Table 2 show the precision values obtained by the different classification models in diagnosing COPD. Precision is the ratio of true positive predictions to all positive predictions, i.e., it measures how well the system avoids false positives. As the graph shows, the proposed methodology identifies COPD cases with higher precision and lower misclassification rates than the traditional approaches. For clinical use, minimizing false positives is essential for effective patient care.
4.6.3 Recall (Sensitivity)
Recall measures how well a method identifies the positive instances in the dataset.
$\mathrm{Recall} = \frac{TP}{TP + FN}$ (13)
Higher recall indicates the model is effective at
identifying COPD cases.
[Figure 3 chart: y-axis "Accuracy values (%)", 80% to 98%]
[Figure 4 chart: y-axis "Precision values (%)", 75% to 100%]
Table 3: Recall (Sensitivity) Comparison of COPD Diagnosis Approaches.
| Methods | Recall values (%) |
| --- | --- |
| Handcrafted features + classical ML [5] | 80.3 |
| End to End DL [6] | 85.6 |
| 3D CNN [7] | 88.9 |
| Transfer learning [8] | 90.7 |
| Radiomics [9] | 91.2 |
| Proposed methodology | 94.8 |
Figure 5: Recall analysis.
Figure 5 and Table 3 compare the recall (sensitivity) values of the different classification methods. Recall measures the ability of the model to identify the actual COPD cases among all positive cases in the dataset. As the figure shows, the hybrid model performs best in terms of recall, which indicates that it detects most of the true COPD cases (including severe and very severe ones) and produces few false negatives. The high recall value means that fewer COPD cases are missed, so the model is appropriate for early disease detection and for assessing severity.
4.6.4 F1-Score
The F1-score is the harmonic mean of precision and
recall, balancing both metrics:
$\mathrm{F1\text{-}Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$ (14)
F1-score is particularly useful for imbalanced
datasets, as it provides a single measure of the
model’s performance.
Table 4: F1-Score Analysis for Various COPD Detection Techniques.
| Methods | F1-score values (%) |
| --- | --- |
| Handcrafted features + classical ML [5] | 81.4 |
| End to End DL [6] | 86.3 |
| 3D CNN [7] | 89.3 |
| Transfer learning [8] | 91.1 |
| Radiomics [9] | 91.0 |
| Proposed methodology | 94.5 |
Figure 6: F1-Score analysis.
Figure 6 and Table 4 compare the different COPD classification models by F1-score. The F1-score is the harmonic mean of precision and recall, so it gives a balanced measure of model performance, which is especially useful for imbalanced datasets. The figure shows that the proposed approach attains the highest F1-score (94.5%), indicating that it is better at detecting true COPD cases while also reducing false positives. With this balanced performance, the proposed approach provides a robust solution for real-world medical applications where sensitivity and specificity are equally important.
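As a quick arithmetic check, the F1-scores in Table 4 can be recomputed as the harmonic mean of the precision and recall values reported in Tables 2 and 3. The sketch below does this; the method labels are shortened, and the comparison is only as exact as the one-decimal rounding of the reported figures.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)

# (precision %, recall %, reported F1 %) as listed in Tables 2, 3, and 4.
reported = {
    "Handcrafted features + classical ML": (82.5, 80.3, 81.4),
    "End to End DL": (87.1, 85.6, 86.3),
    "3D CNN": (89.8, 88.9, 89.3),
    "Transfer learning": (91.5, 90.7, 91.1),
    "Radiomics": (90.9, 91.2, 91.0),
    "Proposed methodology": (94.2, 94.8, 94.5),
}
recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}
for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed {recomputed[name]:.1f}, reported {f1:.1f}")
```

For the proposed methodology this gives 2 x 94.2 x 94.8 / (94.2 + 94.8), which rounds to 94.5 and agrees with Table 4.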
4.6.5 Specificity
Specificity measures the ability of the model to
correctly identify true negatives:
$\mathrm{Specificity} = \frac{TN}{TN + FP}$ (15)
This metric is critical in medical applications to
ensure healthy patients are not misdiagnosed.
[Figure 5 chart: y-axis "Recall values (%)", 70% to 100%]
[Figure 6 chart: y-axis "F1-score values (%)", 70% to 100%]
Table 5: Specificity Evaluation of COPD Classification Models.
| Methods | Specificity (%) |
| --- | --- |
| Handcrafted features + classical ML [5] | 83.1 |
| End to End DL [6] | 86.9 |
| 3D CNN [7] | 90.1 |
| Transfer learning [8] | 92.4 |
| Radiomics [9] | 91.7 |
| Proposed methodology | 95.0 |
Figure 7: Specificity analysis.
Figure 7 and Table 5 show the specificity values obtained by the different classification techniques for COPD diagnosis. Specificity measures how well the model recognizes non-COPD cases and avoids false positives. As the figure illustrates, the hybrid method is the most specific (95.0%) in distinguishing healthy individuals from COPD patients, which makes it reliable. In clinical practice this characteristic is important because it prevents non-COPD patients from being misdiagnosed or exposed to unnecessary therapies.
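The five metrics used in this section can all be derived from the same four confusion counts, as the sketch below shows. The counts are hypothetical and are not taken from the experiments; they treat the task as a binary COPD versus non-COPD decision, whereas a multi-class evaluation would compute them per class and average.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, F1-score, and specificity from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
    }

# Hypothetical counts for a binary COPD versus non-COPD split (not from the paper).
metrics = classification_metrics(tp=90, tn=85, fp=15, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and specificity both fall as false positives rise, while recall depends only on false negatives, which is why the section reports them separately.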
5 RESULTS AND OBSERVATIONS
The hybrid approach consistently outperforms standalone CNNs and classical ML with handcrafted features. This indicates that combining deep features with an SVM yields higher precision and recall, especially in predicting severe and very severe COPD. Grad-CAM heatmaps show that the model attends to lung regions affected by emphysema and to areas of airway thickening. SHAP values measure the impact of each feature, in this case structural abnormalities, on the classification. The hybrid approach is also computationally efficient, since feature extraction is done by the CNN and classification is performed by the SVM.
6 CONCLUSIONS
This paper proposes a hybrid deep learning and machine learning framework for CT-based automated diagnosis and severity assessment of chronic obstructive pulmonary disease (COPD). Combining a traditional machine learning classifier (SVM) with deep feature extraction using pre-trained Convolutional Neural Networks (CNNs) increases the accuracy and reliability of the diagnosis. Experimental results show that the proposed hybrid approach improves accuracy, precision, recall, and specificity over deep learning techniques and classical machine learning techniques used independently. SHAP and Grad-CAM are used to ensure interpretability and transparency. These techniques make the framework more applicable to clinical decision-making because they let clinicians see the critical lung regions where emphysema and airway disease occur.
Future work may improve the generalizability of the method by merging additional data sources, such as pulmonary function tests and clinical parameters, by increasing the dataset size, and by including more diverse populations. Clinical use of the framework and its effect on patient management would also require further clinical validation. The methodology developed in this paper advances the use of computational techniques in medical imaging and offers a scalable solution for the diagnosis of COPD.
REFERENCES
L. Ramachandran, S.P. Mangaiyarkarasi, A. Subramanian,
S. Senthilkumar, “Shrimp Classification for White
Spot Syndrome Detection Through Enhanced Gated
Recurrent Unit-based wild Geese Migration
Optimization Algorithm”, Virus Genes, Vol. 60, No. 2,
pp. 134-147, 2024. DOI: https://doi.org/10.1007/s11262-023-02049-0.
S. Senthilkumar, S. Vetriselvi, K. Kalaivani, P.
Arunkumar, M. Malathi, S. Praveen Kumar, “Super-
Resolution Image using Enhanced Deep Residual
Networks and the DIV2K Dataset”, Proceedings of the
Second International Conference on Self Sustainable
Artificial Intelligence Systems (ICSSAS-2024),
October 23 & 24, 2024. DOI: 10.1109/ICSSAS64001.2024.10760961.
Manoharan, S., 2020. Improved version of graph-cut
algorithm for CT images of lung cancer with clinical
property condition. Journal of Artificial Intelligence, 2
(04), pp.201-206.
Immanuel D, J. and Leo E, S.A., 2024. An Intelligent
Heart Disease Prediction by Machine Learning Using
Optimization Algorithm. Journal of Information
Technology Management, 16(1), pp.167-181.
Kumar, S., Bhagat, V., Sahu, P., Chaube, M.K., Behera,
A.K., Guizani, M., Gravina, R., Di Dio, M., Fortino,
G., Curry, E. and Alsamhi, S.H., 2024. A novel
multimodal framework for early diagnosis and
classification of COPD based on CT scan images and
multivariate pulmonary respiratory diseases. Computer
Methods and Programs in Biomedicine, 243,
p.107911.
Deng, X., Li, W., Yang, Y., Wang, S., Zeng, N., Xu, J.,
Hassan, H., Chen, Z., Liu, Y., Miao, X. and Guo, Y.,
2024. COPD stage detection: leveraging the auto-
metric graph neural network with inspiratory and
expiratory chest CT images. Medical & Biological
Engineering & Computing, pp.1-17.
Bozkurt, F., 2022. A deep and handcrafted features‐based
framework for diagnosis of COVID‐19 from chest
x‐ray images. Concurrency and Computation:
Practice and Experience, 34(5), p.e6725.
Si, K., Xue, Y., Yu, X., Zhu, X., Li, Q., Gong, W., Liang,
T. and Duan, S., 2021. Fully end-to-end deep-learning-
based diagnosis of pancreatic tumors. Theranostics,
11(4), p.1982.
Gan, W., Wang, H., Gu, H., Duan, Y., Shao, Y., Chen, H.,
Feng, A., Huang, Y., Fu, X., Ying, Y. and Quan, H.,
2021. Automatic segmentation of lung tumors on CT
images based on a 2D & 3D hybrid convolutional
neural network. The British Journal of
Radiology, 94(1126), p.20210038.
Hossain, M.B., Iqbal, S.H.S., Islam, M.M., Akhtar, M.N.
and Sarker, I.H., 2022. Transfer learning with fine-
tuned deep CNN ResNet50 model for classifying
COVID-19 from chest X-ray images. Informatics in
Medicine Unlocked, 30, p.100916.
Sedghighadikolaei, K. and Yavuz, A.A., 2024. Privacy-
preserving and trustworthy deep learning for medical
imaging, IEEE conference, vol.1, no.1, pp.1-10.
Wang, S., Li, W., Zeng, N., Xu, J., Yang, Y., Deng, X.,
Chen, Z., Duan, W., Liu, Y., Guo, Y. and Chen, R.,
2024. Acute exacerbation prediction of COPD based
on Auto-metric graph neural network with inspiratory
and expiratory chest CT images. Heliyon, 10(7).