Combined Unsupervised and Supervised Learning for Improving

Chest X-Ray Classification

Anca Ignat

and Robert-Adrian Găină

Faculty of Computer Science, University “Alexandru Ioan Cuza” of Iași, Romania

Keywords: Chest X-Ray, Pneumonia, Deep Learning Networks, Support Vector Machines (SVM), Random Forest (RF),

Clustering Methods.

Abstract: This paper studies the problem of pneumonia classification of chest X-ray images. We first apply clustering

algorithms to eliminate contradictory images from each of the two classes (normal and pneumonia) of the

dataset. We then train different classifiers on the reduced dataset and test for improvement in performance

evaluators. For feature extraction and also for classification we use ten well-known Convolutional Neural

Networks (Resnet18, Resnet50, VGG16, VGG19, Densenet, Mobilenet, Inception, Xception, InceptionResnet

and Shufflenet). For clustering, we employed 2-means, agglomerative clustering and spectral clustering.

Besides the above-mentioned CNNs, linear SVMs (Support Vector Machines) and Random Forest (RF) were

employed for classification. The tests were performed on the Kermany dataset. Our experiments show that this

approach leads to improvement in classification results.

1 INTRODUCTION

Image processing methods allow us to perform some

operations over an image in order to extract relevant

information. One domain that benefits greatly from

these techniques is medical imaging. Medical

images make up most of the data available in

healthcare. CT, MRI and X-Ray are only some

examples of files that are used for in-depth

non-invasive exploration of internal anatomy. This

way of transmitting data about a patient is available

for medical staff to improve treatment.

The chest X-ray (CXR) is one of the most

commonly accessible radiological examinations used

in diagnosing lung diseases or respiratory symptoms.

The benefits of this type of examination that made it

one of the most popular methods used even today are:

cost-effectiveness, low radiation dose and ease of

pathology detection from the results obtained.

Pneumonia is one of the most common lung diseases and is caused by viruses, bacteria or fungi. Pneumonia causes the inflammation of one or both lungs, which fill with fluid, causing cough and difficulty in breathing. When suffering from pneumonia, the exchange rate between carbon dioxide (CO2) and oxygen (O2) at the alveolar membrane level decreases. Pneumonia is treated based on the pathogen that caused the infection. Thus, antibiotics are used for bacterial pneumonia, antiviral drugs are administered for viral pneumonia, and antifungal drugs for fungal pneumonia. Anyone can be affected by this disease, though children and people over 65 are more susceptible to contracting it and have lower chances of survival. A common approach used by radiologists to recognize a patient suffering from pneumonia is the analysis of a CXR. The main difference is the presence of white hazy regions that are not present in the lungs of a healthy person.

The presence of pneumonia in the human body is a classification problem that radiologists face daily. However, due to a high demand for examinations and the low number of experts able to give a diagnosis, there is a direct relation between the infection rate caused by pneumonia and the medical infrastructure available over a certain region (https://www.news-medical.net/health/Pneumonia-Epidemiology.aspx). Because of this,

recent studies focused on developing classification

models able to recognize pneumonia based on CXRs.

The decisions usually made by medical staff can be helped by computer-aided diagnosis (CAD) tools that can output a verdict based on the data given. A flow that a CAD system usually performs has three steps. In the first step, the CXR is pre-processed in order to prepare it for the next operations (Caseneuve et al., 2021). This implies resizing, reorienting or colour correcting operations in order to standardize images from multiple sources.

Ignat, A. and Găină, R. Combined Unsupervised and Supervised Learning for Improving Chest X-Ray Classification. DOI: 10.5220/0011793000003417. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 4: VISAPP, pages 475-482. ISBN: 978-989-758-634-7; ISSN: 2184-4321. © 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)

In the second step, features are extracted from the

pre-processed image. The last step consists of

analysing the features extracted in order to output a

final result. Some relevant surveys that summarize

recent progress in CXR image analysis are in (Khan,

Zaki, Ali, 2021), (Çallı et al., 2021), (Li et al., 2020).

In these papers one finds lists of datasets with CXR

images, methods used for solving different types of

problems that are CXR related and results obtained

with several performance evaluators.

(Ayan and Ünver, 2019) use transfer learning and fine-tuning on Xception and VGG16 for detecting pneumonia. The tests are computed on the Kermany dataset (Kermany, Zhang, Goldbaum, 2018). In (Sharma, Jain, Bansal, 2020) different variations of a CNN architecture are presented with the use of neuron dropout. All the models with dropout outperform the normal model, but the quality of the results depends on the dropout probability. The experiments are performed on the Kermany dataset with training and test sets different from those used in this article. (Hammoudi et al., 2021) compare VGG, Densenet and Inception Resnet tailored for the Kermany database in order to improve the results. After training, the model is also tested with a dataset containing CXRs of adults, which also contains the medical history of each person. It is shown that transfer learning can be efficient even if the data sources are differently collected. In (Mabrouk et al., 2022) the authors combine the results computed with three neural networks to improve the pneumonia classification results. They use the Kermany dataset for their computations. In (Kundu et al., 2021) the authors also use ensemble methods with three neural networks and test their method on the Kermany and RSNA datasets. The experiments are performed with a 5-fold cross-validation procedure. (Chouhan et al., 2020) propose five models to classify the same dataset that is discussed in this paper; after analysing the results, a meta-model that combines the previous ones achieves state-of-the-art performance for the problem proposed, on the Kermany

dataset. (Zhang et al., 2021) present an analysis of the performance of different CNN architectures. They pre-process the dataset and develop a smaller VGG-like architecture for solving the pneumonia detection problem. They use the Kermany dataset with a random split: 70% for training, 10% for validation and 20% for testing.

The main novelty of our work is the use of clustering methods to improve the training set and thus obtain better classification results.

The article has five sections. The first presents the problem and the related work. The second section presents the Kermany dataset employed in the experiments. The clustering methods, the feature extraction process, the classification procedure and the methodology of combining them are described in Section 3. Section 4 is dedicated to the results of our tests. The final section presents conclusions and future directions of research.

2 DATASET

We used in our tests the Kermany dataset (Kermany, Zhang, Goldbaum, 2018). This dataset was downloaded from Kaggle (https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia). This is a data collection of

grayscale chest X-ray images. The images are in

JPEG format and their sizes vary.

The dataset is divided into three folders: test, val and

train. In each of these folders there are two

directories, labelled “normal” and “pneumonia”. We

merged the train and val folders thus obtaining a

training set with 1349 chest X-ray images of normal

lungs and 3883 images for lungs affected by

pneumonia. In the test set there are 234 images

labelled “normal” and 390 with “pneumonia”. We

did not apply any pre-processing enhancement

method.

In Fig. 1 are two samples of images from this

dataset, the first is a chest X-ray for normal lungs and

the second is for lungs with pneumonia.

3 METHODS

The idea behind our computations is based on the

observation that on the same medical image different

physicians can establish different diagnostics. When

working with medical image datasets, each image

has a single label, and it is not always known how this label

was established.

VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications


Normal lung chest X-ray image

Pneumonia affected lung chest X-ray image

Figure 1: Samples of images from the Kermany dataset.

In order to increase the coherence of the dataset,

we intend to apply clustering methods on the training

set of images, without taking into account the original

labels of these images. We hope that the clustering

algorithms will grasp hidden patterns in the CXR

images, patterns that better define the differences

between the healthy lungs and the pneumonia

affected lungs.

The clustering methods are applied to divide the training set into two clusters. Then, we retain in a given class only the images that the clustering algorithm also placed in the cluster corresponding to that class. Thus, one obtains a reduced training set, presumably more consistent than the original one. We consider that the new content of the two classes is formed by images which are better representatives of the modelled reality.
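The selection step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_consistent` and the rule for matching clusters to classes (the cluster holding the most normal images is treated as the normal cluster) are assumptions, since the text does not state how clusters are matched to classes.

```python
def select_consistent(labels, clusters, normal="normal"):
    """Return indices of images whose cluster agrees with their original label.

    Assumption (not stated in the paper): the cluster that contains the most
    `normal` images is treated as the normal cluster, the other one as the
    pneumonia cluster.
    """
    cluster_ids = sorted(set(clusters))
    if len(cluster_ids) != 2:
        raise ValueError("expected exactly two clusters")

    # Count how many normal images fall into each cluster.
    normal_counts = {c: 0 for c in cluster_ids}
    for lab, c in zip(labels, clusters):
        if lab == normal:
            normal_counts[c] += 1
    normal_cluster = max(cluster_ids, key=lambda c: normal_counts[c])

    # Keep an image only when its label and its cluster tell the same story.
    keep = []
    for i, (lab, c) in enumerate(zip(labels, clusters)):
        if (lab == normal) == (c == normal_cluster):
            keep.append(i)
    return keep


labels = ["normal", "normal", "pneumonia", "pneumonia", "pneumonia"]
clusters = [0, 0, 0, 1, 1]
kept = select_consistent(labels, clusters)
print(kept)
```

In this toy input the third image is labelled pneumonia but falls in the normal cluster, so it is dropped from the reduced training set.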

In this work we tested three clustering techniques.

The first is k-means (in our case, 2-means) with two

distances, Euclidean and Manhattan (Tan, Steinbach,

and Kumar, 2019). The second algorithm is

agglomerative clustering, with four types of cluster

distances (Landau et al., 2011). The third technique is

spectral clustering (Wierzchoń and Kłopotek, 2018).

The k-means algorithm is an iterative method that

starts with k initial points and then it aggregates the

data around these points. One computes the centroids

of these clusters and computes a new cluster

distribution. The process iterates until a certain

cohesion criterion is fulfilled or a maximal number of

iterations is reached.
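The iteration described above can be sketched in a few lines. This is an illustrative version under stated assumptions: the stopping rule used here is "assignments no longer change", the centroid is the arithmetic mean for both distances, and the initial points are passed in explicitly; none of these choices is specified in the text.

```python
def distance(p, q, metric="euclidean"):
    """Distance between two feature vectors."""
    if metric == "euclidean":
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    if metric == "manhattan":
        return sum(abs(a - b) for a, b in zip(p, q))
    raise ValueError("unknown metric: " + metric)


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def k_means(points, init, metric="euclidean", max_iter=100):
    """Cluster `points` around the initial centres `init`.

    Stops when assignments stop changing (assumed cohesion criterion)
    or after `max_iter` iterations.
    """
    centers = [list(c) for c in init]
    assign = None
    for _ in range(max_iter):
        # Assign every point to its closest centre.
        new_assign = [
            min(range(len(centers)), key=lambda j: distance(p, centers[j], metric))
            for p in points
        ]
        if new_assign == assign:
            break
        assign = new_assign
        # Recompute each centre from its current members.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = centroid(members)
    return assign, centers


points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
assign, centers = k_means(points, init=[points[0], points[2]], metric="manhattan")
print(assign, centers)
```

With k = 2 this corresponds to the 2-means setting used here; switching `metric` between `"euclidean"` and `"manhattan"` gives the two variants.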

The agglomerative clustering starts with a number of clusters equal to the number of points in the data. Then it merges the two clusters that are at a minimum distance. At each step one has at least one cluster fewer than at the previous step. The process stops when the required number of clusters is reached. The content of the resulting clusters depends on the chosen distance between clusters. We used four distances. For the first, the distance between clusters is given by the closest points of the two clusters. This method is called single linkage. The second distance is defined by the furthest points (complete linkage), and the third computes the distance by averaging all the distances between points from different clusters (average linkage). The fourth method of computing a distance measure between clusters is called Ward distance and depends on the centroids of the two clusters and the number of elements in each cluster.
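The four cluster distances can be written down directly. The sketch below uses the Euclidean distance between points, and for Ward it uses one common form, the increase in within-cluster sum of squares caused by merging; the text only says that Ward depends on the centroids and the cluster sizes, so the exact formula is an assumption.

```python
def euclid(p, q):
    """Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def single_linkage(a, b):
    # Distance between the closest pair of points, one from each cluster.
    return min(euclid(p, q) for p in a for q in b)


def complete_linkage(a, b):
    # Distance between the furthest pair of points, one from each cluster.
    return max(euclid(p, q) for p in a for q in b)


def average_linkage(a, b):
    # Mean of all pairwise distances between the two clusters.
    return sum(euclid(p, q) for p in a for q in b) / (len(a) * len(b))


def ward_distance(a, b):
    # Assumed form: increase in within-cluster sum of squares after merging.
    na, nb = len(a), len(b)
    return (na * nb / (na + nb)) * euclid(centroid(a), centroid(b)) ** 2


a = [[0.0, 0.0], [0.0, 2.0]]
b = [[3.0, 0.0], [3.0, 2.0]]
print(single_linkage(a, b), complete_linkage(a, b),
      average_linkage(a, b), ward_distance(a, b))
```

Single linkage only looks at the closest pair, which is why it tends to chain points together and can leave one cluster almost empty.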

The spectral clustering algorithm is a technique

borrowed from graph theory, where the nodes are

grouped in separate sets by computing eigenvalues

and eigenvectors of the Laplacian matrix associated

to the graph and keeping a reduced number of the eigenvalues and eigenvectors. In our particular situation, the nodes of the graph are the CXR images and the edges are defined by the similarity matrix (or distance matrix) between the images in the training set. This matrix is considered a generalized adjacency matrix for the graph.
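For reference, the quantities involved can be written as follows. The notation is ours, not the paper's: W is the n x n similarity matrix between the n training images and D is the diagonal degree matrix. The unnormalized Laplacian is shown as an assumption, since the text does not say which Laplacian variant was used.

```latex
\[
D_{ii} = \sum_{j=1}^{n} W_{ij}, \qquad
L = D - W, \qquad
L\,v_k = \lambda_k v_k, \quad 0 = \lambda_1 \le \lambda_2 \le \dots \le \lambda_n .
\]
```

For a split into two clusters, a common choice is to keep only the eigenvectors of the smallest eigenvalues and to group the images by running k-means on the rows of the matrix formed by these eigenvectors.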

We tested all these clustering algorithms on our

dataset, and in all computations the single linkage

algorithm performed poorly, one of the clusters

having very few or no elements. Thus, we do not

provide in this paper results for this clustering

technique.

For applying clustering methods one needs

features for image characterization. For this purpose,

we first trained several well-known neural networks

on the given dataset. We used two methods from the

Resnet family, Resnet18 and Resnet50 (He, Zhang,

Ren, Sun, 2016), two VGG networks, VGG16 and

VGG19 (Simonyan, Zisserman, 2014), three

Inception type methods, Inception v3, Xception, and InceptionResnet v2 (Szegedy et al., 2016), (Chollet, 2017), Densenet201 (Huang et al., 2017), Mobilenet v2 (Sandler et al., 2018) and Shufflenet (Zhang, Zhou, Lin, Sun, 2018). If one passes an image through the network and stops the network’s evolution before the classification step, one obtains feature vectors for that image. Thus, we obtain ten sets of feature vectors, and for each set we performed the classification process.

For clustering purposes, we used for feature

extraction only five of the above-mentioned networks:

Resnet18, Resnet50, VGG16, InceptionResnet v2 and

Densenet201.


Besides the five networks mentioned above, we

also trained the following networks: VGG19,

Mobilenet, Inception v3, Xception and Shufflenet.

For all these networks we performed data

augmentation before training them and a 70%-30%

split for the validation process. For augmentation,

translations and scaling were applied. We used a

standard set of values for the usual parameters (batch

size, learning rate). All the networks were trained

under the same conditions. Our interest was to obtain

good classification results with these networks, but

our main goal was to test if the selection process

performed on the dataset using clustering methods

can improve the results obtained on the original

dataset. The majority of tests were performed by

using these ten networks for feature extraction and

then employing linear SVM and Random Forest for

classification. We also computed classification

results using Resnet18 trained on some of the reduced

training sets. We monitored whether the performance evaluators improved.

As performance evaluators we used the accuracy (the percentage of correctly identified chest X-ray images out of the total number of test images) and the Area Under the ROC (Receiver Operating Characteristic) Curve, abbreviated AUC.
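Both evaluators are easy to compute by hand. The sketch below is illustrative only: accuracy follows the definition given above, while the AUC is computed with the rank-based formulation (the probability that a randomly chosen positive image receives a higher score than a randomly chosen negative one). The function names and the toy scores are hypothetical, and the text does not say how the AUC values were actually computed.

```python
def accuracy(y_true, y_pred):
    """Percentage of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return 100.0 * correct / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Uses the pairwise formulation: fraction of positive/negative pairs in
    which the positive has the higher score, with ties counted as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]          # toy labels: 1 = pneumonia, 0 = normal
scores = [0.1, 0.4, 0.35, 0.8]  # toy classifier scores
y_pred = [0, 0, 0, 1]           # toy hard decisions

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(acc, auc)
```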

4 RESULTS

The experiments were performed in MATLAB 2020 on a laptop with an Intel(R) Core(TM) i7-8565U CPU and an NVIDIA GeForce RTX 3060 GPU.

All the networks were trained for 60 epochs, using the Stochastic Gradient Descent optimizer, mini-batches of size 16, and the cross-entropy loss function.

We adopted the following abbreviations for the

clustering methods: E for 2-means with Euclidean

distance, M – 2-means with Manhattan distance, C –

complete linkage, A – average linkage, W – Ward

linkage, S – spectral clustering.

All the results presented in this section are

obtained on the test set, using the ten networks for

feature extraction.

We began by computing the classification results

using the original, unselected dataset, employing the

trained neural networks, the linear SVM and Random

Forest. For the last two classification methods we

used the feature vectors produced by the same

convolutional neural network, stopped before the

classification layers. In Table 1 are the accuracies and

in Table 2 the AUC values for these methods.

As expected, the trained networks provide

better results than the SVM and RF, with Shufflenet

producing the best overall result.

Table 1: Classification results on the original dataset: accuracies (%).

Network     CNN    SVM    RF
Resnet18    94.87  85.58  89.90
Resnet50    94.07  91.35  89.74
VGG16       94.23  89.58  89.90
VGG19       96.63  88.46  90.87
Densenet    96.47  93.59  92.15
Mobilenet   95.03  92.15  90.38
Inception   94.53  93.11  91.51
Xception    95.51  90.87  89.42
IncResnet   94.87  91.99  91.83
Shufflenet  96.79  91.99  89.26

Table 2: Classification results on the original dataset: AUC values.

Network     CNN     SVM     RF
Resnet18    0.9367  0.8628  0.8654
Resnet50    0.9243  0.8855  0.8632
VGG16       0.9256  0.8628  0.8662
VGG19       0.9576  0.8470  0.8791
Densenet    0.9572  0.9171  0.8970
Mobilenet   0.9380  0.8962  0.8726
Inception   0.9551  0.9107  0.8876
Xception    0.9444  0.8808  0.8624
IncResnet   0.9367  0.8966  0.8936
Shufflenet  0.9623  0.8940  0.8585

After performing the selection process described

in the previous section, we were interested to see how

many images in each class were selected. The new

dimensions for the new training set classes are shown

in Table 3. The first number in the cell is the number

of normal chest X-ray images selected and the second

is those with pneumonia. Note that all the clustering

methods and networks select the great majority of the

normal images in the first class. The complete and

average linkage have difficulties in distinguishing

between normal and pneumonia affected lungs. The

VGG19 network has a good selection capability,

regardless of the clustering method. We consider a

good selection one that produces a pneumonia class

that has at least 2500 images.

We performed classification tests using all the

clustering methods and all the 10 types of deep

learning generated features, with both SVM and RF.

Table 4 and Table 5 show the best classification results obtained and the average accuracies for the clustering methods we tested. For the best results we show the accuracy value, the network that produced the feature vectors for classification and the network that produced the feature vectors for the selection process. The average value is over the accuracies produced with the five networks employed in our computations for reducing the training set, for each clustering method. Table 4 shows the results obtained using the linear SVM classifier and Table 5 presents the results for RF.

Table 3: New sizes of the dataset after the unsupervised selection process (normal/pneumonia).

E M C A W S
Resnet18 1349/2816 1349/3259 1349/739 1349/2440 1307/146
Resnet50 1346/2431 1342/3338 1345/ 1348/ 1345/2176 1244/293
VGG19 1347/3796 1347/3765 1342/3864 1338/3869 1342/3870 1007/973
IncResnet 1341/3597 1342/3709 1349/ 1348/
Densenet 1347/3550 1347/3673 1348/1214 1349/ 1334/3869 1246/366

Table 4: SVM results for different clustering techniques.

Clustering  Best accuracy  Feat net                     Selection           Average
E           96.79%         VGG19, Densenet              Resnet50, Resnet18  94.79%
M           96.31%         VGG16, Densenet, Shufflenet  Resnet18            94.34%
C           96.96%         VGG16, Densenet, Inception   Densenet            78.59%
A           93.91%         Densenet                     VGG16               57.66%
W           97.12%         VGG16, Densenet              Resnet18            90.75%
S           96.96%         Densenet                     Resnet18            90.71%

Table 5: RF results for different clustering techniques.

Clustering  Best accuracy  Feat net                       Selection                     Average
E           96.31%         Densenet                       Resnet50                      93.25%
M           95.51%         VGG19                          Resnet18                      92.43%
C           96.79%         VGG19                          Resnet18                      75.09%
A           92.63%         VGG19, Densenet                Resnet18, VGG16               53.88%
W           96.15%         Densenet                       Resnet50                      87.06%
S           96.63%         Resnet18, Mobilenet, Inception Resnet18, Densenet, Resnet50  88.58%

We then studied which type of network employed

for feature extraction provided the best classification

result for each clustering method. Table 6 shows this

dependency for the SVM classifier, and

Table 7 shows the results for RF. We use the

following abbreviations for the five networks

employed in the selection process: R18 and R50 for

Resnet18 and Resnet50, respectively, D for Densenet,

IR for InceptionResnet.

Note that certain clustering methods favour

certain selection networks. For example, the 2-means

clustering and Ward agglomerative method favour

the Resnet class selection networks for both SVM and

RF. Complete linkage favours features extracted with

Densenet and average linkage prefers the VGG16

network. One notes that in these tables the

InceptionResnet selection does not appear. The

results presented in these two tables are similar but not

identical. From these tables one can deduce that a

good choice for feature extraction in the selection

process is a network from the Resnet class.

Table 6: Networks that provided best SVM accuracy results for each clustering method.

Feat net    E    M      C  A      W    S
Resnet18    R50  R18    D  Vgg16  R50  R18
Resnet50    R18  R18    D  Vgg16  R18  R50
VGG16       R50  R18    D  Vgg16  R18  R50
VGG19       R50  Vgg16  D  Vgg16  R18  R18
Densenet    R18  R18    D  Vgg16  R18  R18
Mobilenet   R18  R18    D  Vgg16  R50  R18
Inception   R50  R18    D  Vgg16  R50  D
Xception    R18  R18    D  Vgg16  R18  R18
IncResnet   R18  R18    D  Vgg16  R50  R50
Shufflenet  R18  R18    D  Vgg16  R18  R50

Table 7: Best RF accuracy results for each clustering method.

Feat net    E    M    C    A      W    S
Resnet18    R50  R18  D    Vgg16  R18  R18
Resnet50    R18  R18  D    Vgg16  R18  R50
VGG16       R50  R18  D    R18    R50  R18
VGG19       R18  R18  R18  R18    R18  R18
Densenet    R50  R18  R18  Vgg16  R50  D
Mobilenet   R50  R18  D    Vgg16  R18  D
Inception   R50  R18  D    Vgg16  R50  R50
Xception    R50  R18  R18  Vgg16  R50  R50
IncResnet   R50  R18  D    Vgg16  R18  D
Shufflenet  R50  R18  D    Vgg16  R18  D


This choice will be more evident from Table 8,

where the best accuracy results depending on the

clustering method are presented. The best

improvement in accuracy result (97.12%) was

obtained by using SVM on a dataset selected with

Resnet50 or Resnet18 features with Ward or spectral

clustering, the classification feature computed with

VGG16 or Densenet. Comparing these results with

those in Table 1, we deduce that there is an overall

significant improvement in accuracy. The same

happens when comparing the AUC values.

Table 8: Best accuracy results for each feature extraction network and the clustering methods that produced them.

Network     SVM acc.  SVM clust. met.  RF acc.  RF clust. met.
Resnet18    96.15%    M                96.63%   S
Resnet50    96.15%    W                95.99%   S
VGG16       97.12%    W, S             96.63%   C
VGG19       96.79%    E                96.96%   C
Densenet    97.12%    W                96.15%   S
Mobilenet   96.15%    E                96.47%   S
Inception   96.96%    C                95.83%   C
Xception    96.63%    E                95.83%   S
IncResnet   95.03%    E                94.71%   E
Shufflenet  96.63%    E                95.99%   S

We present in the next figures the confusion

matrices for classification results obtained on the

whole dataset and on datasets selected using different

clustering methods.

We show the confusion matrices for the classifiers

that provided the best results: Shufflenet, for the

original dataset (see Fig. 2), Densenet (Fig. 3 and 4)

and VGG16 for the selected dataset (Fig. 5). Because

in the selection process the main reduction was

performed on the class of pneumonia images, we

expect that the improvement in accuracy is due to the

decrease in false positives.

Figure 2: Confusion matrix for Shufflenet, trained on the

original dataset.

We trained Resnet18 on different datasets selected

with different networks for feature extraction and

clustering methods. We chose those selected datasets

that have enough images in the pneumonia class. The

results (accuracy and AUC) are in Table 9. In almost

all situations the results are better than those obtained

by training Resnet18 on the original, unselected

dataset. For 2-means with Manhattan distance and

features extracted with Resnet50 we obtain the best

overall result, 97.44%.

Figure 3: Confusion matrix for Densenet201, trained on the

dataset selected with Resnet18 features and Ward

clustering.

Figure 4: Confusion matrix for Densenet201, trained on the

dataset selected with Resnet50 features and Ward

clustering.

Figure 5: Confusion matrix for VGG16, trained on the

dataset selected with Resnet18 features and Ward

clustering.


Table 9: Resnet18 accuracy and AUC results obtained using training sets generated with different networks and clustering methods.

Network    E                 M                 C                 A                 W
Resnet18   95.99% / 0.962    96.79% / 0.9658   -                 -                 86.38% / 0.8868
Resnet50   91.83% / 0.9286   97.44% / 0.9709   -                 -                 91.19% / 0.9252
VGG16      96.15% / 0.9538   96.15% / 0.9530   96.15% / 0.9521   95.03% / 0.9372   96.31% / 0.9551
IncResnet  96.63% / 0.9594   96.31% / 0.956    -                 -                 96.96% / 0.9654
Densenet   95.67% / 0.9457   87.66% / 0.8987   96.15% / 0.9607   -                 95.99% / 0.956

Figure 6 shows the confusion matrix for the best result (Resnet18 trained on a training set selected with 2-means with Manhattan distance, with features produced by Resnet50). The false positives remain rather high, and this good result is due to the decrease of false negatives. We compared it with the result obtained with the same network, Resnet18, trained on the entire dataset. The confusion matrix is in Fig. 7. Note that the false negatives are the same; the improvement of 2.5% in accuracy is due to the decrease of the false positives.
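The size of this improvement can be made concrete with a short calculation. The sketch below converts the two reported accuracies (94.87% for Resnet18 trained on the original dataset, from Table 1, and 97.44% for the best selected training set) into counts of correctly classified test images. The counts are inferred by rounding the reported percentages, not read from the confusion matrices, so they are an assumption.

```python
# Test set size reported in Section 2: 234 normal + 390 pneumonia images.
test_size = 234 + 390


def correct_count(acc_percent, n):
    """Number of correctly classified images implied by an accuracy in percent."""
    return round(acc_percent / 100 * n)


baseline = correct_count(94.87, test_size)  # Resnet18, original training set
selected = correct_count(97.44, test_size)  # Resnet18, selected training set

print(test_size, baseline, selected, selected - baseline)
print(round(97.44 - 94.87, 2))  # improvement in percentage points
```

Under this reading, the gain of roughly 2.5 percentage points corresponds to 16 more test images classified correctly out of 624.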

Figure 6: Confusion matrix for Resnet18, trained on the

dataset selected with Resnet50 features and 2-means with

Manhattan distance.

Figure 7: Confusion matrix for Resnet18, trained on the

original dataset.

We compare our results with those obtained in other studies. Although there are articles that report classification results on the same dataset, in most cases the test set is not the same. The best results obtained for the same dataset, with the same training-test split as ours, are in (Mabrouk et al., 2022), with accuracy 93.91% on the test set, and in (Chouhan et al., 2020), where the reported accuracy is 96.39% with 0.9934 AUC. We could improve our results by improving the results in Table 1. These can be improved by pre-processing the images in the original dataset, using a better augmentation procedure, and training the networks with different parameters (increasing the number of epochs, for example).

5 CONCLUSIONS

In this article, we tested the influence of clustering methods on the classification of chest X-ray images. We used three clustering techniques (with different underlying distances) to reduce the training set. At the same time, these unsupervised algorithms increased the training set’s coherence by deleting inconsistent information. The feature extraction process was carried out with well-known CNNs, trained on the original dataset, with standard parameters. Linear SVMs, Random Forest and CNN methods provided the classification results. This blend of unsupervised and supervised learning produced better accuracy results on the Kermany dataset.

We intend to test this method on other datasets

and with other clustering techniques. A

pre-processing step can be added to the images in the

dataset (histogram equalization, contrast

enhancement, noise reduction) to improve the

classification results. The method we presented tends

to ignore the outliers, the non-standard CXR images.

In order to address this problem, we plan to

investigate the reformulation of this problem as a 3- or

4-class problem, by also using the images that we now

eliminate from the training set, hoping to include

the atypical cases in these new classes.

REFERENCES

Lee, J. E., Kim, T. H., Cho, K. H., Han, K. T., & Park, E.

C. (2017). The association between number of doctors

per bed and readmission of elderly patients with

pneumonia in South Korea. BMC health services

research, 17(1), 1-11.


Caseneuve, G., Valova, I., LeBlanc, N., & Thibodeau, M.

(2021). Chest X-Ray Image Preprocessing for Disease

Classification. Procedia Computer Science, 192, 658-

665.

Khan, W., Zaki, N., & Ali, L. (2021). Intelligent pneumonia

identification from chest x-rays: A systematic literature

review. IEEE Access, 9, 51747-51771.

Çallı, E., Sogancioglu, E., van Ginneken, B., van Leeuwen,

K. G., & Murphy, K. (2021). Deep learning for chest X-

ray analysis: A survey. Medical Image Analysis, 72,

102125.

Li, Y., Zhang, Z., Dai, C., Dong, Q., & Badrigilan, S.

(2020). Accuracy of deep learning for automated

detection of pneumonia using chest X-ray images: a

systematic review and meta-analysis. Computers in

Biology and Medicine, 123, 103898.

Ayan, E., & Ünver, H. M. (2019, April). Diagnosis of

pneumonia from chest X-ray images using deep

learning. In 2019 Scientific Meeting on Electrical-

Electronics & Biomedical Engineering and Computer

Science (EBBT) (pp. 1-5). IEEE.

Sharma, H., Jain, J. S., Bansal, P., & Gupta, S. (2020,

January). Feature extraction and classification of chest

x-ray images using cnn to detect pneumonia. In 2020

10th International Conference on Cloud Computing,

Data Science & Engineering (Confluence) (pp. 227-

231). IEEE.

Kermany, D., Zhang, K. G. M., & Goldbaum, M. Large

dataset of labeled optical coherence tomography (OCT)

and chest x-ray images, Mendeley data, v3 (2018).

Hammoudi, K., Benhabiles, H., Melkemi, M., Dornaika, F.,

Arganda-Carreras, I., Collard, D., & Scherpereel, A.

(2021). Deep learning on chest X-ray images to detect

and evaluate pneumonia cases at the era of COVID-19.

Journal of medical systems, 45(7), 1-10.

Mabrouk, A., Díaz Redondo, R. P., Dahou, A., Abd Elaziz,

M., & Kayed, M. (2022). Pneumonia Detection on

Chest X-ray Images Using Ensemble of Deep

Convolutional Neural Networks. Applied Sciences,

12(13), 6448.

Kundu, R., Das, R., Geem, Z. W., Han, G. T., & Sarkar, R.

(2021). Pneumonia detection in chest X-ray images

using an ensemble of deep learning models. Plos one,

16(9), e0256630.

Chouhan, V., Singh, S. K., Khamparia, A., Gupta, D.,

Tiwari, P., Moreira, C., ... & De Albuquerque, V. H. C.

(2020). A novel transfer learning based approach for

pneumonia detection in chest X-ray images. Applied

Sciences, 10(2), 559.

Zhang, D., Ren, F., Li, Y., Na, L., & Ma, Y. (2021). Pneumonia detection from chest X-ray images based on convolutional neural network. Electronics, 10(13), 1512.

Tan, P. N., Steinbach, M., & Kumar, V. (2019).

Introduction to data mining. Pearson Education India,

ed.

Landau, S., Leese, M., Stahl, D., & Everitt, B. S. (2011).

Cluster analysis. John Wiley & Sons, 5th ed.

Wierzchoń, S. T., & Kłopotek, M. A. (2018). Modern

algorithms of cluster analysis (Vol. 34). Springer

International Publishing.

He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual

learning for image recognition. In Proceedings of the

IEEE conference on computer vision and pattern

recognition (pp. 770-778).

Simonyan, K., & Zisserman, A. (2014). Very deep

convolutional networks for large-scale image

recognition. arXiv preprint arXiv:1409.1556.

Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna,

Z. (2016). Rethinking the inception architecture for

computer vision. In Proceedings of the IEEE

conference on computer vision and pattern recognition

(pp. 2818-2826).

Chollet, F. (2017). Xception: Deep learning with depthwise

separable convolutions. In Proceedings of the IEEE

conference on computer vision and pattern recognition

(pp. 1251-1258).

Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K.

Q. (2017). Densely connected convolutional networks.

In Proceedings of the IEEE conference on computer

vision and pattern recognition (pp. 4700-4708).

Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen,

L. C. (2018). Mobilenetv2: Inverted residuals and linear

bottlenecks. In Proceedings of the IEEE conference on

computer vision and pattern recognition (pp. 4510-

4520).

Zhang, X., Zhou, X., Lin, M., & Sun, J. (2018). Shufflenet:

An extremely efficient convolutional neural network

for mobile devices. In Proceedings of the IEEE

conference on computer vision and pattern recognition

(pp. 6848-6856).
