Segmentation of Pneumothorax Disease based on Deep Learning

Yao Zhang

North China University of Technology, Beijing, China

Keywords: Pneumothorax Segmentation, Residual Module, Attention Mechanism.

Abstract: Pneumothorax is a common acute pulmonary disease, and chest X-ray is currently an important method for diagnosing it. In pneumothorax images, lesions are unevenly distributed, vary greatly in shape and size, and lack obvious features, which makes early diagnosis difficult for doctors. Traditional image-processing algorithms also perform poorly at extracting pneumothorax lesions. To address these problems, a deep-learning-based method for extracting pneumothorax lesions is proposed. The feature extraction module combines the bottleneck module with an improved CoordAtt attention mechanism so that the network can capture image features more fully, which mitigates the inaccurate segmentation caused by the large variation of pneumothorax lesions and their lack of obvious features. On the SIIM-ACR Pneumothorax data set, the method reaches a Dice index, Precision, Recall and IoU of 85.67%, 92.42%, 87.25% and 81.37%, respectively, segmenting the pneumothorax region more accurately than the other image semantic segmentation methods compared.

1 INTRODUCTION

Pneumothorax is a common acute lung disease (Gilday 2021) that can be fatal. Rapid diagnosis and treatment help ensure patient safety and are therefore of practical significance. At present, pneumothorax is mainly diagnosed by a doctor reading a chest X-ray radiograph. Compared with CT and NMR, X-ray imaging is inexpensive, which is an obvious advantage. However, the ratio of doctors to patients in China is seriously imbalanced, and doctors need to read a large number of chest X-rays every day. Manual pneumothorax detection is easily affected by factors such as the doctor's experience and skill level, so cases are likely to be missed or misdiagnosed. The failure of radiologists to detect pneumothorax early is one of the main causes of death from the disease (Suthar 2016). Therefore, a Computer-Aided Diagnosis (CAD) system (Chen 2021) should be used for automatic detection of pneumothorax in clinical X-rays, to help doctors improve the efficiency and accuracy of diagnosis and reduce missed diagnoses.

In recent years, with the continuous development of computer technology, convolutional neural network models represented by LeNet5 (Lecun 1998), VGG16 (Simonyan 2014) and GoogLeNet (Szegedy 2015) have been applied in computer vision and medicine. They have been successful on image tasks, with recognition performance greatly improved over traditional methods. In 2012, Hinton and Krizhevsky proposed AlexNet (Krizhevsky 2012), which used ReLU as the network activation function, introduced Local Response Normalization (LRN), and used Dropout layers to deactivate some neurons and avoid over-fitting. In 2015, Kaiming He et al. released ResNet (He 2015), a network built on the residual module, which effectively alleviated the problem of gradients vanishing once a network reaches a certain depth. In the field of image segmentation, Ronneberger et al. (Ronneberger 2015) proposed U-Net for medical image segmentation tasks based on the FCN architecture. U-Net improves on FCN by substantially extending the expansion path and by combining multi-channel convolution with a structure similar to a feature pyramid network. It also achieves good results when trained and tested on small data sets, making a great contribution to medical image segmentation. For pneumothorax segmentation, Wang et al. (Wang 2020) proposed CheXLocNet, a convolutional neural network based on Mask R-CNN, which reaches a Dice coefficient of 82% on the test set of the SIIM-ACR Pneumothorax dataset.

Zhang, Y.
Segmentation of Pneumothorax Disease based on Deep Learning.
DOI: 10.5220/0011215500003444
In Proceedings of the 2nd Conference on Artificial Intelligence and Healthcare (CAIH 2021), pages 162-167
ISBN: 978-989-758-594-4

To address the uneven distribution of pneumothorax image data, the large variation in lesion shape and size, the lack of obvious features, and the inaccurate segmentation of small lesions, this paper proposes a pneumothorax segmentation method based on the residual module and an attention mechanism. The method adjusts the U-Net structure and uses the bottleneck residual module to extract features from the pneumothorax image effectively and perform semantic segmentation of the pneumothorax region. In addition, an improved CoordAtt module is embedded in the network to make full use of the detailed information in the image and further improve segmentation.

2 METHOD

This paper aims to segment pneumothorax lesions. The pneumothorax data set has an uneven number of positive and negative samples, and pneumothorax images have unclear boundaries and large variations in shape and size. U-Net, which is widely used for medical images and performs well on small data sets, is therefore chosen as the base network. The classic U-Net is a fully convolutional segmentation model: the first half of the network is a feature extraction module and the second half is an upsampling module, a design also called an encoder-decoder structure.

This section describes the structure of the improved U-Net in detail and explains how the network is improved. The improved structure is shown in Figure 1.


[Figure 1 legend: Conv 3×3, ReLU; Copy and crop; Up-conv 2×2; Max pool 2×2; Conv 1×1; CoordAtt-Module; N*Bottleneck (N=4, N=3, N=6, N=3)]

Figure 1: Improved U-Net structure.

The pneumothorax segmentation model proposed in this paper is based on the residual module and an improved attention mechanism. While retaining the symmetric structure and skip connections of U-Net, it embeds bottleneck residual modules to refine segmentation details and adds the CoordAtt module to suppress useless features. The network has a symmetric structure of five layers from top to bottom. The encoder of the first layer contains two 3×3 convolutions, and the encoders of the second to fifth layers contain 3, 4, 6, and 3 bottleneck residual modules, respectively. In every layer except the first, the encoder's convolutions are followed by an improved CA module, so that the network fully learns the edge and texture features of the pneumothorax and segments it better.

2.1 Bottleneck Residual Module

The basic U-Net model selected in this study loses detailed information from the pneumothorax image during encoding and decoding, which reduces segmentation accuracy. He et al. (He 2015) proposed a residual connection module in 2015, as shown in Figure 2.

Figure 2: ResNet structure drawing.

The ResNet module contains two paths: a shortcut path that passes the input directly to the end of the module, and a feature extraction path. The outputs of the two paths are added together, forming a residual shortcut connection. Adding ResNet's residual structure to the U-Net model can effectively alleviate the loss of detail caused by encoding and decoding, thereby improving accuracy.
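As a rough illustration of why a bottleneck design is attractive, the sketch below counts convolution weights for a bottleneck block (1×1 reduce, 3×3, 1×1 expand) and for two plain 3×3 convolutions at full width, and shows the shortcut addition on a toy vector. The channel sizes (256 in and out, 64 in the middle) and all function names are assumptions chosen for illustration; the paper does not state the channel widths used in each encoder layer.

```python
def conv_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Number of weights in a 2D convolution without bias."""
    return in_ch * out_ch * kernel * kernel


def bottleneck_params(in_ch: int = 256, mid_ch: int = 64) -> int:
    """1x1 reduce -> 3x3 -> 1x1 expand (assumed channel sizes)."""
    return (
        conv_params(in_ch, mid_ch, 1)
        + conv_params(mid_ch, mid_ch, 3)
        + conv_params(mid_ch, in_ch, 1)
    )


def plain_params(ch: int = 256) -> int:
    """Two stacked 3x3 convolutions at full width, for comparison."""
    return 2 * conv_params(ch, ch, 3)


def residual_add(x, fx):
    """Shortcut connection: element-wise sum of input and branch output."""
    return [a + b for a, b in zip(x, fx)]


if __name__ == "__main__":
    print(bottleneck_params())  # 69632
    print(plain_params())       # 1179648
    print(residual_add([1.0, 2.0], [0.5, -0.5]))  # [1.5, 1.5]
```

Under these assumed sizes the bottleneck uses far fewer weights than the plain pair of 3×3 convolutions, which is what makes stacking several such blocks per encoder layer affordable.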

2.2 Improved CoordAtt Attention Mechanism

The attention mechanism is derived from research on human vision and is widely used in various fields of deep learning (Krizhevsky 2012), such as image processing, speech recognition, and natural language processing. It helps a convolutional neural network extract and recognize objects in complex images by assigning different weights to different channels of the feature map and suppressing invalid feature information. In 2021, Hou et al. (Hou 2021) proposed CA (CoordAtt), which embeds location information into channel attention to help neural networks extract features more efficiently at a lower cost.

CA uses two 1D global pooling operations to aggregate the input features along the vertical and horizontal directions into two separate direction-aware feature maps. These two feature maps, each carrying information for a specific direction, are then encoded into two attention maps, and both attention maps are applied to the input feature map by multiplication. However, CA uses only global average pooling and so pays little attention to detailed texture information. In addition, the original feature information is not fully used during feature weighting. This paper therefore improves the CA attention mechanism: a residual connection is added to the original CA, and the input features are pooled with both global average pooling and global max pooling, so that the module makes fuller use of the original features and detailed information. The structure is shown in Figure 3.
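The data flow described above can be sketched on a single-channel feature map in plain Python. This is a simplified illustration, not the author's implementation: the learned stages (Concat + Conv2d, BatchNorm, non-linearity, and the final Conv2d layers) are replaced by a plain sum of the average-pooled and max-pooled descriptors, and the exact placement of the residual branch is assumed. The function names are hypothetical.

```python
import math


def sigmoid(v: float) -> float:
    return 1.0 / (1.0 + math.exp(-v))


def directional_pool(fmap):
    """Average and max pooling along each spatial direction.

    Returns one value per row (pooled over the width) and one value
    per column (pooled over the height), for both pooling types.
    """
    h, w = len(fmap), len(fmap[0])
    cols = [[fmap[i][j] for i in range(h)] for j in range(w)]
    row_avg = [sum(r) / w for r in fmap]
    row_max = [max(r) for r in fmap]
    col_avg = [sum(c) / h for c in cols]
    col_max = [max(c) for c in cols]
    return row_avg, row_max, col_avg, col_max


def improved_ca(fmap):
    row_avg, row_max, col_avg, col_max = directional_pool(fmap)
    # Assumption: the learned Concat + Conv2d + BatchNorm + Conv2d stages
    # are replaced by a simple sum of the two descriptors.
    a_h = [sigmoid(a + m) for a, m in zip(row_avg, row_max)]
    a_w = [sigmoid(a + m) for a, m in zip(col_avg, col_max)]
    # Re-weight the input with both attention vectors, then add the
    # identity (residual) branch so the original features are kept.
    return [
        [x * a_h[i] * a_w[j] + x for j, x in enumerate(row)]
        for i, row in enumerate(fmap)
    ]


if __name__ == "__main__":
    features = [[1.0, 2.0], [3.0, 4.0]]
    print(directional_pool(features))
    print(improved_ca(features))
```

The max-pooled descriptors respond to the strongest activation in each row or column, which is one way to read the claim that max pooling adds detail that average pooling smooths away.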

[Figure 3 components: Input; Residual; X Avg Pool; X Max Pool; Y Avg Pool; Y Max Pool; Concat+Conv2d; BatchNorm+Non-linear; Conv2d (two branches); Sigmoid (two branches); Re-weight; Output]

Figure 3: Improved CoordAtt Attention Mechanism.

[Bottleneck block layer labels: 1x1, 64; 3x3, 64; ReLU; 1x1, 256; ReLU]


3 EXPERIMENTS

3.1 Data Set and Data Enhancement

The SIIM-ACR Pneumothorax dataset used in this experiment is provided by the Society for Imaging Informatics in Medicine (SIIM) and the American College of Radiology (ACR), and is publicly available on the Kaggle platform. The data set contains 12089 Digital Imaging and Communications in Medicine (DICOM) files, with annotations in run-length encoding (RLE) format.
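To make the RLE annotation format concrete, the following sketch decodes a string of "start length" pairs into a flat binary mask. The exact convention is an assumption here: some RLE variants store absolute start positions and others store offsets relative to the end of the previous run, so the `relative` flag covers both, and the pixel ordering (row-major or column-major) must be checked against the data set's own documentation. The function name and parameters are hypothetical.

```python
def rle_decode(rle: str, height: int, width: int, relative: bool = False):
    """Decode 'start length start length ...' into a flat 0/1 mask.

    relative=False: each start is an absolute 0-indexed pixel position.
    relative=True:  each start is an offset from the end of the previous run.
    """
    mask = [0] * (height * width)
    values = [int(v) for v in rle.split()]
    position = 0
    for start, length in zip(values[0::2], values[1::2]):
        begin = position + start if relative else start
        for k in range(begin, begin + length):
            mask[k] = 1
        position = begin + length
    return mask


if __name__ == "__main__":
    # Both calls describe the same 2x4 mask under the two conventions.
    print(rle_decode("1 2 5 1", 2, 4))
    print(rle_decode("1 2 2 1", 2, 4, relative=True))
```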

Figure 4: Pneumothorax data set.

3.1.1 Data Set Preprocessing

The DICOM files are first converted into 512×512 PNG images, and the RLE-encoded labels are converted into 512×512 label images. In the original data set, images with pneumothorax account for only about 28%, so positive and negative samples are seriously unbalanced, which strongly affects the convergence speed and final performance of the model. A sliding sampling strategy is therefore adopted during training. In the early stage of training, data are randomly sampled at a 2:1 ratio of positive to negative samples so that the model converges faster. In the middle and late stages, a 1:1 sampling ratio is used to make the model more robust.
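A minimal sketch of such a sliding sampling schedule is given below. The epoch at which the ratio switches (`switch_epoch`), the number of samples per epoch, and sampling with replacement are all assumptions for illustration; the paper specifies only the 2:1 and 1:1 ratios.

```python
import random


def positive_fraction(epoch: int, switch_epoch: int = 10) -> float:
    """2:1 positive-to-negative early on, 1:1 afterwards (assumed switch point)."""
    return 2 / 3 if epoch < switch_epoch else 1 / 2


def sample_epoch(positives, negatives, epoch, n_samples, rng=random):
    """Draw one epoch of training items at the ratio for this epoch."""
    n_pos = round(n_samples * positive_fraction(epoch))
    n_neg = n_samples - n_pos
    # Sampling with replacement, since positives are the minority class.
    batch = rng.choices(positives, k=n_pos) + rng.choices(negatives, k=n_neg)
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    pos = [("pos", i) for i in range(28)]
    neg = [("neg", i) for i in range(72)]
    early = sample_epoch(pos, neg, epoch=0, n_samples=300, rng=rng)
    late = sample_epoch(pos, neg, epoch=20, n_samples=300, rng=rng)
    print(sum(1 for label, _ in early if label == "pos"))  # 200
    print(sum(1 for label, _ in late if label == "pos"))   # 150
```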

3.1.2 Image Enhancement

Because the data set contains few pneumothorax images, the network has insufficient data to learn from and overfits easily during training. Data augmentation is therefore performed before model training, using flipping, random contrast, random gamma, random brightness, random elastic transformation, random grid distortion, and visual distortion to expand the positive sample data. The effect of each augmentation is shown in Figure 5.
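One practical point when augmenting segmentation data is that geometric transforms must be applied to the image and its mask together, while intensity transforms apply to the image only. The sketch below shows this for two of the listed operations (flipping and random gamma) on nested lists with pixel values in [0, 1]. The flip probability and gamma range are assumed values, and the function names are hypothetical; the paper does not state which library or parameters were used.

```python
import random


def hflip(grid):
    """Mirror a 2D grid left to right."""
    return [row[::-1] for row in grid]


def adjust_gamma(image, gamma: float):
    """Gamma correction for pixel values in [0, 1]."""
    return [[p ** gamma for p in row] for row in image]


def augment(image, mask, rng):
    # Geometric transform: applied to image and mask with the same decision.
    if rng.random() < 0.5:
        image, mask = hflip(image), hflip(mask)
    # Intensity transform: applied to the image only, the mask is untouched.
    gamma = rng.uniform(0.8, 1.2)  # assumed range
    return adjust_gamma(image, gamma), mask


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[0.25, 1.0], [0.5, 0.0]]
    mask = [[0, 1], [0, 0]]
    print(augment(image, mask, rng))
```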

[Figure 5 panels: Original; Flipping; Random Contrast; Random Gamma; Random Brightness; Random Elastic Transformation; Random Grid Distortion; Visual Distortion]

Figure 5: Data enhanced rendering.

3.2 Experimental Details

3.2.1 Experimental Environment

The hardware environment is an NVIDIA 1080 Ti graphics card, 11 GB of running memory, and an Intel(R) Core(TM) i7-7700K processor. The software environment is Windows 10, Python 3.6, and PyTorch 1.1.

3.2.2 Experimental Parameters

Training uses the Adam optimizer, which is computationally efficient and has a low memory footprint, so the model can be optimized with modest computing resources. The learning rate follows the CosineAnnealingLR schedule provided by PyTorch. The change in learning rate is shown in Figure 6.
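For reference, the closed form of a cosine annealing schedule can be written in a few lines of plain Python. The base learning rate, minimum learning rate, and period used in the example are hypothetical values, since the paper does not report them; the function name is also illustrative.

```python
import math


def cosine_annealing_lr(step: int, base_lr: float, t_max: int,
                        eta_min: float = 0.0) -> float:
    """Learning rate at a given step of a cosine annealing schedule.

    lr = eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * step / t_max))
    """
    return eta_min + 0.5 * (base_lr - eta_min) * (
        1 + math.cos(math.pi * step / t_max)
    )


if __name__ == "__main__":
    # Hypothetical settings: base_lr=1e-3, eta_min=1e-5, t_max=50 epochs.
    for epoch in (0, 25, 50):
        print(epoch, cosine_annealing_lr(epoch, 1e-3, 50, 1e-5))
```

The rate starts at the base value, falls slowly at first, fastest around the midpoint, and flattens out as it approaches the minimum at step `t_max`.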

Figure 6: CosineAnnealingLR curve.

3.3 Experimental Results

3.3.1 Comparison of Segmentation Performance of Pneumothorax

After training is completed, the segmentation performance of the algorithm is evaluated on the SIIM-ACR Pneumothorax test set. The results of the comparison experiments are shown in Table 1.


Table 1: Experimental results of different models (%).

Model        Dice    Precision  Recall  IoU
U-Net        82.36   89.56      84.51   78.85
CheXLocNet   82.82   90.36      84.78   79.17
Albunet      83.25   91.41      86.58   79.92
Ours         85.67   92.42      87.25   81.37

The Dice, Precision, Recall, and IoU of the proposed algorithm on the test set are 85.67%, 92.42%, 87.25% and 81.37%, respectively. The ICA-ResUnet proposed in this paper outperforms the previously proposed medical image segmentation networks on every metric. Compared with the original U-Net, the four metrics increase by 3.31, 2.86, 2.74 and 2.52 percentage points, respectively.
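The four metrics can be computed from pixel-wise counts of true positives, false positives and false negatives, as in the sketch below. This uses the standard definitions; whether the paper averages per image or over the whole test set, and how empty masks are handled, is not stated, so the small `eps` term (which makes two empty masks score 1.0) is an assumption.

```python
def confusion_counts(pred, target):
    """Count true positives, false positives and false negatives."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def segmentation_metrics(pred, target, eps: float = 1e-7):
    """Dice, precision, recall and IoU for one pair of binary masks."""
    tp, fp, fn = confusion_counts(pred, target)
    return {
        "dice": (2 * tp + eps) / (2 * tp + fp + fn + eps),
        "precision": (tp + eps) / (tp + fp + eps),
        "recall": (tp + eps) / (tp + fn + eps),
        "iou": (tp + eps) / (tp + fp + fn + eps),
    }


if __name__ == "__main__":
    pred = [[1, 1, 0], [0, 1, 0]]
    target = [[1, 0, 0], [0, 1, 1]]
    # tp=2, fp=1, fn=1 -> dice 2/3, precision 2/3, recall 2/3, iou 1/2
    print(segmentation_metrics(pred, target))
```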

[Figure 7 panel labels: Original; Mask; Albunet; Ours; Unet; CheXLocNet]

Figure 7: Comparison of segmentation results.

Figure 7 visualizes the segmentation results of the four networks. The first column is the chest X-ray input to the model, the second column is the ground-truth pneumothorax contour annotated by medical experts, and the last column is the segmentation result of the method in this paper. In the first and third rows, the unimproved U-Net is more affected by the inconspicuous features of the pneumothorax because it makes insufficient use of features, and it produces missed and false detections. Albunet (Shvets 2018) and CheXLocNet segment large-area pneumothorax better, with shapes and edges relatively close to the ground truth, but the first row shows that they also produce small-scale false detections.

In contrast, the network model proposed in this paper segments pneumothorax lesions effectively and predicts the pneumothorax contour more accurately. Because pneumothorax images are few and the lesion area is small relative to the background, it is difficult for a deep learning model to extract and learn features. By adding the residual module and attention mechanism to U-Net, the proposed network strengthens its ability to extract pneumothorax features and is better suited to medical image segmentation tasks.

In summary, the pneumothorax segmentation method proposed in this paper segments better than the other segmentation algorithms compared (Table 1). It extracts features fully and makes use of detailed information, and it remains more robust when segmenting pneumothorax with small area, low image quality, or blurred boundaries.

4 CONCLUSIONS

X-ray image segmentation of pneumothorax is a key step toward accurate display, diagnosis, early treatment and surgical planning for pneumothorax. This paper proposes a new X-ray pneumothorax segmentation method whose neural network model combines the bottleneck module with the improved CoordAtt attention module. Compared with the other medical image segmentation networks, its feature extraction ability is improved, so the network can detect pneumothorax lesion areas effectively, and it performs well in experiments on the SIIM-ACR Pneumothorax data set. The method also generalizes to some extent: it is suitable for pneumothorax segmentation and offers reference value for other medical image segmentation research.

Future work will continue to optimize the network structure, with a focus on trying other data augmentation methods and loss functions.

REFERENCES

Chen Z, Gao H, Pan Y, Xing F. Computer-aided diagnosis of breast X-ray image technology review[J/OL]. Computer Engineering and Application: 1-24 [2021-11-08]. http://kns.cnki.net/kcms/detail/11.2127.TP.20211025.1000.004.html.

Gilday C, Odunayo A, Hespel A M. Spontaneous Pneumothorax: Pathophysiology, Clinical Presentation and Diagnosis[J]. Topics in Companion Animal Medicine, 2021: 100563.

Hou Q, Zhou D, Feng J. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 13713-13722.

Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[J]. Advances in neural information processing systems, 2012, 25: 1097-1105.

Li Z, Zuo J, Zhang C, et al. Pneumothorax Image Segmentation and Prediction with UNet++ and MSOF Strategy[C]//2021 IEEE International Conference on Consumer Electronics and Computer Engineering (ICCECE). IEEE, 2021: 710-713.

Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation[C]//International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015: 234-241.

Shvets A A, Rakhlin A, Kalinin A A, et al. Automatic instrument segmentation in robot-assisted surgery using deep learning[C]//2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 2018: 624-628.

Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition[J]. arXiv preprint arXiv:1409.1556, 2014.

Suthar M, Mahjoubfar A, Seals K, et al. Diagnostic tool for pneumothorax[C]//2016 IEEE Photonics Society Summer Topical Meeting Series (SUM). IEEE, 2016: 218-219.

Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE conference on computer vision and pattern recognition. 2015: 1-9.

Wang H, Gu H, Qin P, et al. CheXLocNet: Automatic localization of pneumothorax in chest radiographs using deep convolutional neural networks[J]. PLoS One, 2020, 15(11): e0242013.

Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324. doi: 10.1109/5.726791.
