Segmentation of Pneumothorax Disease based on Deep Learning
Yao Zhang
North China University of Technology, Beijing, China
Keywords: Pneumothorax Segmentation, Residual Module, Attention Mechanism.
Abstract: Pneumothorax is a common acute pulmonary disease, and chest X-ray is currently an important
method for diagnosing it. Pneumothorax images are characterized by uneven distribution, large variation
in lesion shape and size, and a lack of obvious features, which makes early diagnosis difficult for
doctors. Traditional image-processing algorithms also extract pneumothorax lesions poorly. To address
these problems, this paper proposes a deep-learning-based method for extracting pneumothorax lesions.
The feature extraction module combines the bottleneck module with an improved CoordAtt attention
mechanism so that the neural network can capture image features more fully, which mitigates the
inaccurate segmentation caused by the large variation of pneumothorax and its lack of obvious features.
On the SIIM-ACR Pneumothorax data set, the method reaches a Dice index, Precision, Recall and IoU of
85.67%, 92.42%, 87.25% and 81.37%, respectively, and segments the pneumothorax region more accurately
than the other image semantic segmentation methods compared.
1 INTRODUCTION
Pneumothorax is a common acute lung disease (Gilday 2021) that can be fatal. Rapid diagnosis and
treatment help to protect patients' lives and are therefore of practical significance. At present,
pneumothorax is mainly diagnosed by a doctor reading a chest X-ray radiograph. Compared with CT and
NMR, X-ray imaging is inexpensive, which is an obvious advantage. However, the ratio of doctors to
patients in China is seriously imbalanced, and doctors need to read a large number of chest X-rays
every day. Manual pneumothorax detection is easily affected by factors such as the doctor's
experience and skill, so cases are likely to be missed or misdiagnosed. The failure of radiologists
to detect pneumothorax early is one of the main causes of death from the disease (Suthar 2016).
Therefore, a Computer-Aided Diagnosis (CAD) system (Chen 2021) should be used for the automatic
detection of pneumothorax in clinical X-rays to help doctors improve the efficiency and accuracy of
diagnosis and reduce missed diagnoses.
In recent years, with the continuous development of computer technology, convolutional neural
network models represented by LeNet5 (Lecun 1998), VGG16 (Simonyan 2014) and GoogLeNet (Szegedy
2015) have been applied in computer vision and medicine. They have been successful on image tasks,
with recognition performance greatly improved over traditional methods. In 2012, Hinton and
Krizhevsky proposed AlexNet (Krizhevsky 2012), which used ReLU as the network activation function,
introduced Local Response Normalization (LRN), and used Dropout layers to deactivate some neurons
and avoid over-fitting. Kaiming He released the ResNet (He 2015) neural network based on the
residual module in 2015, which effectively alleviated the vanishing-gradient problem that appears
once a neural network reaches a certain depth. In the field of image segmentation, Ronneberger et
al. (Ronneberger 2015) proposed the U-Net network for medical image segmentation, based on the FCN
architecture. It improves on FCN by substantially extending the expansion path and by combining
multi-channel convolution with a structure similar to a feature pyramid network. U-Net achieves
good results in training and testing even with a small amount of data, making a great contribution
to medical image segmentation. For pneumothorax segmentation, Wang et al. (Wang 2020) proposed
CheXLocNet, a convolutional neural network based on Mask R-CNN, which reaches a Dice coefficient of
82% on the test set of the SIIM-ACR Pneumothorax dataset.
Zhang, Y.
Segmentation of Pneumothorax Disease based on Deep Learning.
DOI: 10.5220/0011215500003444
In Proceedings of the 2nd Conference on Artificial Intelligence and Healthcare (CAIH 2021), pages 162-167
ISBN: 978-989-758-594-4
Copyright © 2022 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
To address the uneven distribution of pneumothorax image data, the large variation in lesion shape
and size, the unobvious features, and the inaccurate segmentation of small lesions, this paper
proposes a pneumothorax segmentation method based on the residual module and an attention
mechanism. The method adjusts the U-Net network structure and uses the bottleneck residual module
to extract the features of the pneumothorax image effectively and perform semantic segmentation of
the pneumothorax region. In addition, the improved CoordAtt module is embedded in the network to
make full use of the detailed information in the pneumothorax image and further improve
segmentation.
2 METHOD
This paper addresses the segmentation of pneumothorax lesions. The pneumothorax data set has an
uneven number of positive and negative samples, and pneumothorax images have unclear boundaries
and large variation in shape and size. We therefore choose U-Net, which is widely used for medical
images and performs well on small data sets, as the base network. The classic U-Net is a fully
convolutional segmentation model. The first half of the network is a feature extraction module,
and the second half is an upsampling module. This structure is also called an encoder-decoder
structure.
This section introduces the structure of the improved U-Net in detail and explains how the network
is improved. The improved structure is shown in Figure 1.
Figure 1: Improved U-Net structure.
[Legend: Conv 3×3 + ReLU; copy and crop; up-conv 2×2; max pool 2×2; Conv1; CoordAtt module; N×Bottleneck with N = 3, 4, 6, 3.]
The pneumothorax segmentation model proposed in this paper is based on the residual module and an
improved attention mechanism. While retaining the symmetric structure and skip connections of
U-Net, it embeds the bottleneck residual module to refine segmentation details and adds the
CoordAtt module to suppress uninformative features. The network has five symmetric layers from top
to bottom. The encoder of the first layer contains two 3x3 convolutions, and the encoders of the
second to fifth layers contain 3, 4, 6, and 3 bottleneck residual modules, respectively. In every
layer except the first, the encoder's convolutions are followed by an improved CA module so that
the network fully learns the edge and texture features of the pneumothorax, which improves
segmentation performance.
2.1 Bottleneck Residual Module
The basic U-Net model selected in this study loses detailed information in the pneumothorax image
during encoding and decoding, which reduces the accuracy of pneumothorax segmentation. He et al.
(He 2015) proposed the residual connection module in 2015, as shown in Figure 2.
Figure 2: Structure of the ResNet module.
The ResNet module contains two paths. One path passes the input directly to the end of the module,
where it is added to the output of the other path, which performs feature extraction. Together
they form a residual shortcut connection. Adding ResNet's residual structure to the U-Net model
can effectively alleviate the loss of detail caused by encoding and decoding, thereby improving
accuracy.
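The two-path structure described above can be sketched in PyTorch as follows. This is a minimal illustration rather than the paper's implementation: the class name `Bottleneck`, the batch normalization layers, and the 1x1 projection on the shortcut are assumptions, and the channel sizes in the usage note are taken only from the labels of Figure 2 (1x1, 64; 3x3, 64; 1x1, 256).

```python
import torch
from torch import nn


class Bottleneck(nn.Module):
    """ResNet-style bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand, plus a shortcut."""

    def __init__(self, in_channels, mid_channels, out_channels):
        super().__init__()
        # Feature-extraction path (BatchNorm is an assumption, not stated in the paper)
        self.body = nn.Sequential(
            nn.Conv2d(in_channels, mid_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, mid_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )
        # Shortcut path: identity when shapes match, otherwise a 1x1 projection
        if in_channels == out_channels:
            self.shortcut = nn.Identity()
        else:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
                nn.BatchNorm2d(out_channels),
            )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # Add the input back onto the extracted features, then apply ReLU
        return self.relu(self.body(x) + self.shortcut(x))
```

For example, `Bottleneck(256, 64, 256)` keeps the spatial size and channel count of its input, so several such blocks can be stacked within one encoder layer.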
2.2 Improved CoordAtt Attention Mechanism
The attention mechanism is derived from research on human vision and is widely used in various
fields of deep learning (Krizhevsky 2012), such as image processing, speech recognition, and
natural language processing. It helps a convolutional neural network extract and recognize
objects in complex images by assigning different weights to different channels of the feature map
and suppressing invalid feature information. In 2021, Hou et al. (Hou 2021) proposed CA
(CoordAtt), which embeds location information into channel attention to help neural networks
extract features more efficiently at low cost.
CA uses two 1D global pooling operations to aggregate the input features along the vertical and
horizontal directions, respectively, into two separate direction-aware feature maps. These two
feature maps, each embedding information for a specific direction, are then encoded into two
attention maps, and both attention maps are finally applied to the input feature map by
multiplication. However, CA uses only global average pooling, so it pays little attention to
detailed texture information, and the original feature information is not fully used during
feature weighting. This paper therefore improves the CA attention mechanism. On top of the
original CA, a residual connection is added and the input features are pooled with both global
average pooling and max pooling, so that the module makes fuller use of the original features and
of detailed information. The structure is shown in Figure 3.
Figure 3: Improved CoordAtt Attention Mechanism.
[Diagram labels: Input; X Max Pool; X Avg Pool; Y Max Pool; Y Avg Pool; +; Concat + Conv2d; BatchNorm + Non-linear; Conv2d; Sigmoid; Re-weight; Residual; +; Output.]
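One possible reading of the improved CoordAtt module is sketched below in PyTorch. It is an interpretation of the description and of the Figure 3 labels, not the author's code: summing the average-pooled and max-pooled descriptors, using ReLU as the non-linearity, the `reduction` parameter, and adding the input back after re-weighting are all assumptions, as is the class name `ImprovedCoordAtt`.

```python
import torch
from torch import nn


class ImprovedCoordAtt(nn.Module):
    """Coordinate attention with avg + max directional pooling and a residual add."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        mid = max(8, channels // reduction)  # hypothetical bottleneck width
        self.reduce = nn.Sequential(
            nn.Conv2d(channels, mid, kernel_size=1, bias=False),
            nn.BatchNorm2d(mid),
            nn.ReLU(inplace=True),
        )
        self.attn_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.attn_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        _, _, h, w = x.shape
        # Pool over the width: one descriptor per row, shape (N, C, H, 1)
        pooled_h = x.mean(dim=3, keepdim=True) + x.max(dim=3, keepdim=True)[0]
        # Pool over the height, then transpose to shape (N, C, W, 1)
        pooled_w = x.mean(dim=2, keepdim=True) + x.max(dim=2, keepdim=True)[0]
        pooled_w = pooled_w.transpose(2, 3)
        # Encode both directions jointly, then split them again
        y = self.reduce(torch.cat([pooled_h, pooled_w], dim=2))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.attn_h(y_h))                  # (N, C, H, 1)
        a_w = torch.sigmoid(self.attn_w(y_w.transpose(2, 3)))  # (N, C, 1, W)
        # Re-weight the input, then add the input back (residual branch)
        return x * a_h * a_w + x
```

Because both attention maps lie between 0 and 1, the residual add means each output value is the input scaled by a factor between 1 and 2, so the original features are never suppressed to zero.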
[Figure 2 diagram labels: 1x1, 64 -> ReLU -> 3x3, 64 -> ReLU -> 1x1, 256 -> + -> ReLU.]
CAIH 2021 - Conference on Artificial Intelligence and Healthcare
3 EXPERIMENTS
3.1 Data Set and Data Enhancement
The SIIM-ACR Pneumothorax dataset used in this experiment is provided by the Society for Imaging
Informatics in Medicine (SIIM) and the American College of Radiology (ACR) and is publicly
available on the Kaggle platform. The data set contains 12089 Digital Imaging and Communications
in Medicine (DICOM) files, and the annotations are in run-length encoding (RLE) format.
Figure 4: Pneumothorax data set.
3.1.1 Data Set Preprocessing
The experiment first converts each DICOM file into a 512x512 PNG image and converts each
RLE-encoded label into a 512x512 label image.
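As an illustration of the label conversion, the sketch below decodes an RLE string into a binary mask in plain Python. The encoding conventions are assumptions that should be checked against the data set's own documentation: the string is taken to hold pairs of (gap since the end of the previous run, run length), pixels are taken to be flattened in column-major order, and "-1" is taken to mean an image with no pneumothorax. The function name `rle_to_mask` is hypothetical.

```python
def rle_to_mask(rle, height, width):
    """Decode a relative-offset RLE string into a height x width mask of 0/1."""
    flat = [0] * (height * width)
    tokens = rle.split()
    if tokens and tokens[0] != "-1":  # "-1" is assumed to mean "no lesion"
        values = [int(token) for token in tokens]
        position = 0
        for gap, length in zip(values[0::2], values[1::2]):
            position += gap  # skip background pixels since the previous run
            for index in range(position, position + length):
                flat[index] = 1
            position += length
    # Assumed column-major layout: flat index = column * height + row
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]
```

Resizing the decoded mask to 512x512 is a separate step and should use nearest-neighbour interpolation so that the label stays binary.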
In the original data set, images with pneumothorax account for only about 28%, so the numbers of
positive and negative samples are seriously unbalanced, which strongly affects the convergence
speed and final quality of the model. Therefore, a sliding sampling strategy is adopted during
training. Specifically, in the early stage of training the data are randomly sampled with a high
positive-to-negative ratio of 2:1 so that the model converges faster. In the middle and late
stages, a 1:1 sampling ratio is adopted to make the model more robust.
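The sliding sampling strategy can be sketched as a per-epoch sampler. The paper does not say when the ratio switches from 2:1 to 1:1, so `switch_epoch` below is a hypothetical parameter, and the function name `sample_epoch` is illustrative only.

```python
import random


def sample_epoch(positives, negatives, epoch, switch_epoch, rng):
    """Draw one epoch of samples at a 2:1 (early) or 1:1 (later) pos:neg ratio."""
    ratio = 2 if epoch < switch_epoch else 1
    # Use as many negatives as the ratio and the available positives allow
    n_neg = min(len(negatives), len(positives) // ratio)
    n_pos = n_neg * ratio
    batch = rng.sample(positives, n_pos) + rng.sample(negatives, n_neg)
    rng.shuffle(batch)
    return batch
```

Passing an explicit `random.Random` instance as `rng` keeps the sampling reproducible across runs.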
3.1.2 Image Enhancement
Because the data set contains few pneumothorax images, there is insufficient data for the network
to learn from, and the model overfits easily during training. This paper therefore performs data
enhancement before training, using flipping, random contrast, random gamma, random brightness,
random elastic transformation, random grid distortion, and visual distortion to expand the
positive sample data. The effect of image enhancement is shown in Figure 5.
Figure 5: Data enhancement results.
[Panels: Original; Flipping; Random Contrast; Random Gamma; Random Brightness; Random Elastic Transformation; Random Grid Distortion; Visual Distortion.]
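A point worth making explicit for segmentation is that geometric transforms must be applied to the image and its label together, while intensity transforms apply to the image only. The plain-Python sketch below shows this for two of the listed operations, flipping and random gamma. The flip probability and gamma range are hypothetical values, and the helper names are illustrative, not the paper's code.

```python
import random


def hflip(grid):
    """Mirror a 2D list left to right."""
    return [row[::-1] for row in grid]


def adjust_gamma(image, gamma):
    """Apply gamma correction to 8-bit pixel values."""
    return [[round(255 * (pixel / 255) ** gamma) for pixel in row] for row in image]


def augment(image, mask, rng):
    """Randomly flip image and mask together, then jitter the image gamma."""
    if rng.random() < 0.5:                # hypothetical flip probability
        image, mask = hflip(image), hflip(mask)
    gamma = rng.uniform(0.8, 1.2)         # hypothetical gamma range
    return adjust_gamma(image, gamma), mask
```

The mask is never passed through `adjust_gamma`, so its values stay binary.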
3.2 Experimental Details
3.2.1 Experimental Environment
The hardware environment is an NVIDIA GTX 1080 Ti graphics card with 11 GB of video memory and an
Intel(R) Core(TM) i7-7700K processor. The software environment is Windows 10, Python 3.6 and
PyTorch 1.1.
3.2.2 Experimental Parameters
Training uses the Adam optimizer, which is fast to compute and has a low memory footprint, so it
can optimize the model with few computing resources. The learning rate follows the
CosineAnnealingLR schedule provided by PyTorch. The change in learning rate is shown in Figure 6.
Figure 6: CosineAnnealingLR curve.
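For reference, within one half-period the cosine annealing schedule follows the closed form lr(t) = eta_min + 0.5 * (base_lr - eta_min) * (1 + cos(pi * t / T_max)). The sketch below evaluates it in plain Python. The values of `base_lr`, `eta_min` and `t_max` are hypothetical, since the paper does not report them.

```python
import math


def cosine_annealing_lr(step, base_lr, eta_min, t_max):
    """Learning rate after `step` scheduler steps, for 0 <= step <= t_max."""
    cosine = math.cos(math.pi * step / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + cosine)


# Hypothetical settings, for illustration only
schedule = [cosine_annealing_lr(s, base_lr=1e-3, eta_min=1e-6, t_max=50)
            for s in range(51)]
```

The rate starts at `base_lr`, falls slowly at first, fastest around the midpoint, and flattens out as it approaches `eta_min` at step `t_max`.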
3.3 Experimental Results
3.3.1 Comparison of Segmentation
Performance of Pneumothorax
After training is completed, the segmentation performance of the algorithm is evaluated on the
SIIM-ACR Pneumothorax test set. The results of the comparison experiment are shown in Table 1.
Table 1: Experimental results of different models (%).
Model        Dice    Precision  Recall  IoU
U-Net        82.36   89.56      84.51   78.85
CheXLocNet   82.82   90.36      84.78   79.17
Albunet      83.25   91.41      86.58   79.92
Ours         85.67   92.42      87.25   81.37
The Dice, Precision, Recall, and IoU of the algorithm proposed in this paper on the test set are
85.67%, 92.42%, 87.25% and 81.37%, respectively. The ICA-ResUnet proposed in this paper therefore
outperforms the previously proposed medical image segmentation networks in the comparison.
Compared with the original U-Net, the four metrics increase by 3.31, 2.86, 2.74 and 2.52
percentage points, respectively.
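The paper does not spell out how the four metrics are computed, so the sketch below uses their standard pixel-wise definitions on a single pair of binary masks. How scores are aggregated over the test images, and the value returned when a denominator is zero, are assumptions. The function name `segmentation_metrics` is illustrative.

```python
def segmentation_metrics(pred, target):
    """Dice, precision, recall and IoU for two flat sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)

    def ratio(numerator, denominator):
        # Assumed convention: an empty denominator counts as a perfect score
        return numerator / denominator if denominator else 1.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "iou": ratio(tp, tp + fp + fn),
    }
```

For instance, a prediction with two true positives, one false positive and one false negative gives a Dice of 4/6 and an IoU of 2/4.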
Figure 7: Comparison of segmentation results.
[Panels include: Original; Mask; Unet; CheXLocNet; Albunet; Ours.]
Figure 7 visualizes the segmentation results of the four networks. The first column is the chest
X-ray image fed to the model, the second column is the ground-truth pneumothorax contour marked by
medical experts, and the last column is the segmentation result of the method in this paper. In
the first and third rows, the unimproved U-Net is more easily affected by the inconspicuous
features of the pneumothorax because it uses features insufficiently, and it produces missed and
false detections. Albunet (Shvets 2018) and CheXLocNet segment large-area pneumothorax better, and
the segmented shape and edges are relatively close to the ground truth, but the first row shows
that they also produce small-scale false detections.
In contrast, the network model proposed in this paper segments the pneumothorax lesions
effectively and predicts the pneumothorax contour more accurately. Because pneumothorax images are
few and the lesion area is small relative to the background, it is difficult for a deep learning
model to extract and learn features. By using the residual module and attention mechanism within
U-Net, the network strengthens its ability to extract pneumothorax features and is better suited
to medical image segmentation tasks.
In summary, the pneumothorax segmentation method proposed in this paper segments better than the
other algorithms compared. It extracts features fully and uses detailed information, and it is
more robust when segmenting pneumothorax with small area, low image quality, or blurred
boundaries.
4 CONCLUSIONS
X-ray image segmentation of pneumothorax is a key step toward accurate display, diagnosis, early
treatment and surgical planning for pneumothorax. This paper proposes a new X-ray pneumothorax
segmentation method whose neural network model combines the bottleneck module with the improved
CoordAtt attention module. Compared with the other medical image segmentation networks tested, its
feature extraction ability is improved, so the network can detect pneumothorax lesion areas
effectively, and it performs well in experiments on the SIIM-ACR Pneumothorax data set. The method
also generalizes to some extent: it is suitable for pneumothorax segmentation and has reference
value for other medical image segmentation research.
Future work will continue to optimize the network structure, with a focus on trying other data
enhancement methods and loss functions.
REFERENCES
Chen Z, Gao H, Pan Y, Xing F. Computer-aided diagnosis of breast X-ray image
technology review[J/OL]. Computer Engineering and Application: 1-24 [2021-11-08].
http://kns.cnki.net/kcms/detail/11.2127.TP.20211025.1000.004.html.
Gilday C, Odunayo A, Hespel A M. Spontaneous Pneumothorax:
Pathophysiology, Clinical Presentation and
Diagnosis[J]. Topics in Companion Animal Medicine,
2021: 100563.
Hou Q, Zhou D, Feng J. Coordinate attention for efficient
mobile network design[C]//Proceedings of the
IEEE/CVF Conference on Computer Vision and
Pattern Recognition. 2021: 13713-13722.
Krizhevsky A, Sutskever I, Hinton G E. Imagenet
classification with deep convolutional neural
networks[J]. Advances in neural information
processing systems, 2012, 25: 1097-1105.
Li Z, Zuo J, Zhang C, et al. Pneumothorax Image
Segmentation and Prediction with UNet++ and MSOF
Strategy[C]//2021 IEEE International Conference on
Consumer Electronics and Computer Engineering
(ICCECE). IEEE, 2021: 710-713.
Ronneberger O, Fischer P, Brox T. U-Net: Convolutional
networks for biomedical image
segmentation[C]//International Conference on Medical
image computing and computer-assisted intervention.
Springer, Cham, 2015: 234-241.
Shvets A A, Rakhlin A, Kalinin A A, et al. Automatic
instrument segmentation in robot-assisted surgery
using deep learning[C]//2018 17th IEEE International
Conference on Machine Learning and Applications
(ICMLA). IEEE, 2018: 624-628.
Simonyan K, Zisserman A. Very deep convolutional
networks for large-scale image recognition[J]. arXiv
preprint arXiv:1409.1556, 2014.
Suthar M, Mahjoubfar A, Seals K, et al. Diagnostic tool for
pneumothorax[C]//2016 IEEE Photonics Society
Summer Topical Meeting Series (SUM). IEEE, 2016:
218-219.
Szegedy C, Liu W, Jia Y, et al. Going deeper with
convolutions[C]//Proceedings of the IEEE conference
on computer vision and pattern recognition. 2015: 1-9.
Wang H, Gu H, Qin P, et al. CheXLocNet: Automatic
localization of pneumothorax in chest radiographs
using deep convolutional neural networks[J]. PLoS
One, 2020, 15(11): e0242013.
Lecun Y, Bottou L, Bengio Y, Haffner P. Gradient-based
learning applied to document recognition[J].
Proceedings of the IEEE, 1998, 86(11): 2278-2324.
DOI: 10.1109/5.726791.