Evaluation of the Degree of Malignancy of Lung Nodules in Computed Tomography Images

L. Gonçalves, J. Novo, A. Cunha (1,3) and A. Campilho (1,4)

INESC TEC - INESC Technology and Science, FEUP Campus, Dr. Roberto Frias, 4200 - 465 Porto, Portugal

University of A Coruña, Department of Computer Science, Campus de Elviña, 15071, A Coruña, Spain

University of Trás-os-Montes and Alto Douro, 5000-801, Vila Real, Portugal

Faculty of Engineering of the University of Porto, FEUP Campus, Dr. Roberto Frias, 4200 - 465, Porto, Portugal

Keywords: Medical Diagnostic Imaging, Computer-aided Diagnosis, Computed Tomography, Machine Learning, Feature Extraction.

Abstract:

In lung cancer diagnosis, robust Computer-Aided Diagnosis (CAD) systems need to differentiate benign from malignant nodules adequately. This paper presents a CAD system for the classification of lung nodules in chest Computed Tomography (CT) scans as a means of diagnosing lung cancer. The proposed method measures a set of 295 heterogeneous characteristics, including morphology, intensity and texture features, which are used as input to different KNN and SVM classifiers.

The system was modeled and trained using a ground truth provided by specialists, taken from a public lung image dataset, the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI). This dataset includes chest CT scans with lung nodule locations together with information about the degree of malignancy, among other properties, provided by multiple expert clinicians. In particular, the computed degree of malignancy tries to follow the manual labeling by the different radiologists. Promising results were obtained with a first-order SVM with an exponential kernel, achieving an area under the receiver operating characteristic curve of 96.2 ± 0.5% when compared with the ground truth provided in the public CT lung image dataset.

1 INTRODUCTION

Lung cancer is currently the world's deadliest type of cancer and one of the main causes of death in developed countries. In the United States, it represents approximately 13% of all new cancer cases and 27% of all cancer deaths, with an alarming 5-year relative survival rate of only 18% (Siegel et al., 2016). As a reference, lung cancer causes more deaths than the second to fourth deadliest cancers (colon, breast and prostate) combined. One of the main causes of this situation is the difficulty of obtaining a good early diagnosis: in their first stages of development, nodules are very small and subtle and show poor contrast. Moreover, diagnosis is a complex process that demands from radiologists an exhaustive and tedious review of Computed Tomography (CT) scans (Breadsmoore and Screaton, 2003; van Ginneken, 2008). This complicates preventive screening programs of the kind that exist, for example, for breast cancer. Given the aggressiveness of this kind of cancer, early diagnosis is a crucial factor for the patient's survival.

Among the available medical imaging modalities, chest CT is one of the most widely used for lung cancer diagnosis, as its quality is adequate to identify the presence of lung nodules. Computer-aided Diagnosis (CAD) systems can play an important role in lung cancer diagnosis, as they can assist and facilitate the tedious, although relevant, work that has to be performed by radiologists. CAD systems are generally organized in 5 progressive stages. First, a preprocessing stage prepares the images for further analysis, for example by normalizing the scale or reducing noise. Next, an enhancement step can make some structures, such as lung nodules, more visible to ease the detection phase. Nodule segmentation is the next step, which accurately defines the nodule boundaries.

Once the nodule is segmented, an adequate nodule characterization is crucial for analysis and the corresponding malignancy differentiation. Measures of texture, gradient, shape or intensity information (Wu et al., 2013a; Akram et al., 2015; Liu et al., 2015) are usually applied in this context. Afterwards, feature selection can reduce the feature space by removing redundant and irrelevant features. Different strategies can be used for this purpose, for example the Fisher criterion and genetic algorithms (Liu et al., 2015). Using this information, different classification strategies can be applied to distinguish nodules according to their malignancy. Classifiers such as Support Vector Machines (SVM) or Random Forests (RF), among others, have been used for this purpose (Kaya and Can, 2015). The best state-of-the-art results for nodule malignancy classification were presented by Wu et al. (2013b), with an area under the receiver operating characteristic (ROC) curve (AUC) of 91%, obtained by comparing the results with nodules diagnosed by biopsy, follow-up or surgery.

The objective of this work was the development of a system that discriminates nodules by their malignancy, using as reference the annotated delineations of 4 radiologists.

Table 1: Weights for the radiologists' classification. First row: the labels associated with each degree of malignancy. Second row: p, the corresponding weight values.

Level of malignancy   1     2     3     4     5
Weight, p             0.15  0.15  0.20  0.25  0.25

Gonçalves L., Novo J., Cunha A. and Campilho A.

Evaluation of the Degree of Malignancy of Lung Nodules in Computed Tomography Images.

DOI: 10.5220/0006116200740080

In Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017), pages 74-80

ISBN: 978-989-758-227-1

2 MATERIALS

The dataset contains images from the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI) (Armato et al., 2011). The LIDC-IDRI is a public image dataset that includes diagnostic and lung cancer screening thoracic Computed Tomography scans with marked-up, annotated lesions. Each scan contains a variable number of slices, and each slice has a resolution of 512 × 512 pixels.

Each scan includes annotations provided by 4 different radiologists, who assigned a degree of malignancy on a scale of 1 to 5, where 1 means “Highly Unlikely for Cancer” and 5 means “Highly Suspicious for Cancer”. The opinions of the 4 radiologists quite often diverge. For that reason, we selected nodules with a clear malignancy classification, leaving as indeterminate those with considerable divergence.

As each nodule has a label assigned by each of the 4 radiologists, we combined the 4 labels by a weighted average of the individual labels. The estimated combined label of each sample, the degree of malignancy, M, of each nodule is defined by:

$$M = \sum_{k=1}^{4} dm(k) \cdot p(k) \qquad (1)$$

where p(k) is the weight value for each radiologist k and dm(k) is the label m assigned by that radiologist to the nodule.

Table 1 shows the weights that were used for p for each corresponding level of malignancy, m. We employed higher weights for the higher degrees of malignancy, labels 4 and 5, because we give more relevance to a radiologist classifying a nodule as highly suspicious of being malignant than to the lower levels, that is, unlikely to be cancer.

Therefore, a complete set of solid nodules annotated by 4 experienced radiologists was analyzed and used to construct the dataset. In the ground truth, we consider nodules with a final score equal to or higher than 4 as malignant45 (labels 4 and 5 correspond to significantly suspicious for cancer), and those with a final score equal to or lower than 2 as malignant12 (unlikely for cancer). The others are indeterminate.
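
The labelling rule above can be sketched in a few lines of Python. This is a minimal sketch and not the authors' code: it assumes Eq. (1) is a plain weighted sum in which each radiologist's label is multiplied by the Table 1 weight for that label, and the names `combined_score` and `ground_truth_class` are hypothetical.

```python
# Weights from Table 1, indexed by the malignancy level (1..5) a radiologist assigned.
WEIGHTS = {1: 0.15, 2: 0.15, 3: 0.20, 4: 0.25, 5: 0.25}


def combined_score(labels):
    """Weighted sum of the radiologists' labels (assumed reading of Eq. 1)."""
    return sum(label * WEIGHTS[label] for label in labels)


def ground_truth_class(score, low=2.0, high=4.0):
    """Map a combined score to the ground-truth group used in the text."""
    if score >= high:
        return "malignant45"
    if score <= low:
        return "malignant12"
    return "indeterminate"


# Example: four radiologists rate the same nodule.
print(ground_truth_class(combined_score([4, 4, 5, 5])))
```

Under this reading, four labels of 4 or 5 keep their full value because their weights sum to 1, while low labels are damped, which is consistent with the stated intent of giving more relevance to suspicious ratings.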

Finally, the resulting dataset contains 177 malignant45 and 121 malignant12 nodules, for a total of 298 nodules.

3 METHODOLOGY

The diagnosis process consists of the following main steps: feature measurement, feature selection, classification and validation. Once the pulmonary region of interest is delimited (Novo et al., 2014), the method takes as input the nodules segmented with a previously developed approach (Gonçalves et al., 2016). This approach uses the principle of central adaptive medialness, previously proposed for the lung nodule candidate detection task (Novo et al., 2015), which has also demonstrated its robustness in nodule segmentation.

Regarding feature measurement, a set of 295 features was defined in order to include a large and diverse set of features able to capture the properties of nodule malignancy. The feature set includes morphological features (volume, compactness or sphericity, among others), intensity features (different statistics) and texture features (Gray-Level Co-Occurrence Matrix (GLCM), Gray-Level Intensity Histogram (GLIH) or Gabor filters, among others).
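
As an illustration of one of the texture descriptors named above, the following is a minimal sketch of a gray-level co-occurrence matrix and its contrast statistic for a small 2D patch. The offset, the number of gray levels, the non-symmetric counting and the function names are assumptions made for the example; the paper does not state which GLCM settings or statistics were used.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix of gray levels at pixel offset (dx, dy).

    `image` is a list of rows of integers in the range 0..levels-1.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Hypothetical 3x3 patch quantised to 3 gray levels.
patch = [[0, 0, 1],
         [1, 2, 2],
         [0, 1, 2]]
print(glcm_contrast(glcm(patch, levels=3)))
```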

Feature selection was the next step, used to identify the most useful characteristics for the classification, thereby avoiding irrelevant and redundant features and facilitating the classification process. Feature selection was applied within 10-fold cross-validation using 2 different algorithms. The first, Correlation Feature Selection (CFS), analyses the strength of a feature in predicting the class of the object, and tends to give little importance to features that are highly correlated with each other. The second, the Relief F algorithm, samples instances randomly and checks the distance between them and their neighbours of the same and of different classes; a weight function then uses these distances to rank the features (Hall, 1999).
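
The CFS criterion described by Hall (1999) scores a subset of k features with the merit k·r_cf / sqrt(k + k(k−1)·r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. The sketch below only evaluates that merit for given correlation values; how the correlations are estimated and how subsets are searched are left out, and the function name is hypothetical.

```python
import math


def cfs_merit(feature_class_corr, feature_feature_corr):
    """CFS merit of a feature subset.

    feature_class_corr: one correlation with the class per feature.
    feature_feature_corr: square matrix of pairwise feature correlations.
    """
    k = len(feature_class_corr)
    mean_cf = sum(abs(r) for r in feature_class_corr) / k
    pairs = [abs(feature_feature_corr[i][j])
             for i in range(k) for j in range(i + 1, k)]
    mean_ff = sum(pairs) / len(pairs) if pairs else 0.0
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)


# Two equally predictive features: independent versus fully redundant.
independent = cfs_merit([0.8, 0.8], [[1.0, 0.0], [0.0, 1.0]])
redundant = cfs_merit([0.8, 0.8], [[1.0, 1.0], [1.0, 1.0]])
print(independent > redundant)
```

The comparison shows why redundant features are penalised: adding a second feature that is fully correlated with the first leaves the merit no higher than that of a single feature.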

Finally, different classifiers were evaluated using the selected features. In particular, three SVM classifiers were defined, using an exponential kernel of the form:

$$k(x, y) = \exp\left(-\|x - y\|^{\theta}\right) \qquad (2)$$

The SVMs used three values of θ, θ ∈ {1, 2, 3}, representing the degree of the exponential kernel. Three kNN classifiers with k ∈ {13, 15, 17} were also included. The classification was performed by 10-fold cross-validation on the second set, with 50 repetitions. The mean and standard deviation were calculated from the 50 AUC values, using the selected subset of features.
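
A minimal sketch of an exponential kernel with a degree parameter follows. It assumes that θ enters as a power of the Euclidean distance between two feature vectors; this reading of Eq. (2) is an assumption, since the text only states that θ is the degree of the kernel. The function names are hypothetical.

```python
import math


def exponential_kernel(x, y, theta=1):
    """exp(-||x - y||**theta) for two equal-length feature vectors (assumed form)."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-(dist ** theta))


def gram_matrix(samples, theta=1):
    """Pairwise kernel values between all samples."""
    return [[exponential_kernel(a, b, theta) for b in samples] for a in samples]


# Hypothetical 2-feature vectors, evaluated for the three degrees in the text.
samples = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
for theta in (1, 2, 3):
    print(theta, gram_matrix(samples, theta)[0][1])
```

A Gram matrix of this kind is the input expected by SVM implementations that accept precomputed kernels.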

4 RESULTS AND DISCUSSION

We trained and tested the method using the ground-truth dataset previously defined.

Regarding feature selection, with the CFS algorithm the features that appeared in more than 80% of the folds were selected and ranked according to the number of times they were selected. For Relief F, the highest-ranked features were chosen until the selected total equalled the number of features selected by CFS. Each method selected 12 features. The majority of the features chosen by both methods were texture features, though CFS also selected two geometric features and Relief F three intensity features. CFS selected a large number of GLCM and Laws features. CFS also included volume and compactness characteristics, which is coherent, as radiologists tend to consider compact and round nodules unlikely for cancer, and large and irregular nodules suspicious for cancer. Relief F selected different features that pay attention to the central calcification of the nodules, a property used for identifying nodule malignancy.
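
The fold-frequency rule described for CFS, and the matching cut-off for Relief F, can be sketched as follows. The feature names and fold contents in the example are invented for illustration, and the function names are hypothetical.

```python
from collections import Counter


def stable_features(selected_per_fold, min_fraction=0.8):
    """Features chosen in more than `min_fraction` of the folds, most frequent first."""
    n_folds = len(selected_per_fold)
    counts = Counter(f for fold in selected_per_fold for f in set(fold))
    kept = [(f, c) for f, c in counts.items() if c / n_folds > min_fraction]
    return sorted(kept, key=lambda fc: (-fc[1], fc[0]))


def top_ranked(ranking, n):
    """First n entries of a ranking (best first), as done for Relief F."""
    return ranking[:n]


# Invented selections over 10 folds.
folds = ([["glcm_contrast", "volume", "mean_intensity"]] * 8
         + [["glcm_contrast", "volume"]]
         + [["glcm_contrast"]])
cfs_subset = stable_features(folds)
relief_subset = top_ranked(["f7", "f2", "f9", "f4"], n=len(cfs_subset))
print(cfs_subset, relief_subset)
```

Note that a feature present in exactly 80% of the folds is excluded here, following the wording "more than 80%".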

The main classification performance results are shown in Table 2 and Table 3. In general terms, the AUC values for all the classifiers and feature subsets are satisfactorily high, with no considerable difference between them. Even so, the SVM classifiers provide the best results, outperforming all KNN classifiers, especially with the CFS subset. The lowest value overall was obtained with the 13-KNN and CFS, with an AUC of 93.2%. On the other hand, the SVMs with the CFS subset provided very similar results, so there is no clear conclusion on which classifier is the best. However, the third-degree SVM is slightly better than the others, with an AUC of 96.4% versus 96.3% and 96.2% for the second- and first-degree classifiers, respectively.

The classification results can also be checked in the ROC curves of Figure 1 and Figure 2, where the previous observations still hold. These ROC curves were constructed by defining progressive thresholds on the a priori probabilities of the classifiers, from 0 to 1 with a step of 0.01. Once again, the SVM classifiers are better than the KNN classifiers, and their performance is very similar to one another. Nevertheless, the Relief F subset improves the results of the KNN classifiers and decreases the performance of the SVMs.
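
The threshold sweep used to build the ROC curves can be sketched as below. The sketch assumes each nodule has a probability-like score for the positive class (here taken to be malignant45) and sweeps a decision threshold from 0 to 1 in steps of 0.01; the trapezoidal AUC and the function names are choices made for this example, and the scores are invented.

```python
def roc_points(scores, labels, step=0.01):
    """(false positive rate, true positive rate) pairs for thresholds 0, step, ..., 1.

    scores: probability-like score of the positive class per sample.
    labels: 1 for the positive class, 0 for the negative class.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = {(0.0, 0.0)}  # a threshold above every score predicts no positives
    for i in range(round(1 / step) + 1):
        t = i * step
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.add((fp / negatives, tp / positives))
    return sorted(points)


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve given sorted (fpr, tpr) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Invented scores for four nodules; label 1 stands for malignant45.
curve = roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
print(auc_trapezoid(curve))
```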

Different examples of correct classification are shown in Figure 3, with the corresponding degree of confidence. The degree of confidence is the posterior probability of a nodule belonging to a particular class (malignant12 or malignant45), that is, the degree of certainty that a classifier has in identifying a nodule as the corresponding malignant12 or malignant45 class. The left column shows malignant12 nodules and the right column shows malignant45 ones. In each image, the contour of the nodule is also drawn in the color of the class label (green for malignant12, red for malignant45), giving a clear idea of its morphology. As observed in the statistical results, most of the nodules were satisfactorily classified. We also include examples of misclassifications in Figure 4. In the first misclassification (case 72), the nodule presents a high degree of spiculation and a large size, characteristics that are mainly observed in malignant45 nodules. In the second misclassification (case 283), the small nodule has characteristics similar to malignant12 ones in terms of shape and size, but other relevant features are similar to those of malignant45 nodules, which leads to the final malignant45 labeling.
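
For a kNN classifier, one common way to obtain such a posterior probability is the fraction of the k nearest training samples that belong to the class. The paper does not state how its confidence values were computed, so the sketch below is only an illustration under that assumption, with invented 2-feature data and a hypothetical function name.

```python
import math


def knn_posterior(train_x, train_y, query, k, positive="malignant45"):
    """Fraction of the k nearest training samples labelled `positive`."""
    by_distance = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = [y for _, y in by_distance[:k]]
    return nearest.count(positive) / k


# Invented training set: three malignant12 and two malignant45 samples.
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0)]
train_y = ["malignant12", "malignant12", "malignant12", "malignant45", "malignant45"]
print(knn_posterior(train_x, train_y, (5.0, 5.5), k=3))
```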

VISAPP 2017 - International Conference on Computer Vision Theory and Applications

Figure 1: ROC curves of the classiﬁcation performances of the 6 classiﬁers and the 12 features selected by CFS algorithm.

Figure 2: ROC curves of the classiﬁcation performances of the 6 classiﬁers and the 12 features selected by Relief-F algorithm.


Figure 3: Examples of correct malignant12 (first column) and malignant45 (second column) nodule class labels. Confidence is presented as the posterior probability of a nodule belonging to a particular class (malignant12 or malignant45).

5 CONCLUSIONS

This paper presents a system for the diagnosis of lung cancer by means of lung nodule malignancy classification in chest CT scans. The system measures a set of 295 characteristics, from which the most representative ones are selected. Six classifiers, three KNN and three SVM, were evaluated. The best performance was achieved by a first-order SVM with an exponential kernel, providing an AUC value of 96.2 ± 0.5%, which is a promising result.

In future work, new features can be included and wrapper-based feature selection methods should be tested, as better results may be achieved by embedding the classifiers in the selection process.


Table 2: Classification results presented as the mean and standard deviation of 50 AUC (%) values, for 12 features selected by 2 model searches and 3 KNNs.

Area under curve, AUC (%)            13-KNN      15-KNN      17-KNN
Correlation Feature Selection, CFS   93.2 ± 0.8  93.5 ± 0.9  94.1 ± 0.7
Relief F                             94.7 ± 0.7  94.4 ± 0.8  94.4 ± 0.7

Table 3: Classification results presented as the mean and standard deviation of 50 AUC (%) values, for 12 features selected by 2 model searches and 3 SVMs.

Area under curve, AUC (%)            1-SVM       2-SVM       3-SVM
Correlation Feature Selection, CFS   96.2 ± 0.5  96.3 ± 0.6  96.4 ± 0.5
Relief F                             96.0 ± 0.6  96.3 ± 0.6  96.2 ± 0.6

Figure 4: Examples of incorrect malignant12 (first column) and malignant45 (second column) nodule class labels. Confidence is presented as the posterior probability of a nodule belonging to a particular class (malignant12 or malignant45).

Also, other classiﬁers like neural networks should be

implemented for classiﬁcation comparison.

ACKNOWLEDGEMENTS

This work is financed by the ERDF (European Regional Development Fund) through the Operational Programme for Competitiveness and Internationalisation (COMPETE 2020 Programme), and by National Funds through the Portuguese funding agency, FCT (Fundação para a Ciência e a Tecnologia), within the project with code POCI-01-0145-FEDER-016673 and the grant contract SFRH/BPD/85663/2012 (J. Novo).

REFERENCES

Akram, S., Javed, M., Hussain, A., Riaz, F., and Akram, M. (2015). Intensity-based statistical features for classification of lungs CT scan nodules using artificial intelligence techniques. Journal of Experimental and Theoretical Artificial Intelligence, 27:737–751.

Armato, S., McLennan, G., Bidaut, L., McNitt-Gray, M., Meyer, C., Reeves, A., Zhao, B., Aberle, D., Henschke, C., Hoffman, E., Kazerooni, E., MacMahon, H., Beeke, E. V., Yankelevitz, D., Biancardi, A., Bland, P., Brown, M., Engelmann, R., Laderach, G., Max, D., Pais, R., Qing, D., Roberts, R., Smith, A., Starkey, A., Batrah, P., Caligiuri, P., Farooqi, A., Gladish, G., Jude, C., Munden, R., Petkovska, I., Quint, L., Schwartz, L., Sundaram, B., Dodd, L., Fenimore, C., Gur, D., Petrick, N., Freymann, J., Kirby, J., Hughes, B., Casteele, A., Gupte, S., Sallamm, M., Heath, M., Kuhn, M., Dharaiya, E., Burns, R., Fryd, D., Salganicoff, M., Anand, V., Shreter, U., Vastagh, S., and Croft, B. (2011). The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Medical Physics, 38:915–931.

Breadsmoore, C. J. and Screaton, N. J. (2003). Classification, staging and prognosis of lung cancer. European Journal of Radiology, 45:8–17.

Gonçalves, L., Novo, J., and Campilho, A. (2016). Hessian based approaches for 3D lung nodule segmentation. Expert Systems with Applications, 61:1–15.

Hall, M. A. (1999). Correlation-based feature selection for machine learning. PhD thesis, The University of Waikato.

Kaya, A. and Can, A. (2015). A weighted rule based method for predicting malignancy of pulmonary nodules by nodule characteristics. Journal of Biomedical Informatics, 56:69–79.

Liu, X., Ma, L., Song, L., Zhao, Y., Zhao, X., and Zhou, C. (2015). Recognizing common CT imaging signs of lung diseases through a new feature selection method based on Fisher criterion and genetic optimization. IEEE Journal of Biomedical and Health Informatics, 19:635–647.

Novo, J., Gonçalves, L., Mendonça, A., and Campilho, A. (2015). 3D lung nodule candidates detection in multiple scales. IAPR International Conference on Machine Vision Applications, MVA 2015, pages 5–8.

Novo, J., Rouco, J., Mendonça, A., and Campilho, A. (2014). Reliable lung segmentation methodology by including juxtapleural nodules. International Conference on Image Analysis and Recognition, ICIAR 2014. Lecture Notes in Computer Science: Image Analysis and Recognition, 8815:227–235.

Siegel, R. L., Miller, K. D., and Jemal, A. (2016). Cancer statistics, 2016. CA: A Cancer Journal for Clinicians, 66:7–30.

van Ginneken, B. (2008). Computer-aided diagnosis in thoracic computed tomography. Imaging Decisions MRI, 12:11–22.

Wu, H., Sun, T., Wang, J., Li, X., Wang, W., Huo, D., Lv, P., He, W., Wang, K., and Guo, X. (2013a). Combination of radiological and gray level co-occurrence matrix textural features used to distinguish solitary pulmonary nodules by computed tomography. Society for Imaging Informatics in Medicine, 26:797–802.

Wu, H., Sun, T., Wang, J., Li, X., Wang, W., Huo, D., Lv, P., He, W., Wang, K., and Guo, X. (2013b). Combination of radiological and gray level co-occurrence matrix textural features used to distinguish solitary pulmonary nodules by computed tomography. Society for Imaging Informatics in Medicine, 26:797–802.
