Deep Analysis of CNN Settings for New Cancer Whole-slide Histological Images Segmentation: The Case of Small Training Sets

Sonia Mejbri¹, Camille Franchet², Reshma Ismat-Ara¹, Josiane Mothe¹, Pierre Brousset² and Emmanuel Faure¹
¹Toulouse Institute of Computer Science Research, Toulouse, France
²The University Cancer Institute Toulouse, Oncopole, France
{Franchet.Camille, Brousset.Pierre}@iuct-oncopole.fr
Keywords: Breast Cancer, Histological Image Analysis, Convolutional Neural Networks, Deep Learning, Semantic Segmentation.
Abstract: Accurate analysis and interpretation of stained biopsy images is a crucial step in the cancer diagnostic routine, and it is mainly done manually by expert pathologists. The recent progress of digital pathology offers a challenging opportunity to automatically process these complex image data in order to retrieve essential information and to study tissue elements and structures. This paper addresses the task of tissue-level segmentation of histopathological breast cancer images at intermediate resolution. First, we present a new medical dataset we developed, composed of hematoxylin and eosin stained whole-slide images in which all 7 tissue types were labeled by hand and validated by an expert pathologist. Then, with this unique dataset, we propose an automatic end-to-end framework using deep neural networks for tissue-level segmentation. Moreover, we provide a deep analysis of the framework settings that can be used for similar tasks by the scientific community.
1 INTRODUCTION

Cancer is still a leading cause of death worldwide. Detecting breast cancer at an early stage of its development can help to treat it more easily and prevent the progression of the tumor; it is thus of huge importance (Torre et al., 2016). When a suspicious lesion is detected in the breast during a physical examination or a mammogram, additional tests are needed to determine whether it is a cancer and, if so, which kind of cancer it is. During biopsy, pathologists examine histological structures in order to provide an accurate diagnosis and several prognostic clues. Practically, pathologists need not only to observe the entire tissue slide at low magnification but also to navigate through different resolutions to be able to combine architectural and cytological information in order to produce their medical diagnosis. This process requires a lot of time and concentration and can be hampered by inter- and intra-individual variability (Loukas, 2013).
Latest technological advances in whole-slide imaging and the availability of considerable computational power have enabled digitizing pathology slides at microscopic resolution. This makes possible the evaluation of stained breast cancer sections with the help of computer vision. These approaches can guide some of the diagnostic routine tasks in order to assist pathologists in the medical decision-making process. This assistance can reduce the workload of the experts by saving time, reducing costs and, most importantly, improving diagnosis (Cruz-Roa et al.; Janowczyk and Madabhushi, 2016). In the context of breast cancer, several machine learning algorithms have been developed and applied to increase effectiveness in pathological tasks. For instance, researchers have proposed methods to detect nuclei, mitoses (Janowczyk and Madabhushi, 2016) and lymphocytes (Janowczyk and Madabhushi, 2016). These previous studies show several limitations that we address in this work. First, the images used in the cited approaches are only small samples of breast cancer or Tissue Micro Array (TMA) histological images at full resolution (Beck et al., 2011). Each image captures only a small sample of the full tumor extent, which is not representative of the whole-slide images (WSIs) used in routine diagnostic pathology. This problem has partially been addressed by distinguishing tumor patches from non-tumor patches (Wang et al., 2016).
Another limitation is that related studies consider only two categories of tissue (tumor and non-tumor), which is not representative of the complex structure of histological images: a typical section of solid tumor is a very heterogeneous structure. Moreover, previous work has focused on a single sub-type of breast carcinoma, invasive carcinoma (IC) (Cruz-Roa et al.), and does not take into account the non-invasive breast cancer type called "in situ carcinoma" despite its frequency (20 to 25% of newly diagnosed breast cancers). Reporting the presence of invasive and/or in situ carcinoma is a challenging part of a diagnostic pathology workup, since the treatment options for the disease differ significantly. Tissue-level segmentation at intermediate resolution might be crucial to identify areas where a full-resolution analysis should be performed.
There are very few whole-slide breast cancer datasets with pixel-level annotations. Regarding breast cancer pathological datasets, Spanhol et al. introduced the Breast Cancer Histopathological Image Classification dataset (BreakHis), which is composed of 2,480 benign and 5,429 malignant samples of microscopic images of breast tumor tissue (Spanhol et al.). However, these two categories of tissue are not enough, because they do not reflect the complexity of tissue diversity. To tackle this shortcoming, the Grand Challenge on Breast Cancer Histology Images (BACH) launched an annotated whole-slide image dataset (Aresta et al., 2018). The organization provided 10 pixel-wise annotated regions for the benign, in situ and invasive carcinoma classes present in an entire sampled tissue, which represents partially annotated masks. In recent years, deep learning models, especially convolutional neural networks (CNNs) (LeCun et al.), have emerged as a new and more powerful model for automatic segmentation of pathological images. The power of a CNN-based model lies in its deep architecture, which allows learning relevant features at lower levels of abstraction. Hou et al. (2016) proposed a patch-based CNN combined with a decision fusion model as a two-level (patch-based and image-based) model to classify WSIs into tumor subtypes and grades. Chen et al. proposed an encoder-decoder architecture for gland segmentation in benign and malignant tissue (Chen et al., 2016a). Cruz-Roa et al. presented a classification approach for detecting the presence and extent of invasive breast cancer on WSIs using a ConvNet classifier (Cruz-Roa et al.).
The greatest challenge in the medical imaging domain, especially in pathology, is dealing with small datasets and limited amounts of annotated samples, especially when employing supervised convolutional learning algorithms that require large amounts of labeled data for training. Previous studies that investigated the problem of breast cancer pathological image analysis did not provide a proper quantitative and qualitative evaluation of the parameters for training deep CNNs from scratch with only a few annotated samples.
Contributions

The contribution of this paper is twofold: first, since there is no publicly available annotated data for this task, we developed a new dataset; second, we conducted a set of experiments to evaluate several CNN architectures and settings on this new type of data. More precisely, we:

• developed a new dataset of WSIs with different subtypes of breast cancer. The dataset consists of 11 fully annotated whole-slide images.

• proposed a fully automatic framework. We applied machine learning algorithms to extract the predictive model; more precisely, we applied and adapted a patch-based deep learning approach on our new dataset. While our model relies on existing architectures (SegNet (Badrinarayanan and Kendall, 2017), U-Net (Ronneberger et al.), FCN (Long et al., 2015) and DeepLab (Chen et al., 2016b)), the originality of our work resides in a deep analysis of the parameters of the model.

• conducted several experiments to evaluate the settings of each step of the proposed framework, in order to find the optimal set of parameters when dealing with this new data for a tissue-level segmentation task.
The paper is organized as follows: in Section 2, we present the new dataset that we built. Section 3 presents the framework we developed, as well as an overview of the experiments and evaluation measures. Section 4 presents the details of the experiments and their results. Section 5 provides the main recommendations related to the influence of the model parameters. Section 6 concludes this paper and discusses future work.
2 NEW ANNOTATED DATASET
This work involved anonymized breast cancer slides from the archives of the pathology department of the Toulouse University Cancer Institute. The breast cancer images were acquired with a 3DHISTECH Panoramic Digital Slide Scanner. This selection was reviewed by an expert pathologist to confirm the presence of at least one of the two categories of carcinoma considered in this study. To describe the complexity of the tissue structures present in the images of breast cancer, the pathologist selected seven relevant types of tissue which are identified and analyzed during the biopsy routine of breast cancer pathology (Table 1).
Table 1: Tissue categories, their characteristics, and the corresponding average area present in the dataset.

Tissue label                    | Avg. area        | Tissue description
Invasive carcinoma (IC)         | 8.11% (±7.2%)    | carcinoma that spreads outside the ducts and invades the surrounding breast tissue
Ductal carcinoma in situ (DCIS) | 0.75% (±1.89%)   | carcinoma confined to the ducts
Benign epithelium               | 1.77% (±1.9%)    | non-malignant lesions in the tissue
Simple stroma                   | 18.57% (±8.58%)  | homogeneous composition; includes tumor stroma and fibrosis
Complex stroma                  | 8.57% (±6.2%)    | heterogeneous composition; a mixture of fibrous and adipose tissue
Adipose tissue                  | 21.5% (±11.31%)  | monotonous tissue, comprised mostly of adipocytes (fat-storing cells)
Artifacts                       | 1.15% (±1.09%)   | random noise due to the staining procedure and folds of tissue slices
Background                      | 43.96% (±8.53%)  | absence of tissue
To alleviate the burden of manual annotation and save the pathologist's time and effort in producing ground truth masks, the annotation of the whole images was first performed by a non-expert with basic knowledge of breast cancer histology. During this process, super-pixels were created using the multi-resolution segmentation function of the Definiens Developer XD image analysis software, with frequent manual intervention to modify the shape of the super-pixels in order to obtain an annotation as accurate as possible. Each super-pixel was then manually labeled with the corresponding type of tissue. Afterwards, an expert pathologist validated and corrected the wrongly classified tissues to produce the final ground truth multi-class masks (Figure 1). We obtained 11 whole-slide images validated by an expert pathologist. It should be noted that about 6 hours are required to annotate an entire breast cancer slide with the 7+background classes and about 2 hours for validation, which underlines the tedious and time-costly nature of this task.
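As an illustration, the following is a minimal sketch of such a super-pixel pre-annotation step, using scikit-image's SLIC as an open-source stand-in for the Definiens multi-resolution segmentation used here; the `annotator` callback is a hypothetical placeholder for the manual labeling step.

```python
import numpy as np
from skimage.segmentation import slic

def presegment_and_label(image, annotator, n_segments=5000):
    """Create super-pixels with SLIC, then assign each one a tissue label via
    a user-supplied callback `annotator(region_pixels) -> int` (a stand-in
    for the manual labeling performed in our workflow)."""
    segments = slic(image, n_segments=n_segments, compactness=10)
    mask = np.zeros(segments.shape, dtype=np.uint8)
    for sp in np.unique(segments):
        mask[segments == sp] = annotator(image[segments == sp])
    return mask
```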
Figure 1: (A) is an example of a whole-slide pathological image (I1) from the dataset and (B) is its respective manual annotation provided by an expert pathologist, with classes: invasive carcinoma, carcinoma in situ, benign epithelium, simple stroma, complex stroma, adipose tissue, and background.
3 A FRAMEWORK FOR TISSUE SEGMENTATION
3.1 Overview of the Framework
The breast cancer segmentation approach we developed adopts an end-to-end convolutional neural network framework (Figure 2). We implemented a machine learning workflow for multi-class segmentation, applied to the new WSIs, which can be divided into several steps:
1. Pre-processing: all images are normalized to reduce the color variability within the dataset.

2. Learning: patches are randomly extracted from each image of the training dataset and injected into the network adapted for multi-class semantic segmentation.
3. Prediction and reconstruction: after a close examination of the networks' behaviour, we observed that predictions near the patch borders are less accurate than those in the central area of the patches. To overcome this problem, the test image is tiled by sliding windows with a fixed stride; we then reassemble all overlapping predicted patches by applying a pixel-wise argmax over the class probabilities to obtain the whole predicted mask (see the sketch at the end of this subsection).
4. Evaluation: in order to understand and optimize each step of the framework, we evaluated the outcome of the framework using segmentation metrics.
In Section 4, we re-evaluate each step and its associated parameters in order to characterize this complex medical task.
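Below is a minimal sketch of the prediction and reconstruction step (step 3 above), assuming a trained Keras model with per-pixel softmax output; the function name, the averaging of overlapping probabilities before the argmax, and the omission of border remainders are our illustrative choices.

```python
import numpy as np

def predict_wsi(model, image, patch_size=96, stride=48, n_classes=8):
    """Tile a large image with a sliding window, predict each patch, and
    fuse the overlapping class-probability maps with a pixel-wise argmax."""
    h, w, _ = image.shape
    probs = np.zeros((h, w, n_classes), dtype=np.float32)   # accumulated probabilities
    counts = np.zeros((h, w, 1), dtype=np.float32)          # patches covering each pixel
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patch = image[y:y + patch_size, x:x + patch_size]
            pred = model.predict(patch[np.newaxis])[0]      # (patch, patch, n_classes)
            probs[y:y + patch_size, x:x + patch_size] += pred
            counts[y:y + patch_size, x:x + patch_size] += 1.0
    probs /= np.maximum(counts, 1.0)                        # average overlapping predictions
    return np.argmax(probs, axis=-1)                        # whole predicted label mask
```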
3.2 Network Architectures
Inspired by the work of (Long et al., 2015), many recent studies have shown the effectiveness of fully convolutional networks (FCNs) for this task. As one of the most popular pixel-level classification methods, the DeepLab models make use of fully connected conditional random fields (CRFs) as a post-processing step in their workflow to refine the segmentation result; DeepLab overcomes the poor localization property of deep networks by combining the responses at the final FCN layer with a CRF. Introducing skip connections has been shown to improve spatial recovery in the feature decoding process and to assist gradient flow to the decoder path. The SegNet segmentation architecture uses max-pooling indices to gradually recover feature detail and size thanks to its symmetrical architecture. The well-known U-shaped network U-Net features several steps of downsampling convolutions, followed by upsampling deconvolution layers. Unlike SegNet, whole feature maps from each downsampling layer are passed across the intermediate layers and concatenated with the corresponding upsampling layers.
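As a minimal illustration of this difference, the following sketch builds a toy U-Net-style block in Keras where the whole encoder feature map is concatenated with the upsampled decoder features (SegNet instead reuses only the max-pooling indices); all layer sizes are arbitrary assumptions.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Toy one-level encoder-decoder; sizes are illustrative only.
inp = keras.Input(shape=(96, 96, 3))
enc = layers.Conv2D(32, 3, padding="same", activation="relu")(inp)   # encoder features
down = layers.MaxPooling2D(2)(enc)
mid = layers.Conv2D(64, 3, padding="same", activation="relu")(down)
up = layers.UpSampling2D(2)(mid)                                     # decoder upsampling
skip = layers.concatenate([enc, up])   # U-Net style: pass the whole feature map across
out = layers.Conv2D(8, 1, padding="same", activation="softmax")(skip)
model = keras.Model(inp, out)
```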
3.3 Experiments Setup

Evaluation Metrics. We evaluate the performance of the models by measuring the overlap between automated and manual segmentations. We use the two following segmentation metrics: the Dice coefficient (DC), also called the overlap F1-score, and the Jaccard index (JI). Global metrics are not always adequate evaluation measures when class occurrences are unbalanced, which is the case in most medical applications, since they are biased by the dominant class(es). To avoid this, the metrics above are usually evaluated per class and their results averaged over the number of labeled classes, as in the sketch below.
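A minimal sketch of this per-class evaluation on predicted and ground-truth label masks (the handling of classes absent from both masks is our assumption):

```python
import numpy as np

def per_class_dice_jaccard(pred, gt, n_classes=8):
    """Per-class Dice coefficient and Jaccard index on integer label masks,
    macro-averaged over the labeled classes to limit dominant-class bias."""
    dc, ji = [], []
    for c in range(n_classes):
        p, g = (pred == c), (gt == c)
        denom = p.sum() + g.sum()
        if denom == 0:            # class absent from both masks: skip it
            continue
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        dc.append(2.0 * inter / denom)
        ji.append(inter / union)
    return float(np.mean(dc)), float(np.mean(ji))
```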
Training and Implementation Details. All experiments were performed using Keras with a TensorFlow backend. We used same-padding in the convolutional layers of all evaluated architectures, so output channels have the same dimensions as the input, and rectified linear units (ReLUs) as activation function. To reduce the number of parameters and speed up training, we replaced the last fully connected layer with a convolutional layer whose number of feature maps equals the number of predicted classes, trained with a loss function based on cross-entropy.

In every evaluation, we considered up to 9 images for training with a 20% separate validation split, and used the remaining 2 images to evaluate the models. We kept these sets of images throughout this study so we could compare our models. Each model was optimized with Adam (Kingma et al., 2014) for a pre-determined number of iterations fixed arbitrarily to 10, a batch size of 5, and an exponentially decaying learning rate initialized at 1e-5. Both classifiers were trained from scratch.
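The following sketch reproduces these optimization settings in Keras; the architecture is a toy placeholder, and the decay_steps/decay_rate values of the exponential schedule are assumptions not specified in the text.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Toy stand-in for the evaluated encoder-decoders: same-padding convolutions,
# ReLU activations, and a final 1x1 convolution with one feature map per class.
model = keras.Sequential([
    keras.Input(shape=(96, 96, 3)),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.Conv2D(8, 1, padding="same", activation="softmax"),
])
schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-5,          # "1e5" in the text, read here as 1e-5
    decay_steps=1000, decay_rate=0.96)   # schedule details are assumptions
model.compile(optimizer=keras.optimizers.Adam(learning_rate=schedule),
              loss="categorical_crossentropy")
# model.fit(patches, one_hot_masks, batch_size=5, epochs=10, validation_split=0.2)
```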
4 PARAMETER SETTINGS

For each step of the workflow, we evaluate the parameters and answer the challenging questions we encountered when we started to deal with the new data (see Figure 2). For our experiments, our review of the literature convinced us to explore and evaluate two of the CNN models cited above: U-Net and SegNet.
4.1 Variability of H&E Stained Images
Is a Normalization Step Necessary? Previous work (Vahadane et al., 2016; Macenko et al.; Sethi et al.) has shown that the standardization of colors brings a clear improvement to image segmentation results, and has proposed color normalization algorithms that standardize image appearance in order to minimize variability and undesirable artifacts within the image. As a pre-processing step, we applied two of the normalizations most used on pathological data: Macenko normalization and Vahadane normalization. We chose these approaches because initial empirical results showed that Macenko-normalized images obtained high discrimination between the two subtypes of cancer classes, whereas Vahadane-normalized images showed high differentiation between the epithelium and non-epithelium classes. One stained image in our dataset was chosen by an expert, according to the quality of its staining, to be the target image, and we normalized the other images to its color appearance. After evaluating the predictions, we observed that both normalizations slightly improve the results, specifically for the epithelium regions (Table 2).

Figure 2: Workflow of the training and test phases of the CNN classifiers for breast cancer image segmentation.
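As a sketch of this pre-processing step, stain normalization can be performed with the third-party staintools package (an assumption: the paper does not name its implementation, and the file names are hypothetical).

```python
import staintools

target = staintools.read_image("target_slide.png")         # expert-chosen reference
source = staintools.read_image("slide_to_normalize.png")   # hypothetical file names

normalizer = staintools.StainNormalizer(method="macenko")  # or method="vahadane"
normalizer.fit(target)                                     # learn target stain appearance
normalized = normalizer.transform(source)                  # map source colors onto it
```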
Does a Large Spectrum of Colors Contribute to or Mislead the Learning Process? Since contrast appears more strongly in grayscale, and gray levels can facilitate differentiating epithelium tissue structures from non-epithelium structures, we wanted to evaluate whether H&E grayscale images can improve the performance of our model. However, based on our experiments (Table 2), we found that our framework identifies tissues better on normalized RGB images than on grayscale images. The reason could be that grayscale images miss some relevant information that might be helpful for discriminating between different tissues with similar nuclei distributions, for example the invasive and in situ classes.
What Is the Minimum Number of H&E Images Necessary to Represent the Diversity of the Characterized Tissues? In this section, we answer the following question: what is the minimum amount of data needed to solve a semantic segmentation problem by training a CNN from scratch? This crucial question has not been explored in recent deep-learning-based medical imaging studies, and in particular in image pathology publications. To address it, we evaluated both SegNet and U-Net by varying the number of training images, randomly picking images from our dataset for each run. During the training phase, we observed two different behaviors: a consistent DC
improvement for SegNet as the number of training images increases, whereas U-Net seems to converge faster (see Figure 3). Conversely, during the prediction evaluation on WSIs, SegNet converges very quickly to an almost optimal result, whereas U-Net needs at least 6 images to get there. The main conclusion of this observation, which probably applies to datasets from various domains, is the importance of choosing the model according to the number of inputs, which is consistent with the results of similar work (Ronneberger et al.). In this study we kept SegNet as the baseline model for the other experiments, considering that it gives better results after a training phase on 9 images.

Figure 3: Comparison of different numbers of training images for the SegNet and U-Net models, during the training phase and the prediction evaluation.
Table 2: Quantitative comparison on the two test H&E images of the original (not normalized) images and three treatments (grayscale, Macenko and Vahadane normalizations). The table reports the pixel-wise evaluation per class and globally, in terms of JI and DC.

Tissues        | Original JI/DC | Greyscale JI/DC | Macenko JI/DC | Vahadane JI/DC
IC             | 0.28 / 0.43    | 0.26 / 0.41     | 0.37 / 0.55   | 0.35 / 0.51
DCIS           | 0.12 / 0.10    | 0.0 / 0.0       | 0.0 / 0.12    | 0.07 / 0.1
Benign epi     | 0.21 / 0.34    | 0.05 / 0.07     | 0.20 / 0.32   | 0.22 / 0.33
Stroma         | 0.76 / 0.86    | 0.71 / 0.83     | 0.79 / 0.88   | 0.79 / 0.88
Complex stroma | 0.33 / 0.5     | 0.28 / 0.43     | 0.29 / 0.45   | 0.32 / 0.48
Adipose        | 0.74 / 0.85    | 0.74 / 0.85     | 0.75 / 0.86   | 0.76 / 0.86
Artifacts      | 0.25 / 0.39    | 0.19 / 0.33     | 0.30 / 0.45   | 0.28 / 0.43
Background     | 0.95 / 0.97    | 0.95 / 0.97     | 0.96 / 0.97   | 0.96 / 0.97
Global         | 0.74 / 0.85    | 0.73 / 0.84     | 0.77 / 0.87   | 0.78 / 0.88
Table 3: Dice coefficient (DC) evaluation per tissue for the 2 test images (I1 & I2) with 9 training images, using four different segmentation neural networks.

Tissues        | U-Net I1/I2 | SegNet I1/I2 | FCN I1/I2   | DeepLab I1/I2
IC             | 0.72 / 0.39 | 0.57 / 0.43  | 0.66 / 0.33 | 0.65 / 0.39
DCIS           | 0.02 / 0.0  | 0.13 / 0.0   | 0.07 / 0.0  | 0.01 / 0.07
Benign epi     | 0.51 / 0.11 | 0.53 / 0.16  | 0.50 / 0.10 | 0.54 / 0.19
Simple stroma  | 0.84 / 0.90 | 0.85 / 0.90  | 0.83 / 0.89 | 0.85 / 0.90
Complex stroma | 0.41 / 0.58 | 0.40 / 0.53  | 0.32 / 0.57 | 0.37 / 0.55
Adipose        | 0.88 / 0.81 | 0.88 / 0.85  | 0.87 / 0.82 | 0.87 / 0.82
Artifacts      | 0.41 / 0.30 | 0.42 / 0.51  | 0.13 / 0.24 | 0.40 / 0.48
Global         | 0.86 / 0.86 | 0.87 / 0.88  | 0.86 / 0.86 | 0.86 / 0.87
4.2 Optimal Tiling for Large Images
How Much Does Data Augmentation Improve the Learning Process? In a segmentation learning task, data augmentation consists of applying various image transformations simultaneously to the raw images and their validated masks. In order to preserve breast tissue characteristics, we avoided transformations that cause texture deformation (like shearing or mirroring). H&E images are obviously rotation-invariant, and thus we first considered the rotation transformation. We applied a slightly different method, which consists in simultaneously extracting and rotating the original samples at random angles. This method provides rotation-invariance and prevents over-fitting of the model. We evaluated the impact of this rotation augmentation method and did not observe any improvement during training (DC=0.706 with rotation and DC=0.703 without). However, when evaluating the 2 test WSIs, the improvement using rotation-based data augmentation is very important (DC=0.876 with rotation and DC=0.433 without). Secondly, we applied elastic de-
formations (Simard et al., 2003) to the original extracted training samples. We chose this particular type of deformation because it seemed the most adequate to represent the natural variation of texture among the tissues. Due to the large number of patches (N_p = 5000) extracted from the raw images, we observed that elastic deformation did not improve the learning process, either during the training phase or
the prediction evaluation. This study confirms the importance of an appropriate data augmentation approach: considering the large dimensions of our WSIs, the overlap of the patches extracted from each image, combined with rotation, is sufficient for data augmentation.
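A minimal sketch of this extract-and-rotate augmentation (not the authors' exact implementation): a larger region is cropped so the rotated inner patch contains no empty corners, and the identical transform is applied to image and mask.

```python
import numpy as np
from scipy.ndimage import rotate

def extract_rotated_patch(image, mask, patch_size=96, rng=np.random):
    """Extract and rotate a training sample at a random angle, applying the
    identical transform to the raw image and its label mask."""
    angle = rng.uniform(0.0, 360.0)
    # Crop a larger region so the rotated inner patch has no empty corners.
    margin = int(np.ceil(patch_size * (np.sqrt(2) - 1) / 2)) + 1
    big = patch_size + 2 * margin
    y = rng.randint(0, image.shape[0] - big)
    x = rng.randint(0, image.shape[1] - big)
    img = rotate(image[y:y + big, x:x + big], angle, reshape=False, order=1)
    msk = rotate(mask[y:y + big, x:x + big], angle, reshape=False, order=0)  # nearest for labels
    c = slice(margin, margin + patch_size)
    return img[c, c], msk[c, c]
```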
What Is the Minimum Amount of Labeled Data? We evaluated two correlated parameters: the size S_p and the number N_p of randomly extracted patches per WSI (see the sketch at the end of this subsection). Regarding the size of patches, we looked at a large range from 96 to 384 pixels, using a ratio R such that S_p = 96 · R with R ∈ {1, 2, 3, 4}.
Figure 4 shows the training and prediction DC of our two chosen networks. Both models demonstrate different behaviors when S_p and N_p vary. SegNet shows a constant increase of its training curve because its architecture includes batch normalization.
But, starting from N_p = 2500, the DC remains constant.

Figure 4: Training accuracy over the number of training samples per image and the sample size, for the two classifiers: (a) U-Net and (b) SegNet.

A larger random sample could lead to a lot
of redundancy due to overlapping patches. Secondly, there is a trade-off between localization accuracy and the use of context: larger patches require more max-pooling layers that reduce the localization accuracy, while small patches allow the network to see more details but only little context.
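A minimal sketch of the random patch extraction governed by these two parameters (N_p patches of size S_p = 96 · R per WSI):

```python
import numpy as np

def sample_patches(image, mask, n_patches=2500, ratio=2, rng=np.random):
    """Randomly extract N_p patches of size S_p = 96 * R from one WSI,
    keeping image and ground-truth crops aligned."""
    size = 96 * ratio
    imgs, gts = [], []
    for _ in range(n_patches):
        y = rng.randint(0, image.shape[0] - size)
        x = rng.randint(0, image.shape[1] - size)
        imgs.append(image[y:y + size, x:x + size])
        gts.append(mask[y:y + size, x:x + size])
    return np.stack(imgs), np.stack(gts)
```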
5 RESULTS
Among the 11 WSIs in our dataset, we chose two representative WSIs for testing (Figure 1) to evaluate the final prediction performance of our framework. Table 3 shows the global as well as class-wise performance on the test images of the four networks' predictions for the 8 classes presented in Section 3. Even if the global scores over the entire images do not vary much, SegNet slightly outperforms the other networks. Thus, given the diversity and number of tissue categories, it is more informative to analyze the class-wise metrics (Table 3) to capture the differences between the evaluated models. The class-wise accuracy clearly shows that larger classes reach reasonable accuracy while smaller classes have lower accuracy. Epithelium classes and, in particular, the two carcinoma subtypes are more challenging for the models to segment than the non-epithelium classes; many of them occupy a small part of the whole image and appear infrequently, as shown in Figure 4. It is important to emphasize that U-Net displays better performance on invasive carcinoma (IC), surpassing SegNet by 15%, FCN by 8% and DeepLab by 9% in terms of DC score.
Visual Results. Figure 5 shows the visual results of our framework with the optimal settings using the four models. Even though we examined two test images with roughly equal DC and JI scores, we obtained different segmentation qualities.

Figure 5: From left to right, the two test ground truth masks and the U-Net, SegNet, FCN and DeepLab multi-class segmentation results using optimal parameters.
6 CONCLUSIONS & DISCUSSION
We proposed an end-to-end framework for a medical multi-class segmentation task. We first introduced a dataset of 11 H&E stained breast cancer images captured at intermediate resolution (20x magnification). We annotated the WSIs into 7 tissue categories plus background, which an expert pathologist determined to be important for the medical task. We proposed a deep analysis of network settings for image segmentation in order to determine the optimal configuration that can be used in similar tasks. The final results were evaluated using pixel-wise metrics: U-Net, SegNet, FCN and DeepLab obtained comparable scores, with DC of 0.86, 0.87, 0.86 and 0.86 respectively. The current study retains several limitations that we want to address in future work. Epithelium classes and artifacts remain challenging to detect due to the huge tissue variability among the WSIs; this may be improved with larger datasets and class-distribution-aware training techniques. A reason for the poor prediction performance on the carcinoma classes could lie in the encoder-decoder architecture, and network architectures that better capture epithelium details may improve the segmentation performance. Finally, a new metric is necessary to reflect the medical information, since the classical metrics capture detection quality without taking into account the importance of some classes over others.
REFERENCES

Aresta, G., Araújo, T., and Kwok (2018). BACH: Grand challenge on breast cancer histology images.

Badrinarayanan, V. and Kendall (2017). SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence.

Beck, A. H., Sangoi, A. R., and Leung (2011). Systematic analysis of breast cancer morphology uncovers stromal features associated with survival. Science Translational Medicine.

Chen, H., Qi, X., Yu, L., and Heng, P.-A. (2016a). DCAN: Deep contour-aware networks for accurate gland segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2487–2496.

Chen, L.-C., Papandreou, G., Kokkinos, I., Murphy, K., and Yuille, A. L. (2016b). DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs.

Cruz-Roa, A., Gilmore, H., et al. Accurate and reproducible invasive breast cancer detection in whole-slide images: A deep learning approach for quantifying tumor extent. Scientific Reports.

Hou, L., Samaras, D., Kurc, T. M., Gao, Y., Davis, J. E., and Saltz, J. H. (2016). Patch-based convolutional neural network for whole slide tissue image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.

Janowczyk, A. and Madabhushi, A. (2016). Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases. Journal of Pathology Informatics.

Kingma, D. P., Ba, J., et al. (2014). Adam: A method for stochastic optimization.

LeCun, Y., Bengio, Y., and Hinton, G. Deep learning. Nature.

Long, J., Shelhamer, E., and Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431–3440.

Loukas, C., et al. (2013). Breast cancer characterization based on image classification of tissue sections visualized under low magnification. Computational and Mathematical Methods in Medicine.

Macenko, M., Niethammer, M., Marron, J. S., et al. A method for normalizing histology slides for quantitative analysis. In Biomedical Imaging: From Nano to Macro, 2009 (ISBI'09), IEEE International Symposium on.

Ronneberger, O., Fischer, P., and Brox, T. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention.

Sethi, A., Sha, L., Vahadane, et al. Empirical comparison of color normalization methods for epithelial-stromal classification in H and E images. Journal of Pathology Informatics.

Simard, P. Y., Steinkraus, D., Platt, J. C., et al. (2003). Best practices for convolutional neural networks applied to visual document analysis.

Spanhol, F. A., Oliveira, L. S., Petitjean, C., and Heutte, L. A dataset for breast cancer histopathological image classification. IEEE Transactions on Biomedical Engineering.

Torre, L. A., Siegel, R. L., Ward, E. M., and Jemal, A. (2016). Global cancer incidence and mortality rates and trends—an update. Cancer Epidemiology and Prevention Biomarkers.

Vahadane, A., Peng, T., Sethi, A., Albarqouni, et al. (2016). Structure-preserving color normalization and sparse stain separation for histological images. IEEE Transactions on Medical Imaging.

Wang, D., Khosla, A., Gargeya, R., Irshad, H., and Beck, A. H. (2016). Deep learning for identifying metastatic breast cancer. arXiv preprint arXiv:1606.05718.