Improving Mitosis Detection via UNet-Based Adversarial Domain Homogenizer
Tirupati Saketh Chandra*a, Sahar Almahfouz Nasser*b, Nikhil Cherian Kurianc and Amit Sethid
Electrical Engineering Department, Indian Institute of Technology Bombay, Mumbai, Maharashtra, India
a https://orcid.org/0000-0002-0325-5821, b https://orcid.org/0000-0002-5063-9211, c https://orcid.org/0000-0003-1713-0736, d https://orcid.org/0000-0002-8634-1804
* Indicates Equal Contribution
Keywords:
MIDOG, Domain Generalization, Mitosis Detection, Domain Homogenizer, Auto-Encoder.
Abstract:
The effective counting of mitotic figures in cancer pathology specimens is a critical task for deciding tumor
grade and prognosis. Automated mitosis detection through deep learning-based image analysis often fails
on unseen patient data due to domain shifts in the form of changes in stain appearance, pixel noise, tissue
quality, and magnification. This paper proposes a domain homogenizer for mitosis detection that attempts to
alleviate domain differences in histology images via adversarial reconstruction of input images. The proposed
homogenizer is based on a U-Net architecture and can effectively reduce domain differences commonly seen
with histology imaging data. We demonstrate our domain homogenizer’s effectiveness by showing a reduction
in domain differences between the preprocessed images. Using this homogenizer with a RetinaNet object
detector, we were able to outperform the baselines of the 2021 MIDOG challenge in terms of average precision
of the detected mitotic figures.
1 INTRODUCTION
In many practical applications of machine learning models, domain shift occurs after training, wherein
the characteristics of the test data are different from
the training data. Particularly in the application of
deep neural networks (DNNs) to pathology images,
the test data may have different colors, stain con-
centrations, and magnification compared to what the
DNN was trained on due to changes in scanner, stain-
ing reagents, and sample preparation protocols. MIDOG 2021 (Aubreville et al., 2021) (organized with MICCAI 2021) was the first challenge to address the problem of domain shift in pathology, in this case due to the scanner, which is one of the reasons behind the failure of machine learning models after training, including those for mitosis detection.
Domain generalization is the set of techniques that
improve the prediction accuracy of machine learning
models on data from new domains without assuming
access to those data during training. Proposing and
testing various domain generalization techniques was the main goal of the MIDOG 2021 challenge.
In this paper, we present an extension of the method we proposed for MIDOG 2021 (Almahfouz Nasser et al., 2021). Our contribution is
three-fold and can be summarised as follows. Firstly,
we modified (Almahfouz Nasser et al., 2021) by shift-
ing the domain classifier from the latent space to the
end of the autoencoder, which improved the results
drastically. Secondly, we showed the importance of
perceptual loss in preserving the semantic informa-
tion, which affects the final accuracy of the object de-
tection part. Finally, unlike our previous work, train-
ing the auto-encoder along with the object detection
network end-to-end improved the quality of the ho-
mogenized outputs substantially.
2 RELATED WORK
In this section, we review the most notable solutions to the MIDOG 2021 challenge before introducing our proposed method.
In (Wilm et al., 2021), the authors modified the RetinaNet network (Lin et al., 2017a) for mitosis detection by adding a domain classification head and a
gradient reversal layer to encourage domain agnosticism. In this work, they used a pre-trained ResNet18 for the encoder. Their discriminator was a simple sequence of three convolutional blocks and a fully connected layer. The domain classifier was placed at the bottleneck of the encoder. Breen et al. (Breen et al., 2021) proposed a U-Net type architecture that outputs a probability map of the mitotic figures. These probabilities are then converted into bounding boxes around the mitotic figures. They used neural style transfer (NST) as a domain adaptation technique, which casts the style of one image onto the content of another. The method proposed by (Chung et al., 2021) consists of two parts, a patch selection module and a style transfer module. To learn the styles of images from different scanners, they used a StarGAN (Choi et al., 2018). A two-step domain-invariant mitosis detection method was proposed by (Nateghi and Pourakpour, 2021). This method is based on Fast R-CNN (Girshick, 2015). For domain generalization, they used the StainTools software (Byfield and Gamper, 2022) to augment the images. The StainTools package decomposes an image into two matrices, a concentration matrix C and a stain matrix S; by combining the C and S matrices from different images, they produced augmented images. A cascaded pipeline of a Mask R-CNN (He et al., 2017) followed by a classification ensemble was proposed by (Fick et al., 2021) to detect mitotic candidates, with a CycleGAN (Zhu et al., 2017) used to transfer every scanner domain to every other scanner domain. In (Jahanifar et al., 2021), the authors used the stain normalization method proposed by (Vahadane et al., 2016) as a preprocessing step for the images. (Dexl et al., 2021) combined hard negative mining with extensive data augmentation for domain generalization. Stain normalization techniques such as (Reinhard et al., 2001) and (Vahadane et al., 2015) were used in (Long et al., 2021) to account for the domain differences between images. Almahfouz Nasser et al. (Almahfouz Nasser et al., 2021) proposed an autoencoder trained adversarially on the sources of domain variations; this autoencoder makes the appearance of images uniform across different domains.
In the rest of the paper, we describe our proposed method, the data, and the experiments. We then present qualitative and quantitative results of our method and conclude with the take-home message of this work.
3 METHODOLOGY
3.1 Notations
In domain generalization there are source (seen)
domains, which are shown to the model during
training, and there are target (unseen) domains,
which are used only during testing. Labelled samples from the source domains are represented by $D^{ls}=\{(x_i^{ls}, y_i^{ls})\}_{i=1}^{N^{ls}}$, unlabelled source domains are represented by $D^{us}=\{x_i^{us}\}_{i=1}^{N^{us}}$, and labelled target domains are represented by $D^{lus}=\{(x_i^{lus}, y_i^{lus})\}_{i=1}^{N^{lus}}$. Let the source images from all subsets be represented by $D^{s}=D^{ls}\cup D^{us}$.
3.2 Adversarial End-to-End Trainable
Architecture
Inspired by the work of (Ganin and Lempitsky, 2015),
we have used an encoder-decoder network to trans-
late the patches from different domains (scanners) to a
common space. The translated images are then passed
through RetinaNet for object detection (Lin et al., 2017a). The architecture also consists of an adversarial
head with domain classification as an auxiliary task.
This head encourages the encoder-decoder network to
erase all the domain-specific information using a gra-
dient reversal layer. The architecture of our method is
shown in Figure 1.
Figure 1: The pipeline of our proposed method for mitosis
detection.
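To make the wiring concrete, below is a minimal PyTorch sketch of this pipeline, not the exact implementation: the module names (`GradReverse`, `DomainClassifier`) and layer sizes are our assumptions, the gradient reversal layer follows (Ganin and Lempitsky, 2015), and the detector could be, e.g., torchvision's RetinaNet.

```python
import torch
import torch.nn as nn
from torch.autograd import Function

class GradReverse(Function):
    """Identity in the forward pass; negates (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None

class DomainClassifier(nn.Module):
    """Auxiliary adversarial head: predicts the source scanner from the
    homogenized image placed at the output of the encoder-decoder."""
    def __init__(self, num_domains=3, lam=1.0):
        super().__init__()
        self.lam = lam
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, num_domains),
        )

    def forward(self, x):
        # Gradient reversal pushes the homogenizer to erase domain cues.
        return self.net(GradReverse.apply(x, self.lam))

# Sketch of the forward pass (homogenizer: any U-Net-style encoder-decoder):
# x_hat = homogenizer(x)                  # domain-homogenized image
# det_losses = retinanet(x_hat, targets)  # bounding-box + focal losses
# dom_logits = domain_head(x_hat)         # adversarial domain prediction
```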
3.3 Training Objectives
The object detection loss consists of a bounding box loss ($L_{bb}$) and an instance classification loss ($L_{inst}$). The bounding box loss ($L_{bb}$) is computed as a smooth L1 loss, and the focal loss (Lin et al., 2017b) is used for the instance classification loss ($L_{inst}$). With $p_k$ denoting the predicted probability that an instance belongs to class $k$, the focal loss is given by

$$FL(p_k) = -(1 - p_k)^{\gamma} \log(p_k) \qquad (1)$$
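As a sketch, Equation 1 translates directly into a few lines of PyTorch; the function name and the default $\gamma = 2$ below are our assumptions (torchvision also ships a `sigmoid_focal_loss` utility that folds in the one-vs-all formulation).

```python
import torch

def focal_loss(p_k: torch.Tensor, gamma: float = 2.0) -> torch.Tensor:
    """Eq. 1: down-weights easy examples where p_k is already close to 1."""
    return -((1.0 - p_k) ** gamma) * torch.log(p_k)
```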
In order to ensure that the images translated by the encoder-decoder network retain the semantic information, a perceptual loss ($L_{percp}$) is used. We use the perceptual loss based on a pretrained VGG-16, as proposed in (Johnson et al., 2016). The perceptual loss is given by Equation 2:

$$L^{(j)}_{percp} = \frac{1}{C_j H_j W_j} \left\| \Phi_j(\hat{x}) - \Phi_j(x) \right\|_2^2 \qquad (2)$$

where $x$ and $\hat{x}$ are the input and reconstructed images, respectively, $\Phi$ is the pretrained frozen VGG network, and $j$ is the level of the feature map (zero-indexed) of size $C_j \times H_j \times W_j$ obtained from $\Phi$. At the end of the adversarial head, we use a standard cross-entropy loss ($L_{CE}$) for domain classification.
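A minimal sketch of Equation 2 in PyTorch, assuming torchvision's VGG-16 and that level $j$ indexes the usual relu1_2/relu2_2/relu3_3/relu4_3 feature maps; the slice indices are our assumption, not taken from the paper.

```python
import torch
import torch.nn as nn
import torchvision

class PerceptualLoss(nn.Module):
    """Feature reconstruction loss of (Johnson et al., 2016) at level j."""
    def __init__(self, j: int = 1):
        super().__init__()
        vgg = torchvision.models.vgg16(weights="IMAGENET1K_V1").features
        cut = [4, 9, 16, 23][j]          # ends of relu1_2/relu2_2/relu3_3/relu4_3
        self.phi = nn.Sequential(*list(vgg.children())[:cut]).eval()
        for p in self.phi.parameters():  # Phi stays frozen
            p.requires_grad = False

    def forward(self, x_hat: torch.Tensor, x: torch.Tensor) -> torch.Tensor:
        f_hat, f = self.phi(x_hat), self.phi(x)
        c, h, w = f.shape[1:]
        # squared L2 distance, normalized by C_j * H_j * W_j, averaged over batch
        return ((f_hat - f) ** 2).sum(dim=(1, 2, 3)).div(c * h * w).mean()
```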
The overall loss for the end-to-end training is given by

$$L = \mathbb{E}_{(x,y) \sim D^{ls}}\left[L_{bb} + L_{inst}\right] + \mathbb{E}_{x \sim D^{s}}\left[\lambda_1 L^{(j)}_{percp} + \lambda_2 L_{CE}\right] \qquad (3)$$
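Assembled per batch, Equation 3 could look like the following sketch; the variable names are illustrative, with `det_losses` standing for the detector's loss dictionary and `lam1`/`lam2` for $\lambda_1$/$\lambda_2$.

```python
def total_loss(det_losses, percp_loss, domain_ce, lam1=10.0, lam2=25.0):
    """Eq. 3: detection terms on labelled data, perceptual and adversarial
    domain terms on all source data."""
    return (det_losses["bbox"] + det_losses["cls"]
            + lam1 * percp_loss + lam2 * domain_ce)
```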
4 DATA AND EXPERIMENTS
4.1 Dataset
The experiments were conducted on the MIDOG 2021 dataset (Aubreville et al., 2021), which consists of 50 whole slide images of breast cancer from four scanners, namely Hamamatsu XR NanoZoomer 2.0, Hamamatsu S360, Aperio ScanScope CS2, and Leica GT450, forming four domains. Two classes of objects are to be detected, namely mitotic figures and hard negatives. The whole slide images from the scanners other than the Leica GT450 are labelled. Small patches of size 512 × 512 are mined for supervised end-to-end training such that each patch contains at least one mitotic figure or hard negative.
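This patch mining step could be sketched as follows; the annotation format and the choice to centre each patch on an annotated cell are our assumptions.

```python
import numpy as np

def mine_patches(slide: np.ndarray, cells, size: int = 512):
    """Yield size x size crops, each centred on an annotated cell
    (mitotic figure or hard negative), clipped to the slide bounds."""
    h, w = slide.shape[:2]
    for cx, cy, label in cells:  # cells: (x, y, class) tuples
        x0 = int(np.clip(cx - size // 2, 0, w - size))
        y0 = int(np.clip(cy - size // 2, 0, h - size))
        yield slide[y0:y0 + size, x0:x0 + size], label
```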
The seen and unseen domains, i.e., the scanners, are $D^{ls}$ = {Hamamatsu XR NanoZoomer 2.0, Hamamatsu S360}, $D^{us}$ = {Leica GT450}, and $D^{lus}$ = {Aperio ScanScope CS2} (refer to Section 3.1 for the notation).
4.2 Implementation Details
The model is implemented using the PyTorch (Paszke et al., 2019) library. For supervised end-to-end training, a batch size of 12 is used, with an equal number of patches included from each scanner. The model is trained using the FastAI (Howard et al., 2018) library's default settings with an initial learning rate of $10^{-4}$. In Equation 3, we set the hyperparameters to $j = 1$, $\lambda_1 = 10$, and $\lambda_2 = 25$. These values were chosen by a grid search over a range of values; further tuning of these values may yield better results.
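For concreteness, a single adversarial training step under these settings might look like the sketch below; the optimizer handling, the loss-dictionary keys (those of torchvision's RetinaNet), and the `homogenizer`/`detector`/`domain_head` names are assumptions carried over from the earlier sketches.

```python
import torch.nn.functional as F

def train_step(batch, homogenizer, detector, domain_head, percp_loss, opt):
    x, targets, domain_labels = batch     # patches, boxes, scanner ids
    x_hat = homogenizer(x)                # homogenized images
    det = detector(list(x_hat), targets)  # dict of detection losses
    loss = (det["bbox_regression"] + det["classification"]
            + 10.0 * percp_loss(x_hat, x)                       # lambda_1
            + 25.0 * F.cross_entropy(domain_head(x_hat),
                                     domain_labels))            # lambda_2
    opt.zero_grad()
    loss.backward()                       # domain gradients reversed by the GRL
    opt.step()
    return loss.item()
```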
Our code is available on GitHub (Almahfouz Nasser et al.).
4.3 Results
Two classes of objects are detected: hard negatives and mitotic figures. The models are evaluated on $D^{ls} \cup D^{lus}$. A standard object detection metric, the average precision (AP) at an intersection-over-union (IoU) threshold of 0.5, introduced in the PASCAL VOC challenge (Everingham et al., ), is used for evaluation. It represents the average of the precision values obtained at various bounding box confidence thresholds.
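A simplified sketch of this metric is given below; matching detections to ground truth at IoU ≥ 0.5 is assumed to have been done already, and the interpolation that PASCAL VOC applies to the precision-recall curve is omitted.

```python
import numpy as np

def average_precision(scores, is_tp, num_gt):
    """AP as the area under the precision-recall curve obtained by sweeping
    the confidence threshold over ranked detections."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.cumsum(np.asarray(is_tp, dtype=float)[order])
    fp = np.cumsum(1.0 - np.asarray(is_tp, dtype=float)[order])
    recall = tp / num_gt
    precision = tp / (tp + fp)
    return float(np.trapz(precision, recall))
```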
End-to-end training with (AEC RetinaNet + Pecp) and without (AEC RetinaNet) the perceptual loss was tried. The results are compared with the reference algorithm DA RetinaNet (Wilm et al., 2021) and with RetinaNet (Lin et al., 2017a), with and without data augmentation. The results obtained are shown in Table 1.
The count of mitotic figures is the clinically important goal, so the performance on the mitotic figure class was our focus. The results in Table 1 show that the newly designed end-to-end training architecture performs better than the reference algorithm and the basic RetinaNet-based algorithms in terms of detection performance for the mitotic figure class.
Figure 2: The precision vs recall plot, which represents
the values of precision and recall at various IoU thresholds,
shows that the newly designed end-to-end model performs
better than the baselines in terms of recall values without
compromising on the precision.
The precision-recall plot shown in Figure 2 represents the values of precision and recall at various IoU thresholds.
Table 1: Results obtained using end-to-end training of models.
Model AP-Hard Neg AP-Mitotic figures mAP
RetinaNet 0.196 0.352 0.274
RetinaNet + Aug 0.238 0.619 0.429
DA RetinaNet 0.347 0.655 0.501
AEC RetinaNet 0.289 0.448 0.369
AEC RetinaNet + Pecp 0.248 0.720 0.484
Figure 3: A visual comparison of the performance of the domain homogenizer and the modified domain homogenizer (proposed method) on a randomly sampled set of patches. The modified version transforms the images from the various scanners into a common space, so the translated images cannot be visually distinguished on the basis of scanner, whereas the original domain homogenizer simply reproduced the input images.
Our method is able to achieve a better recall without compromising on the precision.
The perceptual loss added at the output of the decoder helps in retaining the semantic information, which in turn enables better object detection. This is also validated by the higher AP score obtained when the perceptual loss component is added.
As shown in Figure 3, the modified domain homogenizer produced much more plausible images than the original domain homogenizer (Almahfouz Nasser et al., 2021). In addition, Figure 4 shows the detection accuracy of our proposed method.
5 CONCLUSIONS
In this paper, we proposed a modified version of our previously proposed domain homogenizer and tested it on the data from the MIDOG 2021 challenge. We showed that the position of the domain classifier has a significant impact on the performance of the homogenizer.
Figure 4: Two examples illustrating the results of our proposed method from Table 1, i.e., our method is able to detect the mitotic figures accurately but not the hard negatives.
Shifting the adversarial head from the latent space to the output of the autoencoder helps in erasing all the domain-specific information, because the reversed gradient of the domain loss flows back through both the decoder and the encoder, unlike in the previous arrangement. Additionally, our experiments revealed that training the homogenizer along with the object detection network end-to-end improves the detection accuracy by a significant margin. Finally, we showed that our method substantially improves upon the baseline of the MIDOG challenge in terms of mitotic figure detection.
REFERENCES
Almahfouz Nasser, S., Chandra, T., Kurian, N., and Sethi, A. Improving Mitosis Detection Via UNet-based Adversarial Domain Homogenizer. https://github.com/MEDAL-IITB/MIDOG.git.
Almahfouz Nasser, S., Kurian, N. C., and Sethi, A. (2021).
Domain generalisation for mitosis detection exploiting
preprocessing homogenizers. In International Con-
ference on Medical Image Computing and Computer-
Assisted Intervention, pages 77–80. Springer.
Aubreville, M., Bertram, C., Veta, M., Klopfleisch, R.,
Stathonikos, N., Breininger, K., ter Hoeve, N.,
Ciompi, F., and Maier, A. (2021). Mitosis domain
generalization challenge. In 24th International Con-
ference on Medical Image Computing and Computer
Assisted Intervention (MICCAI 2021), pages 1–15.
Breen, J., Zucker, K., Orsi, N. M., and Ravikumar, N.
(2021). Assessing domain adaptation techniques
for mitosis detection in multi-scanner breast cancer
histopathology images. In International Conference
on Medical Image Computing and Computer-Assisted
Intervention, pages 14–22. Springer.
Choi, Y., Choi, M., Kim, M., Ha, J.-W., Kim, S., and Choo,
J. (2018). Stargan: Unified generative adversarial net-
works for multi-domain image-to-image translation.
In Proceedings of the IEEE conference on computer
vision and pattern recognition, pages 8789–8797.
Chung, Y., Cho, J., and Park, J. (2021). Domain-robust
mitotic figure detection with style transfer. In In-
ternational Conference on Medical Image Comput-
ing and Computer-Assisted Intervention, pages 23–
31. Springer.
Dexl, J., Benz, M., Bruns, V., Kuritcyn, P., and Wittenberg,
T. (2021). Mitodet: Simple and robust mitosis de-
tection. In International Conference on Medical Im-
age Computing and Computer-Assisted Intervention,
pages 53–57. Springer.
Everingham, M., Van Gool, L., Williams, C.
K. I., Winn, J., and Zisserman, A. The
PASCAL Visual Object Classes Challenge
2012 (VOC2012) Results. http://www.pascal-
network.org/challenges/VOC/voc2012/workshop/index.html.
Fick, R. H., Moshayedi, A., Roy, G., Dedieu, J., Petit,
S., and Hadj, S. B. (2021). Domain-specific cycle-
gan augmentation improves domain generalizability
for mitosis detection. In International Conference on
Medical Image Computing and Computer-Assisted In-
tervention, pages 40–47. Springer.
Ganin, Y. and Lempitsky, V. (2015). Unsupervised do-
main adaptation by backpropagation. In International
conference on machine learning, pages 1180–1189.
PMLR.
Girshick, R. (2015). Fast r-cnn. In Proceedings of the IEEE
international conference on computer vision, pages
1440–1448.
He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017). Mask R-CNN. In Proceedings of the IEEE international conference on computer vision, pages 2961–2969.
Howard, J., Thomas, R., and Gugger, S. (2018). fastai. Available at: https://github.com/fastai/fastai.
Jahanifar, M., Shepard, A., Zamanitajeddin, N., Bashir, R.,
Bilal, M., Khurram, S. A., Minhas, F., and Rajpoot,
N. (2021). Stain-robust mitotic figure detection for
the mitosis domain generalization challenge. In In-
ternational Conference on Medical Image Comput-
ing and Computer-Assisted Intervention, pages 48–
52. Springer.
Johnson, J., Alahi, A., and Fei-Fei, L. (2016). Perceptual
losses for real-time style transfer and super-resolution.
In European conference on computer vision, pages
694–711. Springer.
Lin, T., Goyal, P., Girshick, R., He, K., and Dollár, P. (2017a). Focal loss for dense object detection. In 2017 IEEE International Conference on Computer Vision (ICCV).
Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollár, P. (2017b). Focal loss for dense object detection. In 2017 IEEE International Conference on Computer Vision (ICCV), pages 2999–3007.
Long, X., Cheng, Y., Mu, X., Liu, L., and Liu, J.
(2021). Domain adaptive cascade r-cnn for mito-
sis domain generalization (midog) challenge. In In-
ternational Conference on Medical Image Comput-
ing and Computer-Assisted Intervention, pages 73–
76. Springer.
Nateghi, R. and Pourakpour, F. (2021). Two-step domain
adaptation for mitotic cell detection in histopathology
images. In International Conference on Medical Im-
age Computing and Computer-Assisted Intervention,
pages 32–39. Springer.
Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J.,
Chanan, G., Killeen, T., Lin, Z., Gimelshein, N.,
Antiga, L., et al. (2019). Pytorch: An imperative style,
high-performance deep learning library. Advances in
neural information processing systems, 32.
Byfield, P. and Gamper, J. (Accessed 03 Aug 2022). StainTools homepage. https://github.com/Peter554/StainTools.
Reinhard, E., Adhikhmin, M., Gooch, B., and Shirley, P.
(2001). Color transfer between images. IEEE Com-
puter graphics and applications, 21(5):34–41.
Vahadane, A., Peng, T., Albarqouni, S., Baust, M., Steiger,
K., Schlitter, A. M., Sethi, A., Esposito, I., and Navab,
N. (2015). Structure-preserved color normalization
for histological images. In 2015 IEEE 12th Inter-
national Symposium on Biomedical Imaging (ISBI),
pages 1012–1015. IEEE.
Vahadane, A., Peng, T., Sethi, A., Albarqouni, S., Wang,
L., Baust, M., Steiger, K., Schlitter, A. M., Esposito,
I., and Navab, N. (2016). Structure-preserving color
normalization and sparse stain separation for histolog-
ical images. IEEE transactions on medical imaging,
35(8):1962–1971.
Wilm, F., Marzahl, C., Breininger, K., and Aubreville, M.
(2021). Domain adversarial retinanet as a reference
algorithm for the mitosis domain generalization chal-
lenge. In International Conference on Medical Im-
age Computing and Computer-Assisted Intervention,
pages 5–13. Springer.
Zhu, J.-Y., Park, T., Isola, P., and Efros, A. A. (2017).
Unpaired image-to-image translation using cycle-
consistent adversarial networks. In Proceedings of
the IEEE international conference on computer vi-
sion, pages 2223–2232.