SalienceNet: An Unsupervised Image-to-Image Translation Method for
Nuclei Saliency Enhancement in Microscopy Images
Emmanuel Bouilhol (1,2), Edgar Lefevre (2), Thierno Barry (3), Florian Levet (3,4), Anne Beghin (5,8,9), Virgile Viasnoff (5,6,7), Xareni Galindo (3), Rémi Galland (3), Jean-Baptiste Sibarita (3) and Macha Nikolski (1,2)
1 Université de Bordeaux, CNRS, IBGC, UMR 5095, 33000, Bordeaux, France
2 Université de Bordeaux, Bordeaux Bioinformatics Center, 33000, Bordeaux, France
3 University of Bordeaux, CNRS, IINS, UMR 5297, Bordeaux, France
4 University of Bordeaux, CNRS, INSERM, Bordeaux Imaging Center, BIC, UAR 3420, US 4, Bordeaux, France
5 Mechanobiology Institute, National University of Singapore, Singapore, Singapore
6 IRL 3639 CNRS, Singapore, Singapore
7 Department of Biological Sciences, National University of Singapore, Singapore, Singapore
8 Immunology Translational Research Programme, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
9 Department of Microbiology and Immunology, National University of Singapore, Singapore, Singapore
Keywords:
Bioimaging, Deep Learning, Microscopy, Image Processing, Nuclei Segmentation.
Abstract:
Automatic segmentation of nuclei in low-light microscopy images remains a difficult task, especially for high-
throughput experiments where the need for automation is strong. Low saliency of nuclei with respect to the
background, variability of their intensity together with low signal-to-noise ratio in these images constitute a
major challenge for mainstream algorithms of nuclei segmentation. In this work we introduce SalienceNet,
an unsupervised deep learning-based method that uses the style transfer properties of CycleGAN to transform
low saliency images into high saliency images, thus enabling accurate segmentation by downstream analysis
methods, without the need for any parameter tuning. We have acquired a novel dataset of organoid images
with soSPIM, a microscopy technique that enables the acquisition of images in low-light conditions. Our
experiments show that SalienceNet increased the saliency of these images up to the desired level. Moreover,
we evaluated the impact of SalienceNet on segmentation for both Otsu thresholding and StarDist and have
shown that enhancing nuclei with SalienceNet improved segmentation results using Otsu thresholding by
30% and using StarDist by 26% in terms of IOU when compared to segmentation of non-enhanced images.
Together these results show that SalienceNet can be used as a common preprocessing step to automate nuclei
segmentation pipelines for low-light microscopy images.
1 INTRODUCTION
Segmentation of cell nuclei is of particular interest
for a number of applications such as cell detection,
counting or tracking, morphology analysis and quan-
tification of molecular expression. Being able to au-
tomatically segment cell nuclei with high precision is
particularly important in the case of high-throughput
microscopy imaging, where it is often the first step
for downstream quantitative data analysis workflows.
Indeed, the quality of downstream quantitative analy-
ses is heavily dependent on the accuracy of segmenta-
tion, making precise nuclei segmentation essential for
drawing meaningful biological conclusions.
Many solutions have been developed, as exempli-
fied by computational competitions such as reported
in (Caicedo et al., 2019). Among popular classical
image analysis methods used for nuclei segmentation
are thresholding and watershed algorithm (Malpica
et al., 1997) as well as active contour (Li et al., 2007).
Challenges in automating this process are due
to a number of image characteristics that can strongly
vary between biological and image acquisition con-
ditions. Among them, we can cite such aspects as
morphological differences between nuclei from dif-
ferent tissues, heterogeneity of intensity and texture,
variation in spatial organization such as the presence
of both sparse or dense images with touching nuclei,
as well as imaging artifacts (e.g., low signal-to-noise
ratio or out-of-focus signal) (Zhou et al., 2019). This
results in the necessity to fine-tune numerous parame-
ters between different image acquisitions, or even be-
tween individual images.
Recent deep-learning based tools such as Cellpose
(Stringer et al., 2021) and StarDist (Schmidt et al.,
2018) have greatly reduced the necessity to choose
specific parameters. However, despite these impor-
tant methodological advances, no single combination
of methods and parameters can be adopted to auto-
matically perform nuclei segmentation in all images,
due to the aforementioned heterogeneity of biological
samples and technical artifacts (Hollandi et al., 2022).
In particular, live-cell imaging represents a stumbling
block for these techniques, since these images are of-
ten acquired with low light levels and thus have very
low SNR and artifacts. Moreover, current Deep Learning models for nuclei segmentation follow the supervised paradigm and thus require well-annotated datasets, which (i) introduces bias due to the inaccuracy and incompleteness of available segmentations, where nuclei are improperly annotated and unevenly distributed across images (He et al., 2020), and (ii) limits their application to datasets with different image characteristics.
In this paper, instead of focusing on the segmentation itself, we propose to tackle this problem by enhancing the nuclei prior to the segmentation step, making the task easy for classical nuclei segmentation tools. Specifically, we take advantage of recent advances in the field of unsupervised generative adversarial networks, which aim to translate images from a source domain to a target domain and alleviate the image annotation requirement. For the nuclei enhancement task, the target domain corresponds to images with highly salient nuclei, where the strong signal difference between the nuclei and the background makes segmentation straightforward.
In this work, we introduce SalienceNet, a novel
unsupervised Deep Learning-based approach for nu-
clei saliency enhancement in microscopy images that
does not require image annotation when the network needs to be trained on new data with different characteristics. We showcase how this can be achieved for translating organoid images acquired with low light into contrasted output images, by training SalienceNet without providing the network with prior annotation of the newly acquired low-contrast images.
SalienceNet gives a new twist to automatic nuclei
segmentation by adapting the domain style transfer
framework to this specific task, and thus does not
require extensive annotation. We trained a ResNet-
based CycleGAN with a custom loss function dedi-
cated to the task of nuclei enhancement, where the intensity of the nuclei, and in particular of their borders, is made more salient regardless of the contrast, intensity, texture, or shape of the nuclei in the input data. Fur-
thermore, we evaluated the impact of the obtained nu-
clei enhancement on the downstream nuclei segmen-
tation by performing segmentation using conventional
methods on the enhanced images and have shown that
such a pipeline achieves better performance than seg-
menting the nuclei directly on the original images.
We demonstrate here that incorporating SalienceNet in a standard segmentation pipeline makes it possible to avoid manual parameter fine-tuning steps.
2 RELATED WORK
2.1 Nuclei Segmentation
Nucleus segmentation methods can be partitioned in
two major groups: those that rely on classical image
processing approaches and those that propose Deep
Learning models. For a thorough review, we refer the
reader to (Hollandi et al., 2022).
Image processing pipelines usually contain a num-
ber of filtering and thresholding steps combined, if
needed, with basic morphological operators to differ-
entiate nuclei (Malpica et al., 1997; Li et al., 2007).
A number of such methods are available as plug-
ins of the main biological analyses open-source soft-
ware tools such as Fiji (Schindelin et al., 2012), ICY
(De Chaumont et al., 2012), QuPath (Bankhead et al.,
2017) or CellProfiler (McQuin et al., 2018). The de-
velopment of classical image processing methods for
nuclei segmentation is still an active field. For exam-
ple, in a recently published image processing library,
CLIJ2 (Haase et al., 2020), the authors proposed a nu-
clei segmentation pipeline, "Voronoi Otsu Labeling", in which they first denoise the images with a Gaussian blur, then separate regions using Voronoï tessellation, and finally obtain a binary segmentation mask by applying an Otsu threshold.
However, time-consuming parameter fine-tuning is
required from the user at different steps of such clas-
sical image processing pipelines, making the processing of large amounts of data impractical (Hollandi et al.,
2022).
The need for an automated solution capable of segmenting nuclei in images with different characteristics pushed for the adoption of methods based
on Deep Learning. The U-Net architecture (Ron-
neberger et al., 2015) is used as part of recent Deep
Learning nucleus/cell segmentation methods, such as
Cellpose (Stringer et al., 2021) and StarDist (Schmidt
et al., 2018). Another successful architecture is Mask
R-CNN, which has been recently adapted for nuclei
segmentation by the authors of nucleAIzer (Hollandi
et al., 2020). ImageJ has recently proposed Deep
Learning-based segmentation plugins, and pre-trained
models are available through DeepImageJ (Gómez-de-Mariscal et al., 2021).
The success of the aforementioned Deep Learn-
ing methods for nuclei segmentation is, in particular, due to the use of large and relatively varied training datasets, with images acquired using different microscopy modalities. Nevertheless, establishing a general solution remains an unmet need, especially for images acquired with novel microscopy techniques, such as live-cell imaging (Ettinger and Wittmann, 2014), which reduces the intensity of the microscope light sources to a minimum in order to limit photo-damage to the cells and thus allow observing them over long periods of time. Result-
ing images have a reduced signal intensity and low
SNR. Importantly, such images (i) were not part of the training and evaluation datasets of the aforementioned methods and (ii) have different characteristics, and therefore represent a yet unsolved challenge for nuclei segmentation.
Moreover, the performance of the existing super-
vised deep-learning methods depends on the amount
of high-quality annotated data available for training.
Despite the large effort put into producing publicly available labels for nuclei segmentation, such as for the 2018 Data Science Bowl competition, such data is
often partially or even incorrectly labeled (He et al.,
2020).
2.2 Image Preprocessing
A frequently used approach to overcome the difficulty
of segmentation is to preprocess the images to im-
prove their quality. In the case of nuclei, such en-
hancement mainly concerns the contrast between the
nuclei and the background. Most traditional image
enhancement techniques rely on filtering (low pass,
high pass) or on naive noise removal such as Gaus-
sian blur. Other methods are based on normaliza-
tion of image intensity, such as histogram equaliza-
tion or contrast stretching (see for review (Qi et al.,
2021)). However, in the same way as the segmenta-
tion methods themselves, these image preprocessing
techniques lack generalization ability. For example, filtering or signal normalization is not applicable to images with low SNR, as it cannot distinguish the signal from the background well enough.
Deep Learning has also been applied at the pre-
processing step, in particular to estimate the trans-
formation function between sets of acquired images
and their enhanced counterparts through supervised
learning. One of the first and most successful meth-
ods was introduced with the CARE network (Weigert
et al., 2018), designed to restore fluorescence mi-
croscopy data without the need to generate manual
training data. The authors showed that it is possi-
ble to learn the mapping between low-intensity and
high-intensity image pairs using a U-Net based neu-
ral network. In the case of live-cell imaging, this
makes it possible to restore the image quality. How-
ever, two characteristics of this network limit the gen-
eralization capacity of CARE to new types of im-
ages. First, CARE network follows the supervised
training paradigm and thus requires matching pairs of
the same image and the corresponding nuclei masks,
which is time-consuming. Second, CARE comports
5 separately trained networks and uses a disagree-
ment score between the individual network predic-
tions to eliminate unreliable results, which implies
that images with characteristics that strongly differ
from those in the training set will not be well restored
(Weigert et al., 2018).
2.3 Image-to-Image Translation
Image quality enhancement has also been approached
through image-to-image translation deep learning
methods. The goal is to transform an image having
a particular style (source style) into a desired target
style. The most efficient models are based on GANs
(Pang et al., 2021; Wang et al., 2020). Authors of
pix2pix (Isola et al., 2017) were the first to apply a
GAN-based architecture to perform image-to-image translation. It is a fully supervised method that
requires large paired image datasets to train the trans-
lation model that transforms the source images to the
desired target images. In the context of nuclei seg-
mentation, pix2pix has been used by the authors of
nucleAIzer for data augmentation of their training nu-
clei datasets. Specialized image enhancement models
have since been proposed, such as Cycle-CBAM (You
et al., 2019) for retinal image enhancement and UW-
CycleGAN (Du et al., 2021) for underwater image
enhancement, both based on the CycleGAN architec-
ture. Moreover, enhancement of objects of interest
has been proposed by the authors of DE-CycleGAN
(Gao et al., 2021) to enhance the weak targets for the
purpose of accurate vehicle detection.
3 PROPOSED METHOD
In this section, we present the SalienceNet nuclei
saliency enhancement network in detail. We first
present the network’s architecture, and then we dis-
cuss the custom generator loss function that drives the
saliency enhancement.
3.1 Network Architecture
SalienceNet implements image style transfer for nuclei microscopy images with the CycleGAN architecture (Zhu et al., 2017), where the network is com-
posed of two Generative Adversarial Network (GAN)
blocks that exchange information during training as
shown in figure 1.
Let X be the domain of acquired nuclei images
and Y the style domain of images with enhanced
nuclei saliency. Images do not have to be paired.
SalienceNet translates an image from domain X to the
target domain Y by learning a mapping G : X → Y such that the distribution of generated images G(X) is indistinguishable from that of Y under an adversarial loss.
The architecture is based on the simultaneous
training of two generator models and two discrimi-
nator models (see figure 1). First, generator G takes
input from the domain X and outputs images for the
target style domain Y , and second, generator F takes
input from the domain Y and generates images for the
domain X. Adversarial discriminator models are used
to drive the training by estimating how well the generated images fit the domain: D_Y distinguishes the outputs G(X) from images of domain Y; in the same manner, D_X distinguishes the outputs F(Y) from images of domain X.
In our model the discriminators D_X and D_Y are
implemented as PatchGAN classifiers, composed of
4 convolution blocks (see figure 2), each containing
a convolution layer, an instance normalization layer
and an activation layer (LeakyReLU). The generators
G and F are implemented as ResNets having the same
structure with 3 down convolutions, followed by 9
residual blocks, before applying 2 transpose convo-
lutions and one last convolution layer with a Tanh ac-
tivation (see figure 2).
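For illustration, a minimal PyTorch sketch of such a ResNet-based generator is given below; the channel counts and layer ordering follow figure 2, while details such as padding choices are assumptions rather than the authors' released implementation.

```python
# Sketch of the ResNet-based generator (assumed details, not the official code).
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    def __init__(self, channels: int = 256):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
        )

    def forward(self, x):
        return x + self.block(x)  # shortcut connection: F(x) + x


class Generator(nn.Module):
    def __init__(self, in_channels: int = 1, n_residual: int = 9):
        super().__init__()
        layers = [
            # three "down" convolutions: 7x7 stride-1, then two 3x3 stride-2
            nn.Conv2d(in_channels, 64, kernel_size=7, padding=3),
            nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
            nn.InstanceNorm2d(128), nn.ReLU(inplace=True),
            nn.Conv2d(128, 256, kernel_size=3, stride=2, padding=1),
            nn.InstanceNorm2d(256), nn.ReLU(inplace=True),
        ]
        layers += [ResidualBlock(256) for _ in range(n_residual)]
        layers += [
            # two transpose convolutions upsample back to the input resolution
            nn.ConvTranspose2d(256, 128, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.InstanceNorm2d(128), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
            nn.Conv2d(64, in_channels, kernel_size=7, padding=3),
            nn.Tanh(),  # outputs in [-1, 1]
        ]
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)
```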
The discriminator is implemented as a PatchGAN
model that outputs a square feature map of values,
each value encoding the probability that the corre-
sponding patch in the input image is real. These val-
ues are further averaged to generate the global likeli-
hood.
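A corresponding sketch of the PatchGAN discriminator is shown below; it follows the 4x4, stride-2 convolution blocks of figure 2, with the remaining details assumed.

```python
# Sketch of the PatchGAN discriminator (assumed details, not the official code).
import torch.nn as nn


class PatchDiscriminator(nn.Module):
    def __init__(self, in_channels: int = 1):
        super().__init__()

        def block(c_in, c_out):
            # convolution + instance normalization + LeakyReLU activation
            return [
                nn.Conv2d(c_in, c_out, kernel_size=4, stride=2, padding=1),
                nn.InstanceNorm2d(c_out),
                nn.LeakyReLU(0.2, inplace=True),
            ]

        self.model = nn.Sequential(
            *block(in_channels, 64),
            *block(64, 128),
            *block(128, 256),
            *block(256, 512),
            # final layer maps to a 1-channel feature map of patch scores
            nn.Conv2d(512, 1, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        # each output value scores one receptive-field patch as real or fake;
        # the scores can be averaged to obtain a global likelihood
        return self.model(x)
```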
Since the mapping G : X → Y is highly under-constrained, CycleGAN couples it with an inverse mapping F : Y → X: the output "fake Y" of the X → Y generator is used as input to the Y → X generator, whose output "cyclic X" should match the original input image X (and vice versa). This is enforced through the cycle consistency loss so as to obtain F(G(X)) ≈ X and G(F(Y)) ≈ Y.
3.2 Generator Loss Function
An additional generator loss is used to enforce the cy-
cle consistency and to measure the difference between
the generated output "cyclic X" and X, as well as between "cyclic Y" and Y. This regularization makes it possible to constrain the generation process to image translation.
For SalienceNet we defined the generator loss
function as a combination of three terms: (i) the Mean
Squared Error (MSE), (ii) the Mean Gradient Error
(MGE) and (iii) the Mean Structural SIMilarity index
(MSSIM).
The Mean Squared Error (MSE) computes the
mean of the squared differences between true and pre-
dicted values:

$$\mathcal{L}_{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^2 .$$

This term
ensures that the generator does not produce outliers
too far from the target domain. However, MSE used
alone is known to lead to blurring due to the averaging
between possible outputs, which in image-to-image
translation can lead to low-quality blurred results.
In the case of nuclei segmentation, blurring can
yield images where nuclei boundaries are particularly
difficult to accurately segment. To solve this gradient
problem, we added the Mean Gradient Error (Lu and
Chen, 2022) term $\mathcal{L}_{MGE}$, which measures the differences between the edges of objects in two images, with the aim of learning sharp edges. It is based on the vertical and horizontal Sobel operators (Kanopoulos et al., 1988), $G_v$ and $G_h$:

$$G_v = Y * \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}, \qquad G_h = Y * \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix},$$

where $*$ denotes the convolution operator.
These gradients are combined to define a global
pixel-wise gradient map $G = \sqrt{G_v^2 + G_h^2}$. The gradient map $\hat{G}$ for predicted images is computed in the same way. $\mathcal{L}_{MGE}$ is then defined as:

$$\mathcal{L}_{MGE} = \frac{1}{n}\frac{1}{m}\sum_{i=1}^{n}\sum_{j=1}^{m}\left(G(i,j) - \hat{G}(i,j)\right)^2$$
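A possible PyTorch sketch of this term is shown below; it assumes single-channel image tensors of shape (N, 1, H, W) and uses the standard Sobel kernels, and may differ in minor details from the authors' implementation.

```python
# Sketch of the Mean Gradient Error term using Sobel filters (assumed details).
import torch
import torch.nn.functional as F

_SOBEL_V = torch.tensor([[-1., -2., -1.],
                         [ 0.,  0.,  0.],
                         [ 1.,  2.,  1.]]).view(1, 1, 3, 3)
_SOBEL_H = torch.tensor([[-1., 0., 1.],
                         [-2., 0., 2.],
                         [-1., 0., 1.]]).view(1, 1, 3, 3)


def gradient_map(img):
    """Pixel-wise gradient magnitude G = sqrt(Gv^2 + Gh^2)."""
    gv = F.conv2d(img, _SOBEL_V.to(img.device), padding=1)
    gh = F.conv2d(img, _SOBEL_H.to(img.device), padding=1)
    return torch.sqrt(gv ** 2 + gh ** 2 + 1e-12)  # eps avoids sqrt(0) gradients


def mge_loss(target, generated):
    """Mean squared difference between the two gradient maps."""
    return torch.mean((gradient_map(target) - gradient_map(generated)) ** 2)
```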
Finally, to drive the network to produce images
with a structure similar to the input structure, we
added the Mean Structural SIMilarity index (MSSIM) loss $\mathcal{L}_{MSSIM}$ (Wang et al., 2004) as the last term.
Figure 1: Architecture of SalienceNet. The network is composed of two GANs, each GAN having a generator G and a discriminator D. The main element of the generator is a residual network, while the discriminator is a PatchGAN whose output is a feature map. The generator loss is computed based on this feature map. The inputs to the network are X (input) and Y (input), and the outputs are Fake X and Fake Y, corresponding to F(Y) and G(X), respectively. The cycle consistency loss is computed between the original image X and its reconstructed image F(G(X)), and between Y and its reconstructed image G(F(Y)).
Figure 2: Composition of the networks constituting
SalienceNet. Generators embed a residual network com-
posed of 9 residual blocks, each block being itself com-
posed of 2 convolution layers. Convolution blocks are
composed of a convolution layer, an instance normalization
layer and an activation layer which is ReLU for the genera-
tor and LeakyReLU for the discriminator.
This loss function compares two images based on luminance $l$, contrast $c$ and structural information $s$:

$$l(x,y) = \frac{2\mu_x\mu_y + c_1}{\mu_x^2 + \mu_y^2 + c_1}, \quad c(x,y) = \frac{2\sigma_x\sigma_y + c_2}{\sigma_x^2 + \sigma_y^2 + c_2}, \quad s(x,y) = \frac{\sigma_{xy} + c_3}{\sigma_x\sigma_y + c_3}$$

where $\mu_x$ and $\mu_y$ denote the mean intensity of the input and generated image respectively; $\sigma_x$ and $\sigma_y$ are the standard deviations of the original and generated images and $\sigma_{xy}$ is their covariance; $c_1$, $c_2$ and $c_3$ are constants used to avoid instability when the denominators are close to 0.
The mean SSIM is obtained over the entire image using a sliding local window:

$$\mathcal{L}_{MSSIM}(x,y) = \frac{1}{M}\sum_{i=1}^{M} l(x_i,y_i) \cdot c(x_i,y_i) \cdot s(x_i,y_i)$$

where $x$ and $y$ denote the input and the generated image, respectively, $x_i$ and $y_i$ are the contents of the $i$-th window as the local window slides over the original and generated images, and $M$ is the number of local windows in the image.
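The sketch below illustrates this computation with a uniform (rather than Gaussian) local window and the usual SSIM constants for images scaled to [0, 1]; these choices are assumptions made for brevity.

```python
# Simplified MSSIM sketch with a uniform local window (assumed configuration).
import torch
import torch.nn.functional as F


def mssim(x, y, window: int = 11, c1: float = 0.01 ** 2, c2: float = 0.03 ** 2):
    """Mean SSIM between two single-channel image batches scaled to [0, 1]."""
    pad = window // 2
    kernel = torch.ones(1, 1, window, window, device=x.device) / window ** 2
    mu_x = F.conv2d(x, kernel, padding=pad)
    mu_y = F.conv2d(y, kernel, padding=pad)
    var_x = F.conv2d(x * x, kernel, padding=pad) - mu_x ** 2
    var_y = F.conv2d(y * y, kernel, padding=pad) - mu_y ** 2
    cov_xy = F.conv2d(x * y, kernel, padding=pad) - mu_x * mu_y
    # with c3 = c2 / 2, the product l * c * s collapses to the usual two-term SSIM
    ssim_map = ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / \
               ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return ssim_map.mean()
```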
In the case of SalienceNet, the intuition for the MSSIM loss is to enforce luminance enhancement for the X → Y generator, while preserving structure for the Y → X generator.
Figure 3: Training and evaluation of SalienceNet. 90% of the source style datasets and of the experimental target style datasets were used to train the network. The network was then applied to the remaining 10% of the source style datasets to obtain enhanced images. Nuclei in the images enhanced by SalienceNet were segmented using classical methods with fixed parameters.
The total loss function is defined as the weighted
sum of the three terms:
$$\mathcal{L}_{total} = \alpha\,\mathcal{L}_{MSE} + \beta\,\mathcal{L}_{MGE} + \gamma\,\mathcal{L}_{MSSIM}$$

where $\alpha$, $\beta$ and $\gamma$ are the weights of the corresponding terms, such that $\alpha + \beta + \gamma = 1$.
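Assuming the helper losses sketched above, the weighted combination can be written as follows; note that the structural term is used here as a dissimilarity (1 − MSSIM), which is one plausible reading of the formulation, and the default weights correspond to the best-performing combination reported in section 5 (α = 0.2, β = 0.2, γ = 0.6).

```python
# Sketch of the weighted generator loss, relying on the mge_loss and mssim
# helpers sketched above; the exact formulation used by the authors may differ.
import torch.nn.functional as F


def generator_loss(target, generated, alpha=0.2, beta=0.2, gamma=0.6):
    l_mse = F.mse_loss(generated, target)        # pixel-wise fidelity
    l_mge = mge_loss(target, generated)          # sharp edges
    l_ssim = 1.0 - mssim(target, generated)      # structural similarity as a penalty
    return alpha * l_mse + beta * l_mge + gamma * l_ssim
```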
4 DATASETS
To train and evaluate our SalienceNet enhancement
method, we have collected different datasets (see fig-
ure 3). First, we used two experimentally acquired and expertly segmented datasets (see section 4.2), which have been previously extensively used for training segmentation models. Second, we have acquired a
dataset of organoid images with low-light conditions
that specifically represents the segmentation chal-
lenge that we want to address, as well as generated the
corresponding synthetic datasets (see sections 4.1 and
4.2). These images belong to one of the two styles:
1. Source style with low saliency of nuclei
(organoid and synthetic low saliency datasets),
2. Target style with high saliency of nuclei (two
experimental datasets and the synthetic high
saliency dataset).
4.1 Source Style Datasets
To evaluate whether SalienceNet enables precise nu-
clei segmentation, we acquired a 3D cell culture
dataset with the soSPIM technique. soSPIM is a sin-
gle objective light-sheet microscopy approach capa-
ble of streamlining 3D cell cultures with fast 3D live-
imaging at speeds up to 300 3D cultures per hour
(Galland et al., 2015; Beghin et al., 2022). We se-
lected 11 neuroectoderm organoids exhibiting a wide
variety of shapes and densities. These organoids have
been differentiated from hESCs, fixed at day 8, im-
munostained with DAPI and imaged using soSPIM,
yielding 1056 2D slices. These 2D image slices compose the DS_org dataset.
To augment the source style dataset, in addition to
the experimental 1056 2D image slices, we generated
synthetic images. First, we performed an expert seg-
mentation of nuclei on each individual 2D slice. Sec-
ond, images paired with their annotated masks were
used to train a simple CycleGAN. Finally, this Cy-
cleGAN model was applied to transform randomly
placed elliptical shapes (roughly approximating nu-
clei shapes) into organoid look-alike images. The
elliptical shapes provide “nuclei” masks in a trivial
way. We generated 1500 synthetic low-saliency images, denoted by DS_synth.
4.2 Target Style Datasets
The goal of our network is to learn to transform an
image i into e(i), where saliency (Kim and Varshney,
2006) at nuclei location is enhanced. To provide the
target style dataset for training the SalienceNet net-
work, we have collected two experimental datasets
where the nuclei saliency was already satisfactory for
segmentation by classical pipelines and for which the
nuclei segmentation masks are available. We complemented them with a synthetic high-saliency dataset.
Table 1: Number of images and nuclei in each dataset (DS column) used for the training and testing of SalienceNet.

DS        | #Images | #Nuclei | Style
DS_org    | 1056    | 43633   | Source
DS_synth  | 1500    | 128962  | Source
TS_LS     | 568     | 20754   | Target
TS_DB     | 551     | 23121   | Target
TS_synth  | 2000    | 171915  | Target
In 2018, a Data Science Bowl competition orga-
nized by Kaggle released a dataset for the challenge of "Identification and Segmentation of Nuclei in Cells", with images acquired under different conditions and of different cell types, varying in size, magnification, and imaging method (brightfield and fluorescence). Nuclei masks have been manually created by specialists and are provided with the dataset. For the purpose of this paper, only grayscale cell culture images were kept, yielding the TS_DB dataset with 551 images.
Experimentally acquired nuclei images from hu-
man cell lines (Chouaib et al., 2020) were used to de-
fine the TS_LS dataset. It is composed of 568 images from 57 different acquisition conditions covering the 32 genes whose expression was measured in that study for the purpose of a localization screen. Nuclei masks have been obtained with nucleAIzer.
The synthetic high-saliency dataset, TS_synth, was generated following the same procedure as DS_synth (see section 4.1), with 2000 images but with enhanced
saliency. Nuclei masks are provided by the input gen-
eration procedure (elliptical shapes).
Taking these 3 datasets together (summarized in
table 1), the target style dataset contains 3119 images.
5 RESULTS
To train the SalienceNet models, we split each of the
two source datasets as well as the two experimental
target datasets (see figure 3) into train and test subsets,
in 90% and 10% proportions. The test datasets are denoted DS^t_org, DS^t_synth, TS^t_DB and TS^t_LS, respectively.
We then performed the hyperparameter search by
exploring all possible combinations of α, β, and γ
(weights of the loss components, see section 3.2) with a step of 0.1 in order to estimate which combination of parameters yielded the best model. This resulted in 42 models, denoted by their (α, β, γ) combinations in figure 5.
Moreover, for comparison purposes we have trained
a vanilla CycleGAN model, without any modification
with respect to the original CycleGAN network.
All the 42 SalienceNet models and the vanilla Cy-
cleGAN were applied to the 4 test datasets DS^t_org, DS^t_synth, TS^t_DB and TS^t_LS to perform saliency enhance-
ment. The original images and their enhanced coun-
terparts were then segmented, without any parameter
tuning, using two widely used segmentation methods:
1. the classical Otsu thresholding segmentation method, in its non-parametric version with an adaptive threshold,
2. StarDist, a deep learning-based segmentation method, with the versatile 2D fluorescence model as provided by (Schmidt et al., 2018), without re-training on our data or supplementary fine-tuning (a minimal sketch of this segmentation step is given below).
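The sketch below assumes 2D grayscale arrays already enhanced by SalienceNet, and uses the pre-trained StarDist model together with a plain global Otsu threshold (a simplification of the adaptive variant mentioned above).

```python
# Sketch of the downstream segmentation step on enhanced images (assumptions:
# 2D grayscale numpy arrays; global Otsu used as a simple stand-in).
import numpy as np
from csbdeep.utils import normalize
from skimage.filters import threshold_otsu
from skimage.measure import label
from stardist.models import StarDist2D

# pre-trained versatile fluorescence model, no re-training or fine-tuning
stardist_model = StarDist2D.from_pretrained("2D_versatile_fluo")


def segment_stardist(img: np.ndarray) -> np.ndarray:
    """Instance labels from the pre-trained StarDist model."""
    labels, _ = stardist_model.predict_instances(normalize(img, 1, 99.8))
    return labels


def segment_otsu(img: np.ndarray) -> np.ndarray:
    """Binary nuclei mask from an Otsu threshold, then labeled into instances."""
    return label(img > threshold_otsu(img))
```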
The resulting masks were then compared with
ground truth. For this purpose, expert ground truth
annotation was performed on the DS^t_org test dataset;
nuclei masks (ground truth) were already available for
the 3 other test datasets (see sections 4.1 and 4.2).
To measure the quality of the resulting segmenta-
tion, we computed the intersection over union (IOU)
for each image to quantify the overlap (in pixel
count) between the segmentation and the ground
truth: $IOU = \frac{|S \cap G|}{|S \cup G|}$, where S is the mask resulting from segmentation and G is the ground truth mask.
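A short sketch of this per-image computation on binary masks:

```python
# IOU between a predicted segmentation mask and the ground truth mask,
# assuming numpy arrays that can be binarized (instance labels > 0).
import numpy as np


def iou(segmentation: np.ndarray, ground_truth: np.ndarray) -> float:
    s = segmentation.astype(bool)
    g = ground_truth.astype(bool)
    union = np.logical_or(s, g).sum()
    if union == 0:          # both masks empty: count as perfect agreement
        return 1.0
    return np.logical_and(s, g).sum() / union
```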
SalienceNet Enables Accurate Nuclei Segmenta-
tion. First, we determined the best-performing model with respect to our goal of segmenting low SNR images from live-cell imaging by looking at the enhancement performance on organoid images.
Figure 5 shows the IOU scores for segmentation by
StarDist of both enhanced (by the 42 models) and non-enhanced images, for each image of the experimental organoid test dataset DS^t_org. Individual model results for Otsu segmentation, being very similar (albeit slightly worse), are not shown in this figure. The IOU is shown in figure 5 for each image with respect to the ground truth. The geometric mean of all non-enhanced images' IOUs was 0.49 for StarDist
segmentation and 0.45 for Otsu segmentation, while
the geometric mean for IOU after saliency enhance-
ment ranged from 0.54 to 0.75 for StarDist and from
0.48 to 0.75 for Otsu segmentation. Best results
were obtained for segmentation after enhancement by
the SalienceNet model with α = 0.2, β = 0.2, and
γ = 0.6, with a geometric mean of IOU of 0.75 for
both StarDist and Otsu. We denote this model by M.
The impact of this model M on the quality of the
downstream segmentation was computed as the ratio
of IOU for images enhanced with M over the IOU
of non-enhanced images. Impact values range be-
tween 1.08 and 2.57 for Otsu segmentation and be-
tween 1.09 and 10.73 for StarDist segmentation; no-
tice that the lower bound is > 1 in both cases.
Figure 4: Examples of segmentation results by StarDist obtained without enhancement and after enhancement by the SalienceNet M model. Sample images come from two low saliency test datasets: DS^t_org for the two upper rows, and DS^t_synth for the two lower rows. Rows 2 and 4 show a zoom-in of the respective rows right above.
Figure 5: Heatmap of the IOU values for each image segmentation of DS^t_org, after enhancement and without enhancement. Average IOU values are shown in the two right columns (StarDist and Otsu). First row: IOUs obtained for non-enhanced images; second row: IOUs after enhancement by the vanilla CycleGAN and segmentation by StarDist; all other rows: IOUs for segmentation by StarDist after enhancement by SalienceNet for different α, β and γ values. Bottom-most rows (red and yellow color scale) show the impact of M enhancement on segmentation quality as IOU ratios (log scale).
Table 2: IOU values for segmentation with Otsu or StarDist of non-enhanced images and of images enhanced with the SalienceNet (SN) M model, for the 4 test datasets. All reported values are the geometric mean, per dataset, of the IOU of individual images; values in brackets are the 0.25 and 0.75 percentiles.

DS          | Otsu              | SN+Otsu           | StarDist          | SN+StarDist
DS^t_org    | 0.45 [0.38, 0.51] | 0.75 [0.74, 0.79] | 0.49 [0.38, 0.59] | 0.75 [0.74, 0.79]
DS^t_synth  | 0.62 [0.61, 0.67] | 0.90 [0.89, 0.90] | 0.62 [0.59, 0.65] | 0.86 [0.84, 0.87]
TS^t_LS     | 0.82 [0.80, 0.90] | 0.86 [0.83, 0.91] | 0.90 [0.88, 0.91] | 0.90 [0.88, 0.91]
TS^t_DB     | 0.69 [0.63, 0.86] | 0.78 [0.74, 0.87] | 0.83 [0.79, 0.88] | 0.83 [0.79, 0.88]
An illustration of StarDist segmentation results for
non-enhanced images and for those enhanced by M
for low saliency organoid images is provided in fig-
ure 4.
Table 2 shows the IOU values for the best model
M of SalienceNet applied to the 4 test datasets before
and after enhancement by SalienceNet. On the one hand,
we observed that SalienceNet indeed improved the
accuracy of nuclei segmentation in low-light source
datasets. On the other hand, this table shows that for
the already salient images, SalienceNet did not de-
grade the quality of segmentation.
Together, these results show that saliency en-
hancement by SalienceNet enables accurate down-
stream nuclei segmentation by widely used methods
without the need for parameter tuning.
SalienceNet Enhances Nuclei Saliency. To evalu-
ate whether SalienceNet improved saliency, we com-
puted an indirect measure of it, the Signal to Noise Ratio (SNR), as $SNR = (m - \mu_B)/\sigma_B$, where $m$ is the maximum pixel intensity within the nuclei masks in an image, $\mu_B$ is the mean value of the background and $\sigma_B$ is the standard deviation of the background. The SNR was computed for the source style datasets DS_org and DS_synth, and the target style datasets TS_synth, TS_LS and TS_DB. We also measured the SNR of the test datasets DS^t_org and DS^t_synth after enhancement by SalienceNet.
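A sketch of this measure, assuming a 2D image and a boolean nuclei mask, with the background taken as everything outside the mask:

```python
# SNR as defined above: (max signal inside the nuclei masks - mean background)
# divided by the background standard deviation.
import numpy as np


def snr(img: np.ndarray, nuclei_mask: np.ndarray) -> float:
    mask = nuclei_mask.astype(bool)
    background = img[~mask]
    m = img[mask].max()                       # maximum pixel intensity in the nuclei
    return (m - background.mean()) / background.std()
```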
We observed (see figure 6) that SalienceNet en-
hanced the SNR in low-light images of DS^t_org and DS^t_synth close to the SNR level of the already salient experimental target style images TS_LS and TS_DB, and up to the SNR level of the synthetic salient dataset TS_synth.
Figure 6: Signal to Noise Ratio (SNR) distributions. Box-
plots represent the distribution of SNR for the source style
images DS^t_org and DS^t_synth and the target style images TS_LS, TS_DB and TS_synth. SNR distributions of the test images after enhancement with the SalienceNet M model are shown for the low-light DS^t_org and DS^t_synth datasets.
6 SUMMARY
In this work, we introduced SalienceNet, a
CycleGAN-based network specifically designed
to enhance the saliency of nuclei in low SNR images and that does not require annotation for training on new data.
We used soSPIM light-sheet microscopy, a technique that illuminates the biological sample with little light compared to other methods, to acquire organoid images. As a result, the illumination and the SNR are lower in these images and the nuclei are less salient. We used these organoid images as source style for training our network and further for testing. To implement SalienceNet we combined three loss functions with different properties and have shown that our adaptation of CycleGAN improved segmentation results after enhancement relative to both segmentation of non-enhanced images and of those enhanced with the vanilla CycleGAN.
We compared the segmentation results of the widely used non-parametric Otsu thresholding and of StarDist on both raw images of our novel organoid live-cell imaging dataset and their counterparts enhanced with SalienceNet. We have shown that using SalienceNet im-
proved the segmentation quality of both classical and
deep learning based nuclei segmentation algorithms
in low SNR nuclei images. It should be noted that
adding the SalienceNet enhancement step prior to nu-
clei segmentation did not degrade the quality of re-
sults for the already salient datasets.
Taken together, these results show that
SalienceNet is a useful new step for nuclei seg-
mentation workflows.
7 CODE AVAILABILITY
The SalienceNet code for training and testing nuclei enhancement is fully open source and available
on GitHub at https://github.com/cbib/SalienceNet.
Our best pre-trained model M used in this study is
also available from the same GitHub page.
REFERENCES
Bankhead, P., Loughrey, M. B., Fernández, J. A., Dom-
browski, Y., McArt, D. G., Dunne, P. D., McQuaid,
S., Gray, R. T., Murray, L. J., Coleman, H. G., et al.
(2017). QuPath: Open source software for digital
pathology image analysis. Scientific reports, 7(1):1–7.
Beghin, A., Grenci, G., Sahni, G., Guo, S., Rajendiran, H.,
Delaire, T., Mohamad Raffi, S. B., Blanc, D., de Mets,
R., Ong, H. T., et al. (2022). Automated high-speed 3d
imaging of organoid cultures with multi-scale pheno-
typic quantification. Nature Methods, 19(7):881–892.
Caicedo, J. C., Goodman, A., Karhohs, K. W., Cimini,
B. A., Ackerman, J., Haghighi, M., Heng, C., Becker,
T., Doan, M., McQuin, C., et al. (2019). Nucleus seg-
mentation across imaging experiments: the 2018 data
science bowl. Nature methods, 16(12):1247–1253.
Chouaib, R., Safieddine, A., Pichon, X., Imbert, A., Kwon,
O. S., Samacoits, A., Traboulsi, A.-M., Robert, M.-C.,
Tsanov, N., Coleno, E., Poser, I., Zimmer, C., Hyman,
A., Le Hir, H., Zibara, K., Peter, M., Mueller, F., Wal-
ter, T., and Bertrand, E. (2020). A dual protein-mrna
localization screen reveals compartmentalized trans-
lation and widespread co-translational rna targeting.
Developmental Cell, 54(6):773–791.e5.
De Chaumont, F., Dallongeville, S., Chenouard, N., Hervé,
N., Pop, S., Provoost, T., Meas-Yedid, V., Panka-
jakshan, P., Lecomte, T., Le Montagner, Y., et al.
(2012). Icy: an open bioimage informatics platform
for extended reproducible research. Nature methods,
9(7):690.
Du, R., Li, W., Chen, S., Li, C., and Zhang, Y. (2021). Un-
paired underwater image enhancement based on cy-
clegan. Information, 13(1):1.
Ettinger, A. and Wittmann, T. (2014). Fluorescence live cell
imaging. Methods in cell biology, 123:77–94.
Galland, R., Grenci, G., Aravind, A., Viasnoff, V., Studer,
V., and Sibarita, J.-B. (2015). 3d high-and super-
resolution imaging using single-objective spim. Na-
ture methods, 12(7):641–644.
Gao, P., Tian, T., Li, L., Ma, J., and Tian, J. (2021). De-
cyclegan: An object enhancement network for weak
vehicle detection in satellite images. IEEE Journal
of Selected Topics in Applied Earth Observations and
Remote Sensing, 14:3403–3414.
Gómez-de-Mariscal, E., García-López-de-Haro, C., Ouyang, W., Donati, L., Lundberg, E., Unser, M., Muñoz-Barrutia, A., and Sage, D. (2021). Deepim-
agej: A user-friendly environment to run deep learn-
ing models in imagej. Nature Methods, 18(10):1192–
1195.
Haase, R., Royer, L. A., Steinbach, P., Schmidt, D., Di-
brov, A., Schmidt, U., Weigert, M., Maghelli, N.,
Tomancak, P., Jug, F., et al. (2020). Clij: Gpu-
accelerated image processing for everyone. Nature
methods, 17(1):5–6.
He, J., Wang, C., Jiang, D., Li, Z., Liu, Y., and Zhang,
T. (2020). Cyclegan with an improved loss func-
tion for cell detection using partly labeled images.
IEEE Journal of Biomedical and Health Informatics,
24(9):2473–2480.
Hollandi, R., Moshkov, N., Paavolainen, L., Tasnadi, E.,
Piccinini, F., and Horvath, P. (2022). Nucleus seg-
mentation: towards automated solutions. Trends in
Cell Biology.
Hollandi, R., Szkalisity, A., Toth, T., Tasnadi, E., Molnar,
C., Mathe, B., Grexa, I., Molnar, J., Balind, A., Gorbe,
M., et al. (2020). nucleaizer: a parameter-free deep
learning framework for nucleus segmentation using
image style transfer. Cell Systems, 10(5):453–458.
Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2017).
Image-to-image translation with conditional adversar-
ial networks. In Proceedings of the IEEE conference
on computer vision and pattern recognition, pages
1125–1134.
Kanopoulos, N., Vasanthavada, N., and Baker, R. L. (1988).
Design of an image edge detection filter using the
sobel operator. IEEE Journal of solid-state circuits,
23(2):358–367.
Kim, Y. and Varshney, A. (2006). Saliency-guided enhance-
ment for volume visualization. IEEE Transactions
on Visualization and Computer Graphics, 12(5):925–
932.
Li, G., Liu, T., Tarokh, A., Nie, J., Guo, L., Mara, A., Hol-
ley, S., and Wong, S. T. (2007). 3d cell nuclei seg-
mentation based on gradient flow tracking. BMC cell
biology, 8(1):1–10.
Lu, Z. and Chen, Y. (2022). Single image super-resolution
based on a modified u-net with mixed gradient loss.
signal, image and video processing, 16(5):1143–
1151.
Malpica, N., De Solórzano, C. O., Vaquero, J. J., Santos, A., Vallcorba, I., García-Sagredo, J. M., and Del Pozo, F.
(1997). Applying watershed algorithms to the seg-
mentation of clustered nuclei. Cytometry: The Jour-
nal of the International Society for Analytical Cytol-
ogy, 28(4):289–297.
McQuin, C., Goodman, A., Chernyshev, V., Kamentsky, L.,
Cimini, B. A., Karhohs, K. W., Doan, M., Ding, L.,
Rafelski, S. M., Thirstrup, D., et al. (2018). Cellpro-
filer 3.0: Next-generation image processing for biol-
ogy. PLoS biology, 16(7):e2005970.
Pang, Y., Lin, J., Qin, T., and Chen, Z. (2021). Image-to-
image translation: Methods and applications. IEEE
Transactions on Multimedia.
Qi, Y., Yang, Z., Sun, W., Lou, M., Lian, J., Zhao, W., Deng,
X., and Ma, Y. (2021). A comprehensive overview of
image enhancement techniques. Archives of Compu-
tational Methods in Engineering, pages 1–25.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-net:
Convolutional networks for biomedical image seg-
mentation. In International Conference on Medical
image computing and computer-assisted intervention,
pages 234–241. Springer.
Schindelin, J., Arganda-Carreras, I., Frise, E., Kaynig, V.,
Longair, M., Pietzsch, T., Preibisch, S., Rueden, C.,
Saalfeld, S., Schmid, B., et al. (2012). Fiji: an open-
source platform for biological-image analysis. Nature
methods, 9(7):676–682.
Schmidt, U., Weigert, M., Broaddus, C., and Myers, G.
(2018). Cell detection with star-convex polygons. In
International Conference on Medical Image Comput-
ing and Computer-Assisted Intervention, pages 265–
273. Springer.
Stringer, C., Wang, T., Michaelos, M., and Pachitariu, M.
(2021). Cellpose: a generalist algorithm for cellular
segmentation. Nature methods, 18(1):100–106.
Wang, L., Chen, W., Yang, W., Bi, F., and Yu, F. R. (2020).
A state-of-the-art review on image synthesis with gen-
erative adversarial networks. IEEE Access, 8:63514–
63537.
Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P.
(2004). Image quality assessment: from error visi-
bility to structural similarity. IEEE transactions on
image processing, 13(4):600–612.
Weigert, M., Schmidt, U., Boothe, T., Müller, A., Dibrov,
A., Jain, A., Wilhelm, B., Schmidt, D., Broaddus, C.,
Culley, S., et al. (2018). Content-aware image restora-
tion: pushing the limits of fluorescence microscopy.
Nature methods, 15(12):1090–1097.
You, Q., Wan, C., Sun, J., Shen, J., Ye, H., and Yu, Q.
(2019). Fundus image enhancement method based on
cyclegan. In 2019 41st annual international confer-
ence of the IEEE engineering in medicine and biology
society (EMBC), pages 4500–4503. IEEE.
Zhou, Y., Onder, O. F., Dou, Q., Tsougenis, E., Chen, H.,
and Heng, P.-A. (2019). Cia-net: Robust nuclei in-
stance segmentation with contour-aware information
aggregation. In International conference on informa-
tion processing in medical imaging, pages 682–693.
Springer.
Zhu, J.-Y., Park, T., Isola, P., and Efros, A. A. (2017).
Unpaired image-to-image translation using cycle-
consistent adversarial networks. In Proceedings of
the IEEE international conference on computer vi-
sion, pages 2223–2232.