3D Nuclei Segmentation by Combining GAN Based Image Synthesis and Existing 3D Manual Annotations

Xareni Galindo 1,†, Thierno Barry 1,†, Pauline Guyot 2, Charlotte Rivière 2,3,4,a, Rémi Galland 1,b and Florian Levet 1,5,c

1 CNRS, Interdisciplinary Institute for Neuroscience, IINS, UMR 5297, University of Bordeaux, Bordeaux, France
2 Univ. Lyon, Université Claude Bernard Lyon 1, CNRS, Institut Lumière Matière, UMR 5306, 69622, Villeurbanne, France
3 Institut Universitaire de France (IUF), France
4 Institut Convergence PLAsCAN, Centre de Cancérologie de Lyon, INSERM U1052-CNRS, UMR5286, Univ. Lyon, Université Claude Bernard Lyon 1, Centre Léon Bérard, Lyon, France
5 CNRS, INSERM, Bordeaux Imaging Center, BIC, UAR3420, US 4, University of Bordeaux, Bordeaux, France

a https://orcid.org/0000-0002-5048-5662
b https://orcid.org/0000-0001-7117-3281
c https://orcid.org/0000-0002-4009-6225
† These two authors contributed equally
Keywords:
Bioimaging, Deep Learning, Microscopy, Nuclei, 3D, Image Processing, GAN.
Abstract:
Nuclei segmentation is an important task in cell biology analysis that requires accurate and reliable methods, especially within complex low signal-to-noise-ratio images with crowded cell populations. In this context, deep learning-based methods such as Stardist have emerged as the best performing solutions for segmenting nuclei. Unfortunately, the performance of such methods relies on the availability of vast libraries of hand-annotated ground truth datasets, which are especially tedious to create for 3D cell cultures in which nuclei tend to overlap. In this work, we present a workflow to segment nuclei in 3D in such conditions when no specific ground truth exists. It combines a robust 2D segmentation method, Stardist 2D, which has been trained on thousands of already available ground truth datasets, with the generation of pairs of 3D masks and synthetic fluorescence volumes through a conditional GAN. This allows us to train a Stardist 3D model with 3D ground truth masks and synthetic volumes that mimic our fluorescence ones. This strategy makes it possible to segment 3D data for which no ground truth is available, alleviating the need to perform manual annotations and improving on the results obtained by training Stardist with the original ground truth data.
1 INTRODUCTION
The popularity of 3D cell cultures, such as organoids
or spheroids, has recently exploded due to their abil-
ity to offer valuable models to study human biol-
ogy, far more physiologically relevant than 2D cul-
tures (Jensen and Teng, 2020; Kapalczynska et al.,
2018). Nowadays, automatically acquiring hundreds
of organoids in 3D has become a reality thanks to the
advances in microscopy systems (Beghin et al., 2022).
Life scientists have therefore access to distributions
of nuclei in 3D for a wide diversity of cell types and
growing conditions, a key feature forming the basis of
advanced quantitative analysis of important cell func-
tions. However, 3D cellular cultures inherently dis-
play a large diversity of nuclei features, shapes and
arrangements. Moreover, 3D microscopy of such complex samples most often leads to images with lower contrast and/or signal-to-noise ratio compared to 2D microscopy of a single cellular layer, with nuclei overlapping depending on the achieved optical sectioning.
Hence, accurate and automated nuclei segmentation
in these conditions has turned out to be highly com-
plex. The bottleneck has therefore shifted from the
acquisition to the downstream analysis and quantifi-
cation steps.
Many computational solutions have been pro-
posed over the years that use traditional image pro-
cessing methods to tackle this segmentation prob-
lem (Caicedo et al., 2019; Malpica et al., 1997; Li
et al., 2007), in 2D and 3D. However, they are usually
tailored for a specific application and do not general-
ize well, resulting in the necessity to adapt their pa-
rameters and ultimately preventing an automatic and
bias-free analysis.
In parallel, the last few years have witnessed the
rapid emergence of convolutional neural networks
(CNNs) as a method of choice for microscopy image
segmentation, achieving an accuracy unheard of for a
wide range of segmentation tasks. Among those, deep
learning-based nuclei segmentation has been particu-
larly active, with more than one hundred methods be-
ing developed since 2019 (Mougeot et al., 2022).
Supervised approaches, in which pairs of im-
ages and masks are used for training, are the cur-
rent state of the art for nuclei segmentation, with
Stardist (Schmidt et al., 2018; Weigert et al., 2020)
and Cellpose (Stringer et al., 2021) having reached
a prominent position. Their adoption has been fa-
cilitated by their integration into several open-source
platforms, in which life scientists are able to directly
use several pre-trained models (Caicedo et al., 2019)
to segment their images. Unfortunately, these pre-
trained models are limited to the segmentation of nu-
clei in 2D, a fact directly related to the lack of avail-
able ground truth (GT) datasets in 3D. Determining
the organization of nuclei in 3D therefore requires life
scientists to annotate their data themselves, a daunting
and time-consuming task which turns out to be much
more challenging than in 2D due to the lower image
quality and more crowded cell populations.
Generating synthetic training datasets has there-
fore emerged as a potential solution to this problem
through the use of conditional Generative Adversar-
ial Networks (cGANs) (Goodfellow et al., 2014; Isola
et al., 2018; Zhu et al., 2017). cGANs learn a map-
ping from an observed source image x and a ran-
dom noise vector z, to a “transformed” target image
y, G : {x, z} → y. For nuclei segmentation, the use of
such generative networks aims to generate realistic
microscopy images (background, signal to noise ra-
tio, inhomogeneity, . . . ) of nuclei (target style) from
binary masks (source style) by training the cGAN
with paired or unpaired datasets (Baniukiewicz et al.,
2019; Fu et al., 2018; Wang et al., 2022). Never-
theless, GANs are notoriously difficult to train, with a large number of hyperparameters that need to be tuned.
In this work, we present a workflow to segment
nuclei in 3D when no specific GT exists by leverag-
ing cGANs to generate synthetic fluorescence vol-
umes of nuclei from 3D masks. While life scientists
are becoming accustomed to using packaged deep learn-
ing methods for segmentation or classification, image
synthesis is still far from being accessible to regu-
lar users. We therefore made the choice to specif-
ically design our workflow to be accessible to non-
expert life scientists, leveraging already packaged
methods, both for the segmentation and the image
Figure 1: SpCycleGAN fails to generate realistic microscopy images even with different scaling and cropping (binary mask input; size = 1024×1024; size = 512×512 with 128×128 crops; size = 512×512 with 256×256 crops).
generation, with the aim of avoiding any tedious hand-annotation step. With that in mind, our workflow first relies on the segmentation of the individual 2D planes of our acquired volumes using the already pre-trained, well-established and now robust 2D segmentation model Stardist 2D db2018 (Schmidt et al., 2018). After modifying the instance masks by applying a distance transform and a Gaussian filter, we pair these transformed 'GT' masks with the fluorescent nuclei images to train a cGAN specifically designed for biological images, which does not require complex hyperparameter tuning or specific cropping of the data (Han et al., 2020). This allows generating 3D volumes of nuclei that resemble the fluorescent volumes we acquired, from existing 3D GT masks that were typically created for a different cell type or microscopy modality. Finally, those pairs of GT mask volumes and synthetic fluorescent volumes of nuclei are used to train a Stardist 3D model dedicated to our 3D sample type and imaging modality.
Through this workflow, we demonstrate that us-
ing existing works to their full extent can (i) cir-
cumvent the requirement of tedious hand-annotating
steps for the segmentation of nuclei in 3D in com-
plex and crowded environments, and (ii) alleviate the
need to develop new deep learning architectures in an
already crowded field (Mougeot et al., 2022), while
also facilitating their usage by life scientists. We
applied this workflow to a couple of cell lines and microscopy modalities and show that we obtained good qualitative segmentation results without having to specifically hand-annotate new volumes.
This approach greatly widens the scope of use of 3D
segmentation methods in the rapidly growing and di-
versifying field of 3D cell culture analysis, where
many imaging modalities are explored and cell types
with their own morphological characteristics are used.
2 MOTIVATION
We recently acquired, with a confocal microscope, several oncospheres in 3D of the colorectal cancer cell line HCT-116 expressing the fluorescent nuclear label Fucci. These volumes, denoted I_c, are the ones we aim to segment while not having any GT mask for that cell culture type and imaging modality. A couple of years ago we trained a Stardist 3D model (Beghin et al., 2022), Stardist_3D^pre, by creating a GT dataset composed of 7 volumes that we manually annotated and that we will denote as I_GT^LS and M_GT^LS for the images and masks, respectively. These 7 volumes were composed of dif-
ferent 3D cell culture types (oncospheres and neu-
roectoderms), nuclear staining (DAPI and SOX) and
z-steps (500 nm and 1 µm). In addition, we acquired
them with the soSPIM imaging technology (Galland
et al., 2015), a single-objective light-sheet microscope
different from the confocal imaging modality used
to acquire the new oncospheres we aimed to seg-
ment. With this Stardist_3D^pre model, we managed at that time to automatically segment more than one hundred oncospheres and neuroectoderms acquired with the soSPIM imaging modality (Beghin et al., 2022). However, applying Stardist_3D^pre to I_c gave poor results, as it only managed to identify a portion of the nuclei with an overall over-segmentation. This was most probably due to the different cell morphology and imaging modality of I_c as compared to the GT datasets used to train the model.
We therefore tried to generate synthetic volumes having the same style as I_c with SpCycleGAN (Fu et al., 2018), with the final objective of being able to train a new Stardist 3D model more adapted to our data. SpCycleGAN makes it possible to generate synthetic volumes from binary masks without GT, being built on top of the unpaired image-to-image translation model CycleGAN (Zhu et al., 2017). Being unpaired, the network can use I_c as target style without requiring the corresponding nuclei segmentation as source. Originally, the authors of (Fu et al., 2018; Wu et al., 2023) generated 3D nuclei masks by filling volumes with binarized (deformed) ellipsoids, even though this could be limiting as it may not faithfully represent the nuclei morphology. Already having GT nuclei masks, we therefore decided to train SpCycleGAN with M_GT^LS and I_c to avoid this pitfall.

Figure 2: Nuclei segmentation with the 2D db2018 pre-trained Stardist model yields results sufficiently good to train a conditional GAN, scale bar = 20 µm.
In our hands, SpCycleGAN failed to generate realistic microscopy images (Fig. 1). Scrutinizing the toy datasets available with the network, we tested different scalings and croppings of the data, as well as removing any crops containing void regions during training. Unfortunately, none of our tests were conclusive. We hypothesize that this may be related to two reasons. First, the volumes composing I_c exhibit a high level of noise and an overall crowded and overlapping nuclei population. SpCycleGAN may fail to transfer all these features from binary masks. Second, SpCycleGAN may require parameter fine-tuning to work properly, but this goes against our objective of having a more generic workflow that could be used by non-experts.
3 METHOD
The proposed method is based upon the use of well-established and robust deep-learning methods for nuclei segmentation (Stardist (Schmidt et al., 2018; Weigert et al., 2020)) and image synthesis (conditional GANs in a slightly modified version (Han et al., 2020)). Our choices are motivated by the fact that we wanted an as-simple-as-possible backbone. In this manner, we intend to show that a simple approach focusing on data manipulation combined with existing networks can perform qualitatively well. In addition, we propose a workflow that could become generic enough to enable the segmentation of a large variety of 3D biological samples acquired with different imaging modalities.
Figure 3: Manipulating instance masks to generate the FC-
GAN’s source style, scale bar = 5 µm. (a) After applying a
distance transform on the binarized instance masks, the FC-
GAN generates plausible images but with visible artefacts
related to the rough gradient of the transform (red arrows).
(b) Applying a Gaussian filter after the mask binarization
helps the FCGAN to learn a proper mapping between the
source and target style.
3.1 Fully-Conditional GAN
Contrary to natural images in which every local re-
gion contains relevant information, biological images
are composed of a mix of informative and void re-
gions. They are also inherently multiscale with both
the large-scale spatial organization of cells and their
individual morphology and texture being essential.
Since GANs were originally developed for natural images, they tend to struggle to capture these intertwined features and fail to generate realistic void regions.
We therefore based our synthesis method on the fully-conditional GAN (FCGAN) architecture (Han et al., 2020), an improvement of GANs focused on modifications that allow synthesizing multi-scale biological images. This architecture consists of a generator built upon the cascaded refinement network (Chen and Koltun, 2017) instead of the more traditional U-Net (Ronneberger et al., 2015; Isola et al., 2018) architecture, as it is less prone to mode collapse. The authors also added two modifications to the traditional GAN architecture. First, they replaced the input noise vector z with a noise "image", i.e. a 3D tensor whose first two dimensions correspond to spatial positions. Whereas a noise vector limits the size of the output images, modifying the noise image size makes it possible to output synthesized images of arbitrary sizes.
Second, they used a multi-scale discriminator. A fixed-size discriminator limits the quality and coherency of the synthesized images to either a micro (object) or macro (global organization) level. A multi-scale discriminator instead ensures that the generator produces images that are both globally and locally accurate, a desired feature when dealing with biological images that exhibit several levels of organization. All these modifications allow the FCGAN to be trained directly with the acquired images, alleviating the need to fine-tune parameters or to resize and crop out void regions.

Figure 4: Comparison of experimental confocal images (scale bar = 20 µm) and synthetic images generated with the FCGAN style transfer (scale bar = 10 µm).
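To make the multi-scale idea concrete, a minimal sketch of such a discriminator is given below. This is not the FCGAN implementation of (Han et al., 2020) but a hypothetical PyTorch rendition; the number of scales, layers and channels are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchDiscriminator(nn.Module):
    """Small PatchGAN-style discriminator (hypothetical layer sizes)."""
    def __init__(self, in_ch=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # per-patch real/fake score
        )

    def forward(self, x):
        return self.net(x)

class MultiScaleDiscriminator(nn.Module):
    """Judges the (mask, image) pair at several resolutions so that both local
    texture and global organization constrain the generator."""
    def __init__(self, in_ch=2, n_scales=3):
        super().__init__()
        self.discs = nn.ModuleList([PatchDiscriminator(in_ch) for _ in range(n_scales)])

    def forward(self, mask, image):
        x = torch.cat([mask, image], dim=1)  # condition on the source mask
        outputs = []
        for disc in self.discs:
            outputs.append(disc(x))
            x = F.avg_pool2d(x, 2)  # each subsequent discriminator sees a coarser view
        return outputs
```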
3.2 Image Synthesis Based 3D Stardist
Training
In spite of its numerous advantages, FCGAN however requires paired images for training, therefore compelling us to provide corresponding mask and image pairs. Fortunately, 2D pre-trained models of well-established nuclei segmentation methods such as Stardist (Schmidt et al., 2018) or Cellpose (Stringer et al., 2021) have been trained with thousands of 2D GT masks and now achieve a high degree of robustness over a large variety of sample types. Consequently, we can directly determine the nuclei spatial distribution by segmenting each of the 2D frames (1024 × 1024) of the volumes of I_c with the 2D pre-trained db2018 model, Stardist_2D^db (Fig. 2), instead of simulating them. These 2D segmentations will be denoted M_2D^c.
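As an illustration of this per-plane segmentation step, a minimal sketch using the StarDist Python API is given below; we assume the registered pre-trained model name '2D_paper_dsb2018' corresponds to the db2018 model mentioned above, and we use the percentile normalization from the StarDist examples.

```python
import numpy as np
from csbdeep.utils import normalize
from stardist.models import StarDist2D

# Pre-trained model shipped with StarDist (assumed to match the "db2018" model).
model_2d = StarDist2D.from_pretrained('2D_paper_dsb2018')

def segment_volume_per_plane(volume):
    """Segment every z-plane of a (z, y, x) fluorescence volume independently,
    returning a stack of 2D instance masks (our M_2D^c)."""
    masks = []
    for plane in volume:
        labels, _ = model_2d.predict_instances(normalize(plane, 1, 99.8))
        masks.append(labels)
    return np.stack(masks)
```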
For image generation through the FCGAN model, binary and instance masks can lead to fuzzy results because their gradient can be too rough during back-propagation (Long et al., 2021) (Fig. 3(a)). We therefore applied a distance transform to our instance masks (Long et al., 2021), followed by an intensity normalization and a Gaussian filter, which removes sharp intensity jumps and the dependency on the nuclei size, ultimately facilitating the style transfer performed by the FCGAN model (Fig. 3(b)). Applying this process to each mask image of M_2D^c, we obtained a set of transformed mask images dt(M_2D^c). Consequently, pairing dt(M_2D^c) (source style) with the corresponding images from I_c (target style) allowed us to train the FCGAN_c model.
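A possible implementation of this mask transform is sketched below; the per-nucleus normalization of the distance transform and the Gaussian sigma are our assumptions rather than a prescribed recipe.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, gaussian_filter

def mask_to_source_style(instances, sigma=2.0):
    """Turn a 2D instance mask into the smooth source-style image fed to the GAN.

    Each labelled nucleus gets a Euclidean distance transform normalized to
    [0, 1], so the peak intensity no longer depends on the nucleus size, and
    the result is blurred to remove sharp gradients.
    """
    out = np.zeros(instances.shape, dtype=np.float32)
    for label in np.unique(instances):
        if label == 0:  # 0 is background
            continue
        dist = distance_transform_edt(instances == label)
        if dist.max() > 0:
            out += dist / dist.max()
    return gaussian_filter(out, sigma=sigma)
```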
Figure 5: Comparison of the segmentations obtained with Stardist_3D^pre (Beghin et al., 2022), Stardist_3D^bin and our Stardist_3D^c model trained with synthetic volumes. Panels show I_c and each segmentation as xy, xz and yz views (z = 50, y = 302, x = 512).
Finally, we generated I_synth^c, synthetic confocal-styled volumes (Fig. 4), by applying FCGAN_c to the transformed masks dt(M_GT^LS) of the 3D GT dataset M_GT^LS. This was carried out after ensuring that the sizes of the masks composing dt(M_GT^LS) were similar to the sizes of the nuclei present in I_c. We then trained a 3D Stardist model called Stardist_3D^c with the paired I_synth^c volumes and the corresponding 3D masks M_GT^LS.
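For reference, training such a 3D Stardist model on the synthetic pairs could look like the following sketch; the ray count, grid, anisotropy, patch size and number of epochs are illustrative assumptions, not the values used in this work.

```python
from stardist import Rays_GoldenSpiral
from stardist.models import Config3D, StarDist3D

# X_train / Y_train (and the validation lists) are assumed to be lists of
# (z, y, x) numpy arrays: the synthetic volumes I_synth^c and the matching
# GT label volumes from M_GT^LS.
config = Config3D(
    rays=Rays_GoldenSpiral(96),      # rays per star-convex polyhedron
    grid=(1, 2, 2),                  # lateral subsampling to fit memory
    anisotropy=(2.0, 1.0, 1.0),      # z-step larger than the lateral pixel size
    train_patch_size=(48, 96, 96),
    train_epochs=100,
)
model_3d = StarDist3D(config, name='stardist3d_c', basedir='models')
model_3d.train(X_train, Y_train, validation_data=(X_val, Y_val))
```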
4 RESULTS
To assess both qualitatively and quantitatively the segmentation accuracy of our workflow, we compared it to two baselines and used two different cell cultures acquired with different microscopy modalities. In addition to I_c, we thus acquired, with a spinning-disc confocal microscope, 4 suspended spheroids of S180 E-cad–GFP cells (Fu et al., 2022) whose nuclei were stained with DAPI (denoted as I_SD). Indeed, the volumes composing I_c are highly challenging because of their crowded, overlapping nuclei population and the overall level of noise. This makes them unsuitable to evaluate the segmentation accuracy of our workflow other than qualitatively, since it is difficult even by eye to determine the limits of each nucleus. On the contrary, I_SD was composed of a lower number of cells over a shorter thickness (5-10 µm), ensuring a sparser distribution of nuclei per volume (90 nuclei for the 4 volumes) that is easier to assess. On the segmentation side, and in addition to Stardist_3D^pre, the second baseline was obtained by training a 3D Stardist model Stardist_3D^bin to learn how to reconstruct 3D instance masks from binarized masks, i.e. by pairing M_GT^LS with its binarization. 3D segmentation with Stardist_3D^bin was therefore done by applying the model directly on the binarized 2D masks obtained with the pre-trained Stardist_2D^db model on I_c and I_SD.
4.1 Qualitative Assessment with I_c
Applying Stardist_3D^pre (Beghin et al., 2022) gave poor results on the volumes composing I_c for two reasons (Fig. 5). First, this model was trained with masks smaller than the nuclei present in I_c, resulting in over-segmentation. Second, due to the different imaging modality, I_c exhibited more noise, with its nuclei having a different texture than the ones from I_GT^LS. As a result, Stardist_3D^pre missed a lot of nuclei when applied to I_c. On the other hand, Stardist_3D^bin managed to segment more nuclei than Stardist_3D^pre, since the 2D segmentation provided by Stardist_2D^db identified more nuclei on a per-frame basis. Nevertheless, the results show that Stardist_3D^bin struggles to handle isolated 2D masks, resulting in over-segmented small nuclei (Fig. 6 and Fig. 5). On the contrary, the segmentation provided by Stardist_3D^c gave better qualitative results than Stardist_3D^pre and Stardist_3D^bin (Fig. 5). In particular, Stardist_3D^c managed to segment nuclei exhibiting a wide range of intensity values, from dim to intense, while preventing over-segmentation at the same time.

Figure 6: Stardist_3D^bin fails to properly handle isolated 2D masks, over-segmenting them.
4.2 Quantitative Assessment with I_SD
Contrary to I_GT^LS and I_c, which exhibited cells with a bright and homogeneous texture, I_SD is composed of cells having a bright region surrounded by a dimmer region corresponding to the cell cytoplasm (Fig. 7, left). Consequently, even if Stardist_3D^pre managed to identify more than 90% of the nuclei (Table 1), the overall segmentation is imperfect. While the model identified almost all of the brightest nuclei (Fig. 7), explaining the high number of True Positives (TP), False Positives (FP) result from the model separating the bright and dim regions of a cell into 2 objects. False Negatives (FN), on the other hand, mostly originate from some nuclei exhibiting a different texture, for which the model only identified a small part of the membrane. Stardist_3D^bin identified almost 99% of the
nuclei present in I_SD (Table 1), thanks to Stardist_2D^db performing well at segmenting the nuclei in this low-density nuclei distribution. Nevertheless, the segmentation being done on a per-frame basis, it led to the wrong identification of some cytoplasm as nuclei, explaining the large number of FP. Finally, we used our workflow to train a new Stardist 3D model, Stardist_3D^SD, with pairs of synthetic volumes I_synth^SD (generated by transferring the style of I_SD to the transformed masks dt(M_GT^LS)) and masks M_GT^LS. Similarly to Stardist_3D^bin, Stardist_3D^SD successfully detected 99% of the nuclei (Table 1). Nevertheless, directly accounting for the axial direction during segmentation, in comparison to merging 2D masks into 3D as in Stardist_3D^bin, ended up with a lower number of errors, with only 3 FP resulting from over-segmented cells.
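For completeness, detection counts of this kind can be obtained by matching predicted nuclei to reference nuclei; the sketch below uses a greedy one-to-one match on 3D IoU with a 0.5 threshold, an assumption made for illustration rather than the exact criterion used here.

```python
import numpy as np

def detection_counts(reference, prediction, iou_thresh=0.5):
    """Count TP/FP/FN by greedily matching predicted to reference nuclei on 3D IoU."""
    ref_ids = [i for i in np.unique(reference) if i != 0]
    pred_ids = [i for i in np.unique(prediction) if i != 0]
    matched = set()
    tp = 0
    for r in ref_ids:
        r_mask = reference == r
        best_iou, best_p = 0.0, None
        for p in pred_ids:
            if p in matched:
                continue
            p_mask = prediction == p
            inter = np.logical_and(r_mask, p_mask).sum()
            if inter == 0:
                continue
            iou = inter / np.logical_or(r_mask, p_mask).sum()
            if iou > best_iou:
                best_iou, best_p = iou, p
        if best_iou >= iou_thresh:
            tp += 1
            matched.add(best_p)
    fp = len(pred_ids) - len(matched)   # predictions with no reference match
    fn = len(ref_ids) - tp              # reference nuclei that were missed
    return tp, fp, fn
```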
4.3 Discussion
Our workflow relies heavily on the FCGAN image
synthesis capabilities which are, in our context, much
more reliable than a CycleGAN thanks to the FC-
GAN’s multi-scale feature and paired training dataset.
We have shown that using the pre-trained Stardist_2D^db model directly on the acquired microscopy images and modifying the obtained instance masks was sufficient to properly train a FCGAN model.
Table 1: Comparison of the segmentation accuracy between Stardist_3D^pre, Stardist_3D^bin and Stardist_3D^SD on I_SD (Data 1, n = 33; Data 2, n = 23; Data 3, n = 11; Data 4, n = 23).

I_SD    | Stardist_3D^pre  | Stardist_3D^bin  | Stardist_3D^SD
        | TP    FP    FN   | TP    FP    FN   | TP    FP    FN
Data 1  | 97%   15%   3%   | 100%  0%    0%   | 97%   9%    3%
Data 2  | 91%   13%   8%   | 100%  30%   0%   | 100%  0%    0%
Data 3  | 91%   27%   9%   | 100%  36%   0%   | 100%  0%    0%
Data 4  | 87%   9%    13%  | 96%   17%   4%   | 100%  0%    0%
Total   | 92%   14%   8%   | 99%   17%   1%   | 99%   3%    1%
Figure 7: Comparison of the segmentation provided by Stardist_3D^pre, Stardist_3D^bin and Stardist_3D^SD, scale bar = 5 µm. Red arrows pinpoint regions with segmentation errors.
On the contrary, the segmentation provided by Stardist_2D^db was not accurate enough to directly train a Stardist model in 3D such as Stardist_3D^bin. This discrepancy is related to the fact that Stardist_2D^db does not account for the axial direction, with several nuclei therefore being segmented over only one or two frames. This leads to nuclei over-segmentation by Stardist_3D^bin, and therefore to a possibly larger number of FP and FN. In our workflow, Stardist_2D^db is only used to learn the style transfer. Creation of the synthetic volumes is done by applying the new style to dt(M_GT^LS), a dataset composed of hand-annotated masks in which all the nuclei are accurately identified. Since our models Stardist_3D^c and Stardist_3D^SD are trained with these synthetic volumes, they are mostly insensitive to the segmentation accuracy of Stardist_2D^db and are tailored to segment nuclei having the same texture as the acquisitions they have been trained on, therefore outperforming both Stardist_3D^bin and Stardist_3D^pre.
Importantly, our synthetic volumes I_synth^c and I_synth^SD are composed of stacked generated 2D frames, which does not guarantee intensity coherency in the axial direction (Fig. 8). They also lack smooth transitions between some appearing and disappearing nuclei, as one would expect from a real acquisition. Nevertheless, our results confirm previous findings (Baniukiewicz et al., 2019; Wu et al., 2023) showing that training segmentation networks with synthetic volumes allows good performance to be achieved.
As already explained by the authors of Stardist (Schmidt et al., 2018; Weigert et al., 2020) and Cellpose (Stringer et al., 2021), we want to reinforce the fact that it is critical to train these models with data that have objects of similar sizes and features as the images we want to segment. Training Stardist with synthetic volumes, as proposed in our workflow, facilitates this process. It indeed becomes sufficient to resize M_GT^LS to generate images with nuclei exhibiting the same sizes and textures as the ones composing the acquired volumes we want to segment.
Figure 8: As synthetic volumes are composed of stacks of generated frames, a lack of smooth transition between appearing and disappearing nuclei is visible (red arrows), scale bar = 10 µm. Panels show instance masks and the corresponding synthetic images at frames n and n + 1.
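In practice, such a resizing of the GT label volumes can be done with nearest-neighbour interpolation so that the instance labels are preserved; in the sketch below, the scaling factors are assumed to be estimated beforehand, e.g. from the ratio of median nucleus diameters.

```python
from scipy.ndimage import zoom

def rescale_label_volume(labels, factors):
    """Rescale a (z, y, x) GT label volume so its nuclei match the size of the
    nuclei in the volumes to segment.

    order=0 (nearest neighbour) keeps the integer instance labels intact;
    `factors` is a (z, y, x) tuple of scaling ratios, assumed to be estimated
    beforehand (e.g. from median nucleus diameters).
    """
    return zoom(labels, factors, order=0)
```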
Finally, the border detection of the cells composing I_SD by Stardist_3D^SD was not perfect, which can be explained by two reasons (Fig. 7). First, the 3D segmentation provided by Stardist is highly convex, while I_SD exhibits cells with irregular and possibly concave borders. Second, I_SD is composed of volumes with saturated intensity, resulting in a compressed dynamic range. This, combined with the small number of volumes, prevented the FCGAN from learning a perfect style mapping.
5 CONCLUSION
In this work, our objective was not to develop a
method achieving state of the art accuracy for seg-
menting nuclei in 3D. In the past years (Mougeot
et al., 2022), nuclei segmentation has seen the de-
velopment of hundreds of methods competing for
the leading position in terms of segmentation accu-
racy. Unfortunately, most of these techniques are out
of reach for life science labs and imaging facilities
because they can be limited to one operating sys-
tem or do not provide source code, tutorials or toy
datasets (Mougeot et al., 2022). We therefore wanted
to show that it is also possible to achieve good qualita-
tive segmentation by using already established meth-
ods such as Stardist (Schmidt et al., 2018; Weigert
et al., 2020) that are widely used in labs and facilities.
Image synthesis with GANs is still, however, far from
being easily accessible to life scientists. We there-
fore focused on a GAN architecture designed for the
multi-scale organization of biological images. The
FCGAN of Han (Han et al., 2020) was therefore an
ideal choice as it allowed us to directly use our acquired microscopy images without cropping or fine-tuning the parameters of the model.

Figure 9: From the same mask image, two synthetic images were generated, each with a different style.
All things considered, our workflow resulted in
qualitatively good segmentation of microscopy vol-
umes for which no GT existed. We were able to
generate synthetic volumes having the style of dif-
ferent cell types and microscopy modalities from the
same set of 3D GT masks (Fig. 9). Pairing them to-
gether, we trained several Stardist models tailored for
each acquisition, managing to segment datasets with-
out spending weeks in annotating volumes. We also
quantitatively demonstrated that our workflow per-
formed better than a pre-trained Stardist 3D model on
a limited set of 4 volumes. In the future, we plan to
push this quantification further, as well as further test its generalization capability by acquiring new datasets having a higher complexity than the I_SD dataset. We also plan to test Omnipose (Cutler et al., 2022) in our workflow, with the aim of better identifying irregular cell borders.
Another point to consider is that the image synthesis provided by the FCGAN will always be heavily dependent on the quality of the 2D nuclei segmentation. In our case, the pre-trained Stardist_2D^db model gave satisfactory segmentations for feeding the FCGAN model. Otherwise, it would be necessary to generate new 2D manual annotations, a task still a lot easier than 3D annotation and one that can be accelerated by techniques such as SAM (Kirillov et al., 2023).
Finally, we think that this workflow could be
improved by further manipulating the existing GT
masks. GANs require similar distributions of the objects' morphology and spatial organization between the source and target styles to generate realistic images (Liu et al., 2020). Similarly, Stardist would also benefit from training with pairs having the same distributions as the volumes to segment. This requires the ability to quantify the differences between these distributions, as it would make it possible to modify the GT masks so that they become similar to the objects one wants to segment.
REFERENCES
Baniukiewicz, P., Lutton, E. J., Collier, S., and Bretschnei-
der, T. (2019). Generative adversarial networks for
augmenting training data of microscopic cell images.
Frontiers in Computer Science, 1.
Beghin, A., Grenci, G., Sahni, G., Guo, S., Rajendiran, H.,
Delaire, T., Mohamad Raffi, S. B., Blanc, D., de Mets,
R., Ong, H. T., Galindo, X., Monet, A., Acharya, V.,
Racine, V., Levet, F., Galland, R., Sibarita, J.-B., and
Viasnoff, V. (2022). Automated high-speed 3d imag-
ing of organoid cultures with multi-scale phenotypic
quantification. Nature Methods, 19(7):881–892.
Caicedo, J. C., Goodman, A., Karhohs, K. W., Cimini,
B. A., Ackerman, J., Haghighi, M., Heng, C., Becker,
T., Doan, M., McQuin, C., Rohban, M., Singh, S., and
Carpenter, A. E. (2019). Nucleus segmentation across
imaging experiments: the 2018 data science bowl. Na-
ture Methods, 16(12):1247–1253.
Chen, Q. and Koltun, V. (2017). Photographic image syn-
thesis with cascaded refinement networks.
Cutler, K. J., Stringer, C., Lo, T. W., Rappez, L., Strous-
trup, N., Brook Peterson, S., Wiggins, P. A., and
Mougous, J. D. (2022). Omnipose: a high-precision
morphology-independent solution for bacterial cell
segmentation. Nature Methods, 19(11):1438–1448.
Fu, C., Arora, A., Engl, W., Sheetz, M., and Viasnoff, V.
(2022). Cooperative regulation of adherens junction
expansion through epidermal growth factor receptor
activation. Journal of Cell Science, 135(4):jcs258929.
Fu, C., Lee, S., Ho, D. J., Han, S., Salama, P., Dunn, K. W.,
and Delp, E. J. (2018). Three dimensional fluores-
cence microscopy image synthesis and segmentation.
In 2018 IEEE/CVF Conference on Computer Vision
and Pattern Recognition Workshops (CVPRW), pages
2302–23028.
Galland, R., Grenci, G., Aravind, A., Viasnoff, V., Studer,
V., and Sibarita, J.-B. (2015). 3d high- and super-
resolution imaging using single-objective spim. Na-
ture Methods, 12(7):641–644.
Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B.,
Warde-Farley, D., Ozair, S., Courville, A., and Ben-
gio, Y. (2014). Generative adversarial networks.
Han, L., Murphy, R. F., and Ramanan, D. (2020). Learning
generative models of tissue organization with super-
vised gans.
Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2018).
Image-to-image translation with conditional adversar-
ial networks.
Jensen, C. and Teng, Y. (2020). Is it time to start transition-
ing from 2d to 3d cell culture? Frontiers in Molecular
Biosciences, 7.
Kapalczynska, M., Kolenda, T., Przybyla, W., Za-
jaczkowska, M., Teresiak, A., Filas, V., Ibbs, M., Bliz-
niak, R., Luczewski, L., and Lamperska, K. (2018). 2d
and 3d cell cultures – a comparison of different types
of cancer cell cultures. Archives of Medical Science,
14(4):910–919.
Kirillov, A., Mintun, E., Ravi, N., Mao, H., Rolland, C.,
Gustafson, L., Xiao, T., Whitehead, S., Berg, A. C.,
Lo, W.-Y., Dollár, P., and Girshick, R. (2023). Segment anything.
Li, G., Liu, T., Tarokh, A., Nie, J., Guo, L., Mara, A., Hol-
ley, S., and Wong, S. T. (2007). 3d cell nuclei seg-
mentation based on gradient flow tracking. BMC Cell
Biology, 8(1):40.
Liu, Q., Gaeta, I. M., Millis, B. A., Tyska, M. J., and
Huo, Y. (2020). Gan based unsupervised segmenta-
tion: Should we match the exact number of objects.
Long, J., Yan, Z., Peng, L., and Li, T. (2021). The geometric
attention-aware network for lane detection in complex
road scenes. PLOS ONE, 16(7):1–15.
Malpica, N., de Solórzano, C. O., Vaquero, J. J., Santos, A., Vallcorba, I., García-Sagredo, J. M., and del Pozo,
F. (1997). Applying watershed algorithms to the seg-
mentation of clustered nuclei. Cytometry, 28(4):289–
297.
Mougeot, G., Dubos, T., Chausse, F., Péry, E., Grau-
mann, K., Tatout, C., Evans, D. E., and Desset, S.
(2022). Deep learning promises for 3D nuclear
imaging: a guide for biologists. Journal of Cell Sci-
ence, 135(7):jcs258986.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-
net: Convolutional networks for biomedical image
segmentation. In Navab, N., Hornegger, J., Wells,
W. M., and Frangi, A. F., editors, Medical Image Com-
puting and Computer-Assisted Intervention MICCAI
2015, pages 234–241, Cham. Springer International
Publishing.
Schmidt, U., Weigert, M., Broaddus, C., and Myers, G.
(2018). Cell detection with star-convex polygons.
In Frangi, A. F., Schnabel, J. A., Davatzikos, C.,
Alberola-López, C., and Fichtinger, G., editors, Med-
ical Image Computing and Computer Assisted In-
tervention MICCAI 2018, pages 265–273, Cham.
Springer International Publishing.
Stringer, C., Wang, T., Michaelos, M., and Pachitariu, M.
(2021). Cellpose: a generalist algorithm for cellular
segmentation. Nature Methods, 18(1):100–106.
Wang, J., Tabassum, N., Toma, T. T., Wang, Y., Gahlmann,
A., and Acton, S. T. (2022). 3D GAN image synthesis
and dataset quality assessment for bacterial biofilm.
Bioinformatics, 38(19):4598–4604.
Weigert, M., Schmidt, U., Haase, R., Sugawara, K., and
Myers, G. (2020). Star-convex polyhedra for 3d object
detection and segmentation in microscopy. In 2020
IEEE Winter Conference on Applications of Computer
Vision (WACV), pages 3655–3662, Los Alamitos, CA,
USA. IEEE Computer Society.
Wu, L., Chen, A., Salama, P., Winfree, S., Dunn, K. W.,
and Delp, E. J. (2023). Nisnet3d: three-dimensional
nuclear synthesis and instance segmentation for flu-
orescence microscopy images. Scientific Reports,
13(1):9533.
Zhu, J.-Y., Park, T., Isola, P., and Efros, A. A. (2017).
Unpaired image-to-image translation using cycle-
consistent adversarial networks. In Computer Vision
(ICCV), 2017 IEEE International Conference on.