Latent-space Laplacian Pyramids for Adversarial
Representation Learning with 3D Point Clouds
Vage Egiazarian¹, Savva Ignatyev¹, Alexey Artemov¹, Oleg Voynov¹, Andrey Kravchenko², Youyi Zheng³, Luiz Velho⁴ and Evgeny Burnaev¹
¹Skolkovo Institute of Science and Technology, Moscow, Russia
²DeepReason.ai, Oxford, U.K.
³State Key Lab, Zhejiang University, China
⁴IMPA, Brazil
andrey.kravchenko@deepreason.ai, youyizheng@zju.edu.cn, lvelho@impa.br, e.burnaev@skoltech.ru
Keywords:
Deep Learning, 3D Point Clouds, Generative Adversarial Networks, Multi-scale 3D Modelling.
Abstract:
Constructing high-quality generative models for 3D shapes is a fundamental task in computer vision with
diverse applications in geometry processing, engineering, and design. Despite the recent progress in deep
generative modelling, synthesis of finely detailed 3D surfaces, such as high-resolution point clouds, from
scratch has not been achieved with existing learning-based approaches. In this work, we propose to employ
the latent-space Laplacian pyramid representation within a hierarchical generative model for 3D point clouds.
We combine the latent-space GAN and Laplacian GAN architectures proposed in recent years to form a
multi-scale model capable of generating 3D point clouds at increasing levels of detail. Our initial evaluation
demonstrates that our model outperforms the existing generative models for 3D point clouds, emphasizing the
need for an in-depth comparative study on the topic of multi-stage generative learning with point clouds.
1 INTRODUCTION
A point cloud is a ubiquitous data structure that has
gained a strong presence in the last few decades with
the widespread use of range sensors. Point clouds
are sets of points in 3D space, commonly produced
by range measurements with 3D scanners (e.g. LI-
DARs, RGB-D cameras, and structured light scan-
ners) or computed using stereo-matching algorithms.
A key use-case with point clouds is 3D surface recon-
struction involved in many applications such as re-
verse engineering, cultural heritage conservation, or
digital urban planning.
Unfortunately, for most scanners the raw 3D mea-
surements often cannot be used in their original form
for surface/shape reconstruction, as they are generally
prone to noise and outliers, non-uniform, and incom-
plete. Whilst constant progress in scanning technol-
ogy has led to improvements in some aspects of data
quality, others, such as occlusion, remain a persistent
issue for objects with complex geometry. Thus, a cru-
cial step in 3D geometry processing is to model full
3D shapes from their sampling as 3D point clouds,
inferring their geometric characteristics from incomplete and noisy measurements. A recent trend in this direction is to apply data-driven methods such as deep generative models (Achlioptas et al., 2018; Li et al., 2018a; Chen et al., 2019).

Figure 1: The proposed latent-space Laplacian GAN operates on latent representations, bypassing the need to process large-scale 3D point clouds. Through using multiple stages of the pyramid, the sample resolution can be iteratively increased.
However, most known deep models operate di-
rectly in the original space of raw measurements,
which represents a challenging task due to the om-
nipresent redundancy in raw data; instead, it might be
beneficial to encode characteristic shape features in
the latent space and further operate with latent repre-
sentations. Another shortcoming is that most models
only operate with coarse (low-resolution) 3D geome-
try, as high-resolution 3D shapes are computationally
demanding to process and challenging to learn from.
In this work, we consider the task of learning 3D
shape representations given their 3D point cloud sam-
ples. To this end, we develop a novel deep cascade
model, Latent-Space Laplacian Pyramid GAN, or
LSLP-GAN, taking advantage of the recent progress
in adversarial learning with point clouds (Achliop-
tas et al., 2018) and deep multi-scale models (Den-
ton et al., 2015; Mandikal and Radhakrishnan, 2019).
Our proposed model (schematically represented in
Figure 1) has several important distinctions: (i) it
is generative and can be used to produce synthetic
shapes unseen during training; (ii) it is able to produce
high-resolution point clouds via a latent space Lapla-
cian pyramid representation; (iii) it is easy to train
as it operates in the space of latent codes, bypassing
the need to perform implicit dimensionality reduction
during training. We train our model in an adversar-
ial regime using a collection of 3D point clouds with
multiple resolutions.
In this work, our contributions are as follows:
1. We propose LSLP-GAN, a novel multi-scale deep
generative model for shape representation learn-
ing with 3D point clouds.
2. We demonstrate by means of numerical experiments the effectiveness of our proposed method for shape synthesis and upsampling tasks.
2 RELATED WORK
Neural Networks on Unstructured 3D Point
Clouds. Although deep convolutional neural networks (CNNs) have proved very effective for learning with 2D grid-structured images, until very recently the same did not hold true for unstructured 3D point clouds.
blocks for point-based architectures (equivalents of
the spatial convolution) are not straightforward to im-
plement. To this end, MLP-based (Qi et al., 2017a;
Qi et al., 2017b), graph convolutional (Wang et al.,
2018), and 3D convolutional point networks (Hua
et al., 2018; Li et al., 2018c; Atzmon et al., 2018) have
been proposed, each implementing their own notion
of convolution, and applied to classification, semantic
labelling, and other tasks (see (Uy et al., 2019) for a
survey). We adopt the building blocks of (Qi et al.,
2017a; Achlioptas et al., 2018) as a basis for the ar-
chitecture of our autoencoders. Building on top of
the success of point convolutions, auto-encoding net-
works have been proposed (Achlioptas et al., 2018;
Yang et al., 2018; Li et al., 2018b) to learn efficient
latent representations.
Generative Neural Networks For 3D Mod-
elling. Literature on generative learning with 3D
shapes shows instances of variational auto-encoders
(VAEs) (Kingma and Welling, 2013) and gener-
ative adversarial networks (GANs) (Goodfellow
et al., 2014) applied to 3D shape generation. On
the one hand, VAEs have been demonstrated to
efficiently model images (Kingma and Welling,
2013), voxels (Brock et al., 2016), and properties
of partially-segmented point clouds (Nash and
Williams, 2017), learning semantically meaningful
data representations. On the other hand, GANs
have been studied in the context of point set gen-
eration from images (Fan et al., 2017), multi-view
reconstruction (Lin et al., 2018), volumetric model
synthesis (Wu et al., 2016), and, more recently,
point cloud processing (Achlioptas et al., 2018; Li
et al., 2018a). However, none of the mentioned approaches provides accurate multi-scale modelling of high-resolution 3D representations, which has been demonstrated to drastically improve image synthesis with GANs (Denton et al., 2015). Additionally,
in several instances, for point cloud generation, some
form of input (e.g., an image) is required (Fan et al.,
2017). In contrast, we are able to generate highly
detailed point clouds unconditionally.
Multi-scale Neural Networks on 3D Shapes. For
2D images, realistic synthesis with GANs has first
been proposed in (Denton et al., 2015) with a coarse-
to-fine Laplacian pyramid image representation. Sur-
prisingly, little work of similar nature has been done
in multi-scale point cloud modelling. (Mandikal and
Radhakrishnan, 2019) propose a deep pyramid neural
network for point cloud upsampling. However, their
model accepts RGB images as input and is unable to
synthesise novel 3D shapes. We, however, operate di-
rectly on unstructured 3D point sets and provide a full
generative model for 3D shapes. (Yifan et al., 2019)
operate on patches in a progressive manner, thus im-
plicitly representing a multi-scale approach. How-
ever, their model specifically tackles the point set up-
sampling task and cannot produce novel point clouds.
In contrast, our generative model purposefully learns
mutually coherent representations of full 3D shapes
at multiple scales through the use of a latent-space
Laplacian pyramid.
Figure 2: Full architecture of the LSLP-GAN model. The network either accepts or generates an initial point cloud X_0 and processes it with a series of K learnable steps. Each step (1) upsamples its input using a non-learnable operator U, (2) encodes the upsampled version into the latent space by f_k, (3) performs correction of the latent code via a conditional GAN G_k, and (4) decodes the corrected latent code using g_k.
3 FRAMEWORK
Our model for learning with 3D point clouds is in-
spired by works on Latent GANs (Achlioptas et al.,
2018) and Laplacian GANs (Denton et al., 2015).
Therefore, we first briefly describe these approaches.
3.1 Latent GANs and Laplacian GANs
The well-known autoencoding neural networks, or autoencoders, compute latent codes h ∈ R^d for an input object X ∈ 𝒳 by means of an encoding network f(X). This latent representation can later be decoded in order to reconstruct the original input, via a decoding network g(h). Latent GANs (Achlioptas et al., 2018) are built around the idea that real latent codes commonly occupy only a subspace of their enclosing space R^d, i.e. live on a manifold embedded in R^d. Thus, it should be possible to synthesise samples by learning the manifold of real latent codes, presumably an easier task compared to learning with the original high-dimensional representations. A GAN model is thus used in addition to an autoencoder model, yielding artificial latent codes obtained as a result of an adversarial game. Such models were recently applied to learning with point clouds, where they compare favourably with GANs that operate on the raw input (Achlioptas et al., 2018).
Laplacian GANs (Denton et al., 2015) increase image resolution in a coarse-to-fine manner during synthesis, aiming to produce high-quality samples of natural images. This is motivated by the fact that standard GANs give acceptable results for the generation of low-resolution images but fail for images of higher resolution. Laplacian GANs overcome this problem by cascading image synthesis with a series of generative networks G_0, ..., G_n, where each network G_k learns to generate a high-frequency residual image r_k = G_k(U(I_{k+1}), z_k) conditioned on the upsampled image U(I_{k+1}) provided by the coarser level k+1. Thus, an image at stage k is represented via¹

I_k = U(I_{k+1}) + G_k(U(I_{k+1}), z_k),    (1)

where U(·) is an upsampling operator and z_k is a noise vector. For point clouds, a change in image resolution translates to resampling, where subsets of points may be selected to form a low-resolution 3D shape, or reconstructed for a higher-resolution 3D shape.
3.2 Representation Learning with
Latent-space Laplacian Pyramids
Spaces of 3D Point Clouds. We start with a series of 3D shape spaces R^{n_0×3}, R^{n_1×3}, ..., R^{n_K×3} with n_0 < n_1 < ... < n_K (specifically, we set n_k = 2^k · n_0). A 3D shape can be represented in the space R^{n_k×3} with a 3D point sampling of its surface X_k = {x_i}_{i=1}^{n_k}. If this sampling maintains sufficient uniformity for all k, then the sequence of 3D point clouds X_0, X_1, ... represents a progressively finer model of the surface. Our intuition in considering the series of shape spaces is that modelling highly detailed 3D point clouds represents a challenge due to their high dimensionality. Thus, it might be beneficial to start with a low-detail (but easily constructed) model X_0 and decompose the modelling task into a sequence of more manageable stages, each aimed at a gradual increase of detail.
¹Note that here, larger values of the index k correspond to coarser pyramid levels (low-resolution residuals), while in our relation (2) below k indexes shapes with increasing resolution.
Training Auto-encoding Point Networks on Multiple Scales. Learning the manifold of latent codes has been demonstrated to be beneficial in terms of reconstruction quality (Achlioptas et al., 2018). Motivated by this observation, we use 3D shape spaces {R^{n_k×3}}_{k=1}^{K} and construct a series of corresponding latent spaces {R^{d_k}}_{k=1}^{K} by training point autoencoders {(f_k, g_k)}_{k=1}^{K}. Note that an autoencoder (f_k, g_k) is trained using the resolution n_k of 3D point clouds, which grows as k increases. As our method strongly relies on the quality of autoencoders, we evaluate reliability of their mappings from 3D space into latent space in Section 4. After training the autoencoders, we fix their parameters and extract latent codes for shapes in each of the 3D shape spaces.
Laplacian Pyramid in the Spaces of Latent Codes. For what follows, it is convenient to assume that we are given as input a point cloud X_{k−1} ∈ R^{n_{k−1}×3}. We aim to go from X_{k−1} to X_k, i.e. to increase the resolution from n_{k−1} to n_k = 2n_{k−1}, generating additional point samples on the surface of an underlying shape. Figure 3 illustrates our reasoning schematically.
Figure 3: A detailed operation scheme of our latent-space Laplacian pyramid (see the accompanying text).
We start by processing the input point cloud by a simple upsampling operator U(·), obtaining a coarse point cloud X̃_k = U(X_{k−1}): for each point x ∈ X_{k−1} we create a new instance x̃ = (1/m) ∑_{i ∈ NN(x)} x_i, where NN(x) is a set of m nearest Euclidean neighbours of x in X_{k−1} (we use m = 7 neighbours, including x), and add it to the point cloud. This procedure represents a simple linear interpolation and forms exactly n_k points located in the vicinity of the real surface. However, the computed point cloud X̃_k generally contains perturbed points, and we view it only as a rough approximation to our desired X_k.
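For illustration, a minimal NumPy sketch of this upsampling operator U (with a brute-force nearest-neighbour search; the function name and the brute-force search are our choices for the sketch, not details of the original implementation):

```python
import numpy as np

def upsample_point_cloud(X, m=7):
    """Double the resolution of a point cloud X of shape (n, 3).

    For every point x we append the centroid of its m nearest Euclidean
    neighbours (x itself included), a simple linear interpolation that
    places the new point in the vicinity of the underlying surface.
    """
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # (n, n) squared distances
    nn_idx = np.argsort(d2, axis=1)[:, :m]                      # m nearest neighbours per point
    new_points = X[nn_idx].mean(axis=1)                         # centroids, shape (n, 3)
    return np.concatenate([X, new_points], axis=0)              # shape (2n, 3)

# Usage: a coarse cloud with n_{k-1} = 512 points becomes a rough X̃_k with 1024 points.
X_coarse = np.random.rand(512, 3).astype(np.float32)
X_tilde = upsample_point_cloud(X_coarse, m=7)
assert X_tilde.shape == (1024, 3)
```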
We map the coarse point cloud X̃_k by f_k into a latent code h̃_k = f_k(X̃_k), that we assume is offset by a small delta from the manifold of latent representations due to an interpolation error in X̃_k. To compensate for this offset in the latent space, we compute an additive correction r_k to h̃_k using a generator network G_k, resulting in a corrected code h_k = h̃_k + r_k = h̃_k + G_k(h̃_k, z_k). Decoding h_k by g_k, we obtain a refined point cloud X_k = g_k(h_k) with resolution n_k. Putting together the full procedure in the space of latent representations leads to a series of relations

h_k = f_k(U(X_{k−1})) + G_k(f_k(U(X_{k−1})), z_k),    (2)

which is a latent-space equivalent of (1). Hence, we call the resulting series {h_k}_{k=0}^{K} of hierarchical representations a latent-space Laplacian pyramid (LSLP).
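To make relation (2) concrete, the sketch below chains the pyramid stages, treating the trained encoder f_k, conditional generator G_k, decoder g_k, and the operator U as plain callables; the function names and signatures are ours, added for illustration, and are not taken from the authors' implementation.

```python
import numpy as np

def lslp_stage(X_prev, f_k, G_k, g_k, z_k, U):
    """One stage of the latent-space Laplacian pyramid, i.e. relation (2).

    X_prev : point cloud of shape (n_{k-1}, 3)
    f_k    : encoder mapping an (n_k, 3) cloud to a latent code
    G_k    : conditional generator mapping (latent code, noise) to a residual
    g_k    : decoder mapping a latent code back to an (n_k, 3) cloud
    U      : the non-learnable upsampling operator
    """
    h_tilde = f_k(U(X_prev))            # rough latent code of the upsampled cloud
    h_k = h_tilde + G_k(h_tilde, z_k)   # additive correction in the latent space
    return g_k(h_k)                     # refined point cloud X_k

def lslp_cascade(X_0, stages, U, noise_dim=32, seed=0):
    """Apply all K stages to an initial cloud X_0 (synthesis or upsampling mode)."""
    rng = np.random.default_rng(seed)
    X = X_0
    for f_k, G_k, g_k in stages:        # stages = [(f_1, G_1, g_1), ..., (f_K, G_K, g_K)]
        z_k = rng.standard_normal(noise_dim).astype(np.float32)
        X = lslp_stage(X, f_k, G_k, g_k, z_k, U)
    return X
```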
Training Latent GANs on Multiple Scales. To perform meaningful corrections of the rough latent code h̃_k, each generator G_k faces a challenge of learning the subtle differences between the latent codes of true high-resolution 3D point clouds and those of coarsely upsampled ones. Thus, we train a series of latent GANs {(G_k, D_k)}_{k=1}^{K} by forcing the generator G_k to synthesise residuals r_k in the latent space conditioned on the input h̃_k, and the discriminator D_k to distinguish between the real latent codes h_k ∈ R^{d_k} and the synthetic ones h̃_k + G_k(h̃_k, z_k). Note that as each (but the first) latent GAN accepts a rough latent code h̃_k, they may be viewed as conditional GANs (CGANs) (Mirza and Osindero, 2014).
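The adversarial objective for one level can be sketched as follows (PyTorch, with standard binary cross-entropy GAN losses; we assume G_k takes the concatenation of the rough code and the noise, which is one common CGAN design and not necessarily the authors' exact choice).

```python
import torch
import torch.nn.functional as F

def latent_gan_losses(D_k, G_k, h_real, h_tilde, z_k):
    """Discriminator and generator losses for one pyramid level.

    h_real  : latent codes of true high-resolution clouds, shape (B, d_k)
    h_tilde : rough latent codes of upsampled clouds,       shape (B, d_k)
    z_k     : noise vectors,                                shape (B, z_dim)
    """
    # Synthetic code = rough code + generated residual (conditional generator).
    r_k = G_k(torch.cat([h_tilde, z_k], dim=1))
    h_fake = h_tilde + r_k

    # Discriminator: push real codes towards 1 and corrected rough codes towards 0.
    logits_real = D_k(h_real)
    logits_fake = D_k(h_fake.detach())
    d_loss = (
        F.binary_cross_entropy_with_logits(logits_real, torch.ones_like(logits_real))
        + F.binary_cross_entropy_with_logits(logits_fake, torch.zeros_like(logits_fake))
    )

    # Generator: make the corrected codes indistinguishable from real ones.
    logits_gen = D_k(h_fake)
    g_loss = F.binary_cross_entropy_with_logits(logits_gen, torch.ones_like(logits_gen))
    return d_loss, g_loss
```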
Two Execution Modes: Synthesis and Upsampling. In the text above, we assumed an initialiser X_0 ∈ R^{n_0×3} to be given as an input, which is the case in particular applications, such as upsampling or shape completion. However, our framework can as easily function in a purely generative mode, sampling unseen high-resolution point clouds on the fly. To enable this, we start with an (unconditional) latent GAN G_0 and produce a point cloud X_0 = g_0(G_0(z_0)), which serves as an input to the remaining procedure. An overview of our architecture is presented in Figure 2.
Architectural and Training Details of our Framework. The architecture of all our networks is based on the one proposed in (Achlioptas et al., 2018), where the autoencoders follow the PointNet (Qi et al., 2017a) encoder design and have fully-connected decoders, and the GANs are implemented as MLPs. When training the autoencoders, we optimise the Earth Mover's Distance (EMD), given by

d_EMD(X, Y) = min_{φ: X→Y} ∑_{x∈X} ||x − φ(x)||_2,

where φ is a bijection, obtained as a solution to the optimal transportation problem involving the two point sets X, Y ∈ R^{n_k×3}. Training the GANs is performed by optimising the commonly used objectives (Goodfellow et al., 2014; Mirza and Osindero, 2014).
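For equal-size point sets, the EMD defined above can be evaluated exactly by solving the underlying assignment problem; below is a small NumPy/SciPy reference sketch added by us for illustration (training-time implementations typically rely on fast approximate solvers instead).

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def earth_movers_distance(X, Y):
    """Exact EMD between two point sets X, Y of equal shape (n, 3).

    Solves min over bijections phi of sum_x ||x - phi(x)||_2 with the
    Hungarian algorithm on the pairwise Euclidean distance matrix.
    """
    assert X.shape == Y.shape
    cost = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)  # (n, n)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()
```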
4 EVALUATION AND
APPLICATIONS
4.1 Setup of our Evaluation
Training Datasets. For all our experiments, we use the meshes from the ShapeNet dataset (Chang et al., 2015). We have performed experiments using separate airplane, table, chair, car, and sofa classes of the ShapeNet dataset, as well as a multi-class setup. We train three stages of autoencoders and generative models on resolutions of n_0 = 512, n_1 = 1024, and n_2 = 2048 points, respectively, using training sets of 3046, 7509, 5778, 6497, and 4348 3D shapes for the classes airplane, table, chair, car, and sofa, respectively, and 9000 shapes for our multi-class setup. We have used Adam (Kingma and Ba, 2014) optimisers to train both the autoencoders and the GANs. All autoencoders have been trained for 500 epochs with an initial learning rate of 5 × 10^{−4}, β_1 = 0.9, and a batch size of 50; the GANs have been trained for 200 epochs with an initial learning rate of 10^{−4}, β_1 = 0.9, and a batch size of 50.
Metrics. Along with the Earth Mover's Distance (EMD), we assess the point cloud reconstruction performance using the Chamfer Distance (CD) as a second commonly adopted measure, given by

d_CD(X, Y) = ∑_{x∈X} min_{y∈Y} ||x − y||_2^2 + ∑_{y∈Y} min_{x∈X} ||x − y||_2^2.

To evaluate the generative models, we employ the Jensen-Shannon Divergence (JSD), coverage (COV), and Minimum Matching Distance (MMD) measures, following the scheme proposed in (Achlioptas et al., 2018). JSD is defined over two empirical measures P and Q as JSD(P ∥ Q) = 1/2 KL(P ∥ M) + 1/2 KL(Q ∥ M), where KL(P ∥ Q) is the Kullback-Leibler divergence between the measures P and Q, M = 1/2 (P + Q), and the measures P and Q count the number of points lying within each voxel of a voxel grid across all point clouds in the two sets A and B, respectively. COV measures the fraction of point clouds in B approximately represented in A; to compute it, we find the nearest neighbour in B for each x ∈ A. MMD reflects the fidelity of the set A with respect to the set B, matching every point cloud of B to the one in A with the minimum distance and computing the average of these distances over the matching.
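For reference, a NumPy sketch of the Chamfer Distance and of MMD/COV computed from the resulting pairwise distances; this is our own illustrative code, and the evaluation protocol of (Achlioptas et al., 2018) may differ in details such as normalisation.

```python
import numpy as np

def chamfer_distance(X, Y):
    """Chamfer Distance between point sets X (n, 3) and Y (m, 3):
    the sum of squared nearest-neighbour distances in both directions."""
    d2 = np.sum((X[:, None, :] - Y[None, :, :]) ** 2, axis=-1)  # (n, m)
    return d2.min(axis=1).sum() + d2.min(axis=0).sum()

def mmd_cov(set_A, set_B):
    """MMD-CD and COV-CD between a generated set A and a reference set B.

    MMD: for each cloud in B, the distance to its closest cloud in A, averaged.
    COV: the fraction of clouds in B that are the nearest neighbour of at
    least one cloud in A.
    """
    D = np.array([[chamfer_distance(a, b) for b in set_B] for a in set_A])  # (|A|, |B|)
    mmd = D.min(axis=0).mean()
    cov = len(np.unique(D.argmin(axis=1))) / len(set_B)
    return mmd, cov
```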
4.2 Experimental Results
Evaluating Autoencoders. We first evaluate our
point autoencoders to validate their ability to compute
efficient latent representations with increasing resolu-
tions of input 3D point clouds. To compute the recon-
structions, we encode into the latent space and decode
back the 3D shapes from the test split of the respec-
tive class unseen during training. In Table 1 we dis-
play the reconstruction quality of our auto-encoders
for the three levels of resolution, using CD and EMD
measures. As the sampling density increases, both
measures improve as expected.
Figure 4: Inputs and reconstructions using our autoencoders at resolutions n_i ∈ {512, 1024, 2048} of the 3D point cloud.
Table 1: Reconstruction quality using our autoencoders at resolutions n_i ∈ {512, 1024, 2048} of the 3D point cloud.

Shape class    CD ×10^3 (512 / 1024 / 2048)    EMD ×10^3 (512 / 1024 / 2048)
chair          0.16 / 0.10 / 0.07              60.2 / 53.5 / 48.3
airplane       0.57 / 0.38 / 0.29              39.4 / 34.5 / 30.8
table          1.41 / 0.96 / 0.67              56.9 / 50.1 / 45.6
Figure 4 demonstrates the ground-truth and de-
coded 3D point clouds, respectively, at all stages in
our autoencoders. We conclude that our models can
represent the 3D point clouds at multiple resolutions.
Evaluating Generative Models. We further eval-
uate our generative models using the MMD-CD,
COV, and JSD measures, in both single-class and
multi-class setups. To this end, we train our LSLP-
GAN using the latent spaces obtained with the pre-
viously trained autoencoders. Table 2 compares our
LSLP-GAN and the L-GAN model (Achlioptas et al.,
2018). We consistently outperform the baseline L-
GAN across all object classes according to the quality
metrics defined above.
Figure 5: Examples of shapes synthesised using our LSLP-GAN model. Left: airplanes, chairs, and tables synthesised using our single-class models. Right: samples of 3D shapes synthesised using our multi-class model; note that the overall geometry of the shape changes slightly due to averaging over many classes. The rightmost figure displays a failure mode of our model.
Table 2: Performance evaluation of our proposed LSLP-GAN model as compared to the baseline L-GAN model (Achlioptas et al., 2018).

Shape class    MMD-CD ×10^3 (L-GAN / Ours)    COV-CD, % (L-GAN / Ours)    JSD ×10^3 (L-GAN / Ours)
car            0.81 / 0.71                    23.5 / 32.1                 28.9 / 24.2
chair          1.79 / 1.71                    44.9 / 47.8                 13.0 / 10.1
sofa           1.26 / 1.23                    43.9 / 46.3                  9.6 /  9.3
table          1.93 / 1.77                    39.7 / 47.8                 19.9 / 10.1
airplane       0.53 / 0.51                    41.7 / 44.0                 17.1 / 13.8
multi-class    1.66 / 1.55                    41.4 / 45.7                 14.3 /  9.8
To demonstrate novel 3D shape synthesis, we sample z_0 and process it with our framework, obtaining 3D point clouds X_0, X_1, X_2, which we display in Figure 5. Our framework can synthesise increasingly detailed 3D shapes, gradually adding resolution using a series of generative models.
Point Cloud Upsampling. Generative models such as L-GAN and our proposed LSLP-GAN are a natural fit for the task of 3D point set upsampling, as they learn to generate novel points given lower-resolution inputs. Thus, we model an upsampling task using the low-resolution 3D shapes from the ShapeNet dataset. We supply LSLP-GAN with a low-resolution point cloud from the test split of the multi-class dataset and increase its resolution four-fold from n_0 = 512 to n_2 = 2048 points, performing conditional generation using G_1 and G_2. Figure 6 displays 3D shapes upsampled using our multi-class model. Note that the model has not been trained to perform upsampling directly, i.e. to preserve the overall shape geometry when producing novel points, hence the subtle changes in 3D shapes as the upsampling progresses.

Figure 6: 3D point cloud upsampling results using our model, initialised with the input shape.
5 CONCLUSION AND FUTURE
WORK
We have presented LSLP-GAN, a novel deep adver-
sarial representation learning framework for 3D point
clouds. The initial experimental evaluation reveals the
promising properties of our proposed model. How-
ever, further investigation into the multi-scale gen-
erative learning methods is needed, including adop-
tion of more recent deep architectures (Zhang et al.,
2019; Uy et al., 2019). Other promising directions
include exploring the limits of the Laplacian pyra-
mid representation and a more extensive experimental
evaluation of our approach. To this end, we plan to
(i) further extend our work, considering deeper pyra-
mid levels and larger upsampling factors (e.g. ×64),
and (ii) conduct a comparative investigation of our framework on more challenging tasks such as shape completion, using deep learning methods, e.g. (Yu et al., 2018b; Yifan et al., 2019; Yu et al., 2018a; Li et al., 2019; Mandikal and Radhakrishnan, 2019; Chen et al., 2019).
ACKNOWLEDGEMENT
The work of Youyi Zheng is partially supported
by the National Key Research & Development Pro-
gram of China (2018YFE0100900). The work of
Vage Egiazarian, Alexey Artemov, Oleg Voynov
and Evgeny Burnaev is supported by The Min-
istry of Education and Science of Russian Fed-
eration, grant No. 14.615.21.0004, grant code:
RFMEFI61518X0004. The work of Luiz Velho is
supported by CNPq/MCTIC/BRICS-STI No 29/2017
Grant No: 442032/2017-0. The authors Vage
Egiazarian, Alexey Artemov, Oleg Voynov, and
Evgeny Burnaev acknowledge the usage of the
Skoltech CDISE HPC cluster Zhores for obtaining re-
sults presented in this paper.
REFERENCES
Achlioptas, P., Diamanti, O., Mitliagkas, I., and Guibas, L.
(2018). Learning representations and generative mod-
els for 3d point clouds. In ICML, pages 40–49.
Atzmon, M., Maron, H., and Lipman, Y. (2018). Point
convolutional neural networks by extension operators.
ACM ToG, 37(4):71.
Brock, A., Lim, T., Ritchie, J. M., and Weston, N.
(2016). Generative and discriminative voxel modeling
with convolutional neural networks. arXiv preprint
arXiv:1608.04236.
Chang, A. X., Funkhouser, T., Guibas, L., Hanrahan,
P., Huang, Q., Li, Z., Savarese, S., Savva, M.,
Song, S., Su, H., et al. (2015). Shapenet: An
information-rich 3d model repository. arXiv preprint
arXiv:1512.03012.
Chen, X., Chen, B., and Mitra, N. J. (2019). Unpaired point
cloud completion on real scans using adversarial train-
ing. arXiv preprint arXiv:1904.00069.
Denton, E. L., Chintala, S., Fergus, R., et al. (2015). Deep
generative image models using a laplacian pyramid of
adversarial networks. In NIPS, pages 1486–1494.
Fan, H., Su, H., and Guibas, L. J. (2017). A point set gen-
eration network for 3d object reconstruction from a
single image. In CVPR, pages 605–613.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B.,
Warde-Farley, D., Ozair, S., Courville, A., and Ben-
gio, Y. (2014). Generative adversarial nets. In NIPS,
pages 2672–2680.
Hua, B.-S., Tran, M.-K., and Yeung, S.-K. (2018). Point-
wise convolutional neural networks. In CVPR, pages
984–993.
Kingma, D. P. and Ba, J. (2014). Adam: A
method for stochastic optimization. arXiv preprint
arXiv:1412.6980.
Kingma, D. P. and Welling, M. (2013). Auto-encoding vari-
ational bayes. arXiv preprint arXiv:1312.6114.
Li, C.-L., Zaheer, M., Zhang, Y., Poczos, B., and Salakhut-
dinov, R. (2018a). Point cloud gan. arXiv preprint
arXiv:1810.05795.
Li, J., Chen, B. M., and Hee Lee, G. (2018b). So-net: Self-
organizing network for point cloud analysis. In CVPR,
pages 9397–9406.
Li, R., Li, X., Fu, C.-W., Cohen-Or, D., and Heng, P.-A.
(2019). Pu-gan: a point cloud upsampling adversarial
network. In CVPR, pages 7203–7212.
Li, Y., Bu, R., Sun, M., Wu, W., Di, X., and Chen, B.
(2018c). Pointcnn: Convolution on x-transformed
points. In NIPS, pages 820–830.
Lin, C.-H., Kong, C., and Lucey, S. (2018). Learning ef-
ficient point cloud generation for dense 3d object re-
construction. In AAAI.
Mandikal, P. and Radhakrishnan, V. B. (2019). Dense 3d
point cloud reconstruction using a deep pyramid net-
work. In WACV, pages 1052–1060. IEEE.
Mirza, M. and Osindero, S. (2014). Conditional generative
adversarial nets. arXiv preprint arXiv:1411.1784.
Nash, C. and Williams, C. K. (2017). The shape varia-
tional autoencoder: A deep generative model of part-
segmented 3d objects. In Computer Graphics Forum,
volume 36, pages 1–12. Wiley Online Library.
Qi, C. R., Su, H., Mo, K., and Guibas, L. J. (2017a). Point-
net: Deep learning on point sets for 3d classification
and segmentation. In CVPR, pages 652–660.
Qi, C. R., Yi, L., Su, H., and Guibas, L. J. (2017b). Point-
net++: Deep hierarchical feature learning on point sets
in a metric space. In NIPS, pages 5099–5108.
Uy, M. A., Pham, Q.-H., Hua, B.-S., Nguyen, T., and Ye-
ung, S.-K. (2019). Revisiting point cloud classifi-
cation: A new benchmark dataset and classification
model on real-world data. In CVPR, pages 1588–
1597.
Wang, Y., Sun, Y., Liu, Z., Sarma, S. E., Bronstein, M. M.,
and Solomon, J. M. (2018). Dynamic graph cnn for
learning on point clouds. CoRR, abs/1801.07829.
Wu, J., Zhang, C., Xue, T., Freeman, B., and Tenenbaum,
J. (2016). Learning a probabilistic latent space of ob-
ject shapes via 3d generative-adversarial modeling. In
NIPS, pages 82–90.
Yang, Y., Feng, C., Shen, Y., and Tian, D. (2018). Fold-
ingnet: Point cloud auto-encoder via deep grid defor-
mation. In CVPR, pages 206–215.
Yifan, W., Wu, S., Huang, H., Cohen-Or, D., and Sorkine-
Hornung, O. (2019). Patch-based progressive 3d point
set upsampling. In CVPR, pages 5958–5967.
Yu, L., Li, X., Fu, C.-W., Cohen-Or, D., and Heng, P.-A.
(2018a). Ec-net: an edge-aware point set consolida-
tion network. In ECCV, pages 386–402.
Yu, L., Li, X., Fu, C.-W., Cohen-Or, D., and Heng, P.-A.
(2018b). Pu-net: Point cloud upsampling network. In
CVPR, pages 2790–2799.
Zhang, Z., Hua, B.-S., and Yeung, S.-K. (2019). Shell-
net: Efficient point cloud convolutional neural net-
works using concentric shells statistics. In CVPR,
pages 1607–1616.