Authors:
Ivan Jarsky; Maxim Kuzin; Valeria Efimova; Viacheslav Shalamov and Andrey Filchenkov
Affiliation:
ITMO University, Kronverksky Pr. 49, St. Petersburg, Russia
Keyword(s):
Vector Graphics, Image Generation, Diffusion Models, Transformer.
Abstract:
Diffusion models generate realistic results for raster images. However, vector image generation is much less successful because of significant differences in image structure. Unlike raster images, vector images consist of paths described by their coordinates, colors, and stroke widths, and the number of paths to be generated is unknown in advance. We tackle the vector image synthesis problem by developing a new diffusion-based model architecture, which we call VectorWeaver, comprising two transformer-based stacked encoders and two transformer-based stacked decoders. To train the model, we collected a dataset of vector images from public resources; however, its size was insufficient. To enrich and enlarge it, we propose new augmentation operations specific to vector images. We also designed a dedicated loss function, which enables the generation of objects with smooth contours and without artifacts. Qualitative experiments demonstrate the superiority and computational efficiency of the proposed model compared to existing vector image generation methods. The vector image generation code is available at https://github.com/CTLab-ITMO/VGLib/tree/main/VectorWeaver.
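To make the abstract's description more concrete, below is a minimal sketch (not the authors' implementation; see the linked repository for VectorWeaver itself) of one plausible way to tensorize vector paths, with coordinates, color, and stroke width per path, and to denoise a fixed-size set of them with stacked transformer encoders and decoders inside a diffusion step. All module names, dimensions, and hyperparameters below are illustrative assumptions.

```python
import torch
import torch.nn as nn

MAX_PATHS = 32        # assumed upper bound on paths per image
POINTS_PER_PATH = 16  # assumed number of control points per path
# per-path features: (x, y) for each point + RGB color + stroke width
PATH_DIM = POINTS_PER_PATH * 2 + 3 + 1


class PathDenoiser(nn.Module):
    """Transformer stack that predicts the noise added to a set of paths."""

    def __init__(self, d_model: int = 256, n_layers: int = 4, n_heads: int = 8):
        super().__init__()
        self.embed = nn.Linear(PATH_DIM, d_model)
        self.time_embed = nn.Sequential(
            nn.Linear(1, d_model), nn.SiLU(), nn.Linear(d_model, d_model)
        )
        enc_layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, n_layers)
        dec_layer = nn.TransformerDecoderLayer(d_model, n_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, n_layers)
        self.out = nn.Linear(d_model, PATH_DIM)

    def forward(self, noisy_paths: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # noisy_paths: (batch, MAX_PATHS, PATH_DIM); t: (batch, 1) diffusion step
        h = self.embed(noisy_paths) + self.time_embed(t).unsqueeze(1)
        memory = self.encoder(h)      # encode the whole path set
        h = self.decoder(h, memory)   # decode conditioned on that memory
        return self.out(h)            # predicted noise per path


if __name__ == "__main__":
    model = PathDenoiser()
    paths = torch.randn(2, MAX_PATHS, PATH_DIM)  # noised path tensors
    t = torch.rand(2, 1)                          # normalized timestep
    eps_hat = model(paths, t)
    print(eps_hat.shape)  # torch.Size([2, 32, 36])
```

In this sketch the unknown number of paths is handled by padding to a fixed maximum, which is only one possible design choice; the paper's actual handling of variable path counts, augmentations, and loss function is described in the full text and repository.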