2 RELATED WORK 
Generating data for training convolutional neural 
network models is a highly relevant topic, and 
various research teams are therefore developing 
algorithms for creating synthetic datasets. 
The topic of synthetic data creation is discussed in 
several resources: 
a) a paper that considers the benefits of 
synthetic data generation for CNN training (The 
Ultimate Guide to Synthetic Data in 2020); 
b) research on using ray tracers to create 
training databases (John B. McCormac, 2018).  
Several tools are able to produce 
synthetic data for CNN training. 
a) A simple GUI-based COCO-style JSON 
polygon mask annotation tool that facilitates the quick 
and efficient crowd-sourced generation of annotation 
masks and bounding boxes. Optionally, a pre-trained 
Mask R-CNN model can be used to produce initial 
segmentations. This tool can be used for manual 
annotation of existing images 
(Hans Krupakar, 2018). 
b) A continuation of the project mentioned in 
the previous paragraph, maintained by a team of 
programmers interested in this field. The 
original functionality has been preserved and refined 
(Hans Krupakar, 2018). 
However, a tool that can create high-quality 
annotated sets of multiple overlapping objects has not 
been implemented yet. 
c) Nvidia Deep Learning Dataset Synthesizer 
(NDDS), a UE4 plugin from Nvidia (J. Tremblay, T. 
To, A. Molchanov, S. Tyree, J. Kautz, S. Birchfield, 
2018; J. Tremblay, T. To, S. Birchfield, 2018) that 
empowers computer vision researchers to export high-
quality synthetic images with metadata. NDDS 
supports images, segmentation, depth, object pose, 
bounding boxes, key points, and custom stencils. In 
addition to the exporter, the plugin includes 
components for generating highly randomized 
images. This randomization covers lighting, 
objects, camera position, poses, textures, and 
distractors, as well as camera path following. 
Together, these components make it possible for 
researchers to easily create randomized scenes for 
training deep neural networks. 
The strong features of the Nvidia tool are: 
- the ability to use a physics engine; 
- the flexibility of GUI-based basic scene 
configuration; 
- the possibility of using colored meshes and RGB-D 
point clouds. 
The main weak features of the Nvidia tool are as 
follows: 
- dependence on UE4 (CUDA and a GPU are required); 
- batch mode is problematic; 
- external scene configuration is complicated to 
implement. 
3 PROPOSED METHOD 
We would like to present an approach and a tool 
that can generate a synthetic dataset for a 
batch of mesh-defined objects in an automatic mode 
based on ray tracing.
 
Ray tracing models the real 
physical processes of light reflection and absorption. 
This approach allows us to generate realistic images 
and can therefore provide high-quality 
training datasets based on artificial images only.  
 
The tool is based on the POV-Ray rendering core 
(POV-Ray – The Persistence of Vision Raytracer). 
The main goal of the current project is to develop a 
Python-based tool for producing artificial images from 
mesh models that can be easily embedded into 
a self-learning pipeline. All instances should be easily 
configurable through text-based config files, and the 
tool should run without heavy package dependencies. 
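As an illustration, the following minimal sketch shows how such a config-driven batch pipeline could look in Python. The config file format, section and key names, and the render_batch function are hypothetical assumptions for this sketch; only the POV-Ray command-line switches (+I input, +O output, +W/+H size, -D display off) are taken from the real renderer.

```python
import configparser
import subprocess
from pathlib import Path

# Hypothetical text config (generator.ini); the section and key
# names are illustrative, not the tool's actual format:
#
#   [render]
#   scene   = scenes/part_01.pov
#   out_dir = dataset/images
#   width   = 800
#   height  = 600
#   count   = 100

def render_batch(config_path: str) -> None:
    cfg = configparser.ConfigParser()
    cfg.read(config_path)
    render = cfg["render"]

    out_dir = Path(render["out_dir"])
    out_dir.mkdir(parents=True, exist_ok=True)

    for i in range(render.getint("count")):
        out_file = out_dir / f"image_{i:05d}.png"
        # Real POV-Ray CLI switches: +I input scene, +O output image,
        # +W/+H image size, -D disable the preview display. Per-frame
        # randomization (object pose, lighting) would be injected into
        # the .pov source before each call.
        subprocess.run(
            ["povray",
             f"+I{render['scene']}",
             f"+O{out_file}",
             f"+W{render['width']}",
             f"+H{render['height']}",
             "-D"],
            check=True,
        )

if __name__ == "__main__":
    render_batch("generator.ini")
```

Driving the renderer through plain text configs like this keeps the tool scriptable in batch mode, in contrast to the GUI-bound configuration noted above for NDDS.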
 
3.1  Process of Image Creation 
Images are generated using the POV-Ray ray tracer. 
The Persistence of Vision Raytracer 
(POV-Ray: Download) is a high-quality, free 
software tool for creating stunning three-dimensional 
graphics (POV-Ray: Hall of Fame). The source code 
is available for those wanting to do their own 
research. 
 
Creating realistic images depends significantly on 
the configuration of the lighting sources. 
Image generation uses one primary white point 
light source for general lighting, with RGB intensity 
(1.0, 1.0, 1.0) and a common brightness factor of 1.0. 
The rotation angle of this source relative to the camera 
can be set, and it is placed at a distance equal to the 
camera offset. Four additional fixed spot light sources 
of low intensity (0.4, 0.4, 0.4) are spaced from the 
Z-axis by angles (75, 0, 0), (-75, 0, 0), (0, 75, 0), and 
(0, -75, 0); their positions cannot be changed from the 
configuration file. The primary light illuminates the 
geometry of the part, highlighting its features. The 
additional lighting sources compensate for the "rigidity" 
of the primary light and provide backlight for shaded 
areas of the parts. 
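A minimal sketch of how this lighting rig could be emitted from Python as POV-Ray scene description language follows. The helper function name, the camera-distance parameter, and the choice to place all lights on the Z-axis before rotating are assumptions of this sketch; light_source, spotlight, point_at, and rotate are standard POV-Ray SDL keywords.

```python
# Fixed fill-light rotations away from the Z-axis, as described above.
FILL_ANGLES = [(75, 0, 0), (-75, 0, 0), (0, 75, 0), (0, -75, 0)]

def lighting_sdl(cam_dist: float, primary_rot=(0, 0, 0)) -> str:
    """Emit POV-Ray SDL for one primary light and four fixed fill lights."""
    decls = []
    # Primary white point light: RGB intensity (1.0, 1.0, 1.0), placed at
    # the camera distance, with a configurable rotation relative to the
    # camera (the only lighting parameter exposed to the config file).
    decls.append(
        "light_source {\n"
        f"  <0, 0, {-cam_dist}>\n"
        "  color rgb <1.0, 1.0, 1.0>\n"
        f"  rotate <{primary_rot[0]}, {primary_rot[1]}, {primary_rot[2]}>\n"
        "}"
    )
    # Four fixed low-intensity spot lights (0.4, 0.4, 0.4) that soften
    # shadows and backlight shaded areas; not configurable externally.
    for rx, ry, rz in FILL_ANGLES:
        decls.append(
            "light_source {\n"
            f"  <0, 0, {-cam_dist}>\n"
            "  color rgb <0.4, 0.4, 0.4>\n"
            "  spotlight point_at <0, 0, 0>\n"
            f"  rotate <{rx}, {ry}, {rz}>\n"
            "}"
        )
    return "\n".join(decls)

print(lighting_sdl(cam_dist=10.0))
```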