Fruit Defect Detection Using CNN Models with Real and Virtual Data
Renzo Pacheco [1, a], Paula González [1, b], Luis E. Chuquimarca [1,2, c], Boris X. Vintimilla [1, d] and Sergio A. Velastin [3,4, e]
1 ESPOL Polytechnic University, ESPOL, CIDIS, Guayaquil, Ecuador
2 UPSE Santa Elena Peninsula State University, UPSE, FACSISTEL, La Libertad, Ecuador
3 Queen Mary University of London, London, U.K.
4 University Carlos III, Madrid, Spain
Keywords:
Fruit Defects, Convolutional Neural Networks, Real and Virtual Data.
Abstract:
The present study evaluates different CNN models and compares their performance in recognizing a range of defects in apples and mangoes, in order to ensure the quality of the production of these foods. The CNN models InceptionV3, MobileNetV2, VGG16 and DenseNet121 were trained with a dataset of real and synthetic images of apples and mangoes, covering fruit in acceptable quality condition and fruit with defects: rot, bruises, scabs and black spots. Training was performed with variations of the hyper-parameters, and the evaluation metric is accuracy. The MobileNetV2 model achieved the highest accuracy in training and testing, obtaining 97.50% for apples and 92.90% for mangoes, making it the most suitable model for defect detection in these fruits. The InceptionV3 and DenseNet121 models presented accuracy values above 90%, while the VGG16 model obtained the poorest performance, not exceeding 80% accuracy for either fruit. The trained models, especially MobileNetV2, are capable of recognizing a range of defects in the fruits under study with a high degree of accuracy and are suitable for use in the development of automation applications for the quality assessment process of apples and mangoes.
1 INTRODUCTION
The food industry is subject to strict quality standards
that govern all parts of the food production process. In
addition, ensuring the quality of food is of the utmost
importance for health. Therefore, the production of
high-quality food implies that the products meet ac-
ceptable characteristics (color, odor, texture, shape,
and defects) for inspectors and consumers. Accord-
ing to the Food and Agriculture Organization (FAO),
in Latin American countries between 10% and 20% of
harvested fruits and vegetables are discarded for var-
ious reasons, including non-compliance with quality
standards (Munesue et al., 2015).
The early identification of defects in fruits is important to ensure the quality of these foods, to maintain their nutritional value and the satisfaction of the final consumer, and to avoid financial losses for producers. Currently, manual processes are used to inspect the quality of fruits. These techniques are slow and imprecise, and they allow defects to go unnoticed that later cause the fruit to be rejected in quality controls or by consumers. To solve this problem, machine learning techniques have been studied to assess fruit quality more quickly and accurately.
a https://orcid.org/0000-0001-8162-4533
b https://orcid.org/0000-0003-1302-9550
c https://orcid.org/0000-0003-3296-4309
d https://orcid.org/0000-0001-8904-0209
e https://orcid.org/0000-0001-6775-7137
The present work seeks to perform an evaluation
of CNN models for the detection of defects in apples
and mangoes. The evaluation is performed using a
set of real and synthetic images as input data to train
and validate the models. The defects to be identified
are: rot, bruises, scabs and black spots, caused by in-
sects, diseases, climatic conditions and post-harvest
handling.
This article is organized as follows: Section 2 con-
tains a review of fruit defect identification. Section 3
describes the proposed methodology to develop the
work. Section 4 presents the results of the identifi-
cation of apples and mangoes with defects using ex-
isting CNN models in the state of the art. Finally,
conclusions are given in Section 5.
Pacheco, R., González, P., Chuquimarca, L., Vintimilla, B. and Velastin, S.
Fruit Defect Detection Using CNN Models with Real and Virtual Data.
DOI: 10.5220/0011679800003417
In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 4: VISAPP, pages 272-279
ISBN: 978-989-758-634-7; ISSN: 2184-4321
Copyright © 2023 by SCITEPRESS Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
2 LITERATURE REVIEW
This section defines the different types of defects con-
sidered in this study and their corresponding causes.
In addition, current research and advances in defect
detection techniques applied to apples and mangoes
are outlined.
2.1 Fruit and Vegetable Defects
Within the food industry there are several quality con-
trol processes for fruits and vegetables that determine
the quality of the food, based on parameters such as:
color, odor, flavor, morphology and the presence of
defects (Zhang et al., 2014). Each of these aspects is
analyzed in detail to identify critical points in the ap-
pearance of the fruit or vegetable, and thus determine
whether it is prone to deteriorate or lose its nutritional
value before reaching the final consumer.
To determine what type of defect is associated with a fruit or vegetable, imperfections need to be grouped into specific categories, each defined by a possible cause. According to (Nturambirwe and Opara, 2020), defects are divided into the following categories, depending on harvesting practices, subsequent handling and storage conditions:
Pathological Disorders. These defects are
caused by pathogenic microorganisms such as
viruses, bacteria, and fungi. The presence of these
pathogenic entities mainly accelerates the ripen-
ing of the fruit until it rots (Nturambirwe and
Opara, 2020).
Mechanical Damage. Environmental factors and post-harvest handling are the main causes of this type of damage. It manifests as lesions and variations in the external coloration of the fruit; these conditions in turn accelerate ripening and facilitate the development of infections (Nturambirwe and Opara, 2020).
Physiological Disorders. These disorders man-
ifest as changes in fruit flavor, texture and color
and are caused by poor plant nutrition and unsuit-
able temperatures during fruit development (Ntu-
rambirwe and Opara, 2020).
Morphological Disorders. These disorders are changes in the normal appearance and shape of the fruit. This type of defect does not usually alter the intrinsic qualities of the fruit, such as its chemical composition, nutritional value, odor or color; however, it impairs its aesthetics (Nturambirwe and Opara, 2020).
Internal Defects. These defects are not apparent to the naked eye and may be precursors of physiological or morphological disorders or of mechanical damage (Nturambirwe and Opara, 2020).
2.2 Apple and Mango Defects
For the purpose of this project, defects in apples and
mangoes, which can be analyzed visually and are the
most common in quality control tasks for these fruits,
are explored (see Figure 1). The following defects are
considered:
Figure 1: DALL-E mini generated synthetic images of ap-
ple and mango defects: (a) fresh apple, (b) rotten apple, (c)
bruised apple, (d) scabbed apple, (e) fresh mango, (f) rotten
mango, (g) bruised mango, (h) mango with black spots.
Rot. According to (Enicks et al., 2020), this defect originates from various factors such as physical damage caused by the environment or insects, as well as bacterial and fungal infections. It is observable to the naked eye and causes changes in the coloration and texture of the fruit skin: the skin turns brown, darkens progressively and loses firmness. When they appear at the post-harvest stage, the infections that cause this defect can continue to spread slowly even during storage under ideal conditions.
Bruises. According to (Nguyen et al., 2022), bruises on fruits such as mangoes result from improper handling and packing. In general, damage caused by bruising does not go through the skin, but changes the coloration and texture of the affected part. This type of defect is considered mechanical damage.
Scabs. Scabbing is one of the most common con-
ditions on apples. This type of defect is caused by
fungal infections on both leaves and fruit, as in-
dicated by (Koetter et al., 2019). Unlike a bruise,
which does not affect the nutritional and chemical
properties of the fruit, scab makes the fruit inedi-
ble. This defect is manifested by cracked, dry and
dark colored skin in the affected area.
Black Spots. According to (Crane and Gazis,
2020) black spots appear on mango as a symptom
of infection by bacteria such as Xanthomonas ax-
onopodis pv. mangiferaeindicae. The symptoms
of infection include the appearance of star-shaped
lesions that expand and darken over time. Even
the slightest degree of infection causes the fruit to
lose its quality and sales potential.
The defects selected for the study were chosen based on the availability of training data in public databases and on a review of other studies in which these defects are also analyzed. Rot and bruising were selected as defects common to both fruits. In addition, one defect that occurs frequently in each fruit was selected: scab in apples and black spots in mangoes.
2.3 Fruit Defect Detection Techniques
The grading and inspection of fruit quality is carried
out with the purpose of providing the consumer with
a product free of externally perceptible physical de-
fects and with the assurance that the chemical prop-
erties and nutritional value of the product have not
been altered. The main external characteristics evalu-
ated to assign a quality grade to fruits and vegetables
are color, texture, size, shape and the presence of de-
fects. These indicators are key for the identification of
conditions such as diseases that can contaminate en-
tire productions as suggested by (Omar and MatJafri,
2013).
Research by (Fu et al., 2022) indicates that quality control of fruits and vegetables is commonly performed manually, with trained personnel identifying defects. However, human intervention makes this process susceptible to failures caused by omission, ambiguous grading criteria, and the large number of defects that personnel must be able to recognize. (Omar and MatJafri, 2013) argue that defects such as bruising and rot do not always manifest themselves to a degree that allows differentiation from a healthy area of the fruit, which is a challenge for computational evaluation techniques that use criteria such as color, texture and size to perform the analysis.
The use of CNN models is one of the most studied methods for fruit classification and defect detection. (Wu et al., 2020) describe this type of neural network, which implements convolution layers where matrix operations are performed between the input image matrix and a smaller matrix called a kernel to extract features from the input images. These networks also employ pooling layers to reduce the size of the output of the convolution layers. Some CNN models apply regularization techniques, e.g., dropout, to avoid overfitting. Finally, CNNs perform classification by means of a fully connected layer that receives as input the flattened, one-dimensional output of the convolution layers.
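The convolution, pooling and flattening steps described above can be illustrated with a minimal sketch in plain Python. The toy image, the 2x2 kernel and the function names are assumptions chosen for illustration only; they are not taken from any of the cited models, which learn their kernels during training.

```python
def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation of a grayscale image with a kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def flatten(feature_map):
    """Turn a 2D feature map into the 1D vector fed to a fully connected layer."""
    return [value for row in feature_map for value in row]


# Toy 5x5 image (assumption): dark left part (0), bright right part (1).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to a left-to-right brightness increase.
kernel = [[-1, 1], [-1, 1]]

features = conv2d(image, kernel)  # 4x4 feature map, non-zero only at the edge
pooled = max_pool2d(features)     # 2x2 map after 2x2 max pooling
vector = flatten(pooled)          # 4-element vector for the classifier
```

The feature map is non-zero only where the kernel overlaps the dark-to-bright edge, and pooling halves each spatial dimension while keeping that response.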
The study of (Pathak, 2021) proposes a fruit defect detection method based on CNN models. The author used a proprietary model and the pre-trained models AlexNet, LeNet-5, VGG16 and VGG19 to detect rot in apples, bananas and oranges. The results indicate that this method is effective in the classification of defective and non-defective fruits, achieving an accuracy of 98.23% with the proposed model and 90.81% with VGG16, the model with the second-best performance. In that study the transfer learning methods were less effective than the proposed model; however, the models were trained with only 4035 apple images for the two categories "fresh" (1693) and "rotten" (2342).
According to (Wu et al., 2020), CNN models can
be used for the detection of fruit defects at the pre-
harvest and post-harvest stages. In the case of ap-
ples, one of the subject fruits of this study, using
the AlexNet model with 11 layers, an accuracy in
the identification of defective fruits of 92.5% was
achieved having hyperspectral images. In the study
of (Fu et al., 2022), rot detection in apples was also
performed with RGB images as input obtaining the
lowest Mean Squared Error (MSE) and standard de-
viation with the AlexNet model, followed by VGG11.
For defect detection in mangoes, the MobileNetV2 network model achieved an average accuracy of 73.33% in the research of (Zheng and Huang, 2021). The defects detected for the assignment of a quality grade were black spots and blemishes on the outside of the fruit. In apples, bruising can be evidenced by spots on the outside of the fruit; in the study of (Arango et al., 2021), the DenseNet121 model was the most effective, with a Matthews correlation coefficient (MCC) of 0.978, on a dataset of RGB images and in comparison with the AlexNet, ResNet-18, ResNet-50 and VGG19 models.
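The reviewed studies report three different metrics (accuracy, MCC and MSE), which are not directly comparable. The following sketch shows how each is computed using the standard definitions; the confusion-matrix counts are hypothetical and do not come from any of the cited studies.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1]; 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def mse(predicted, target):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


# Hypothetical, imbalanced test set: 90 defective fruits, 10 fresh fruits.
counts = dict(tp=90, tn=5, fp=5, fn=0)
acc_value = accuracy(**counts)
mcc_value = mcc(**counts)
```

With these assumed counts the accuracy is 0.95 while the MCC is close to 0.69, because half of the minority class is misclassified. This is one reason why accuracy and MCC values from different studies should be compared with care.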
In the experiments of (Miah et al., 2021), one of the most promising results for identifying rot-related defects in both apples and mangoes was obtained: the InceptionV3 model reached an accuracy of 97.34%, the highest value among the models compared.
Table 1 details the results of the state of the art
review for fruit defect detection using CNN models,
recording the models, metrics and data used in these
studies.
Table 1: Comparison of CNN models performance for defect detection on apples and mangoes.
Fruit (study)                      Defect       CNN Model     Metric    Performance
Apple (Karakaya et al., 2019)      Rot          ResNet-50     Accuracy  97.00%
Apple/Mango (Miah et al., 2021)    Rot          InceptionV3   Accuracy  97.34%
                                                Xception                97.16%
                                                VGG16                   96.54%
                                                MobileNet               95.57%
                                                NASNetMobile            75.29%
Apple (Pathak, 2021)               Rot          VGG16         Accuracy  90.81%
                                                VGG19                   76.48%
                                                LeNet-5                 82.93%
                                                AlexNet                 83.56%
Apple (Fu et al., 2022)            Rot          GoogleNet     MSE       4.404
                                                VGG11                   3.934
                                                AlexNet                 4.099
                                                ResNet                  4.058
Apple (Arango et al., 2021)        Bruises      VGG19         MCC       0.969
                                                ResNet-50               0.970
                                                DenseNet121             0.978
Mango (Zheng and Huang, 2021)      Black spots  SqueezeNet    Accuracy  96.94%
                                                ResNet-50               86.67%
                                                Enhanced                96.67%
                                                MobileNetV2             73.33%
Apple (Khan et al., 2021)          Scabs        VGG16         Accuracy  87.50%
The studies presented differ in the size of their databases, the partitioning of the data for training and validation, the hyper-parameters selected, and the defects they cover in their input images. These studies limit the databases to fewer than 4000 images on average and fewer than 2000 images per category. Moreover, the mentioned studies only categorize the input images according to the presence or absence of defects, which may influence the ability of the models to detect certain types of defects with less presence in the dataset.
In the following section, the methodology used for
data collection, training and evaluation of the mod-
els is detailed. The selected databases and the pro-
cess of refinement and data augmentation are pre-
sented. Likewise, the generation of a synthetic image
database is detailed. In addition, the architecture of
the selected models and the training process with the
adjusted hyper-parameters will be presented.
3 PROPOSED METHODOLOGY
This section describes the methodology used to obtain the dataset and to train the CNN models selected for the identification of defective fruits. It details the process carried out to reach the necessary number of real and synthetic images of apples and mangoes, and gives a brief description of each model chosen for training and subsequent evaluation.
3.1 Image Acquisition Techniques
There are several image acquisition methods avail-
able depending on the characteristics and defects of
the fruit to be analyzed.
To capture real images, cameras in the RGB do-
main are used to explore surface defects that manifest
themselves in changes of color and texture of the fruit
skin, as in the study by (V G and Pinto, 2021). It is
also possible to use hyperspectral cameras as used in
(Wu et al., 2020) where images were obtained to de-
tect scabs, bruises, rotting and other defects in apples.
In the case of synthetic images, several generation techniques can be highlighted. The generation of synthetic datasets is not a new practice: in (Charco et al., 2021), 3000 images generated in a virtual environment with the CARLA software were used for the estimation of camera poses using a modified ResNet-50 CNN architecture. Ray tracing also makes it possible to recreate objects in virtual environments using the properties of light rays.
The proposal of (Kolker et al., n.d.) consists of modeling objects in 3D and using software to create a realistic visualization of each object, based on calculations of the behavior of light directed at it from different angles and sources. This technique pursues photorealism by applying the natural laws of light propagation, reflection and refraction, which depend on the materials.
(Plowman, 2016) indicates that other methods for
generating synthetic images include modeling and
rendering the objects in tools such as Blender and Un-
real Engine, allowing greater control over the desired
characteristics and conditions in the dataset. Since 3D
modeling from scratch can be a slow and expensive
process, an alternative for generating synthetic data is
the capture of 3D models using LiDAR sensors and
applications that use the data from these sensors to
reconstruct objects in a virtual environment.
Finally, there is the possibility of using recently published machine learning systems, such as the DALL-E and DALL-E mini models, to generate synthetic images. These systems are capable of generating images (high-resolution ones in the case of DALL-E) from input text describing the desired conditions and details of objects or scenes (Han et al., 2022).
In this research work, the dataset generation was
performed by collecting real images obtained with
RGB sensors and available in public databases, and
additionally, synthetic images were generated using
DALL-E mini from short descriptions of the fruits
with the defects selected for the study.
3.2 Dataset Generation
The input dataset is composed of real and synthetic
images. A total of 20,000 images make up the dataset
with 10,000 for each type of image. The real images
were collected from various sources online includ-
ing image repositories, databases, and galleries. The
synthetic images were generated using DALL-E mini.
The dataset contains images of the previously selected
defects as well as the fruits in a fresh state. Real and
synthetic datasets of apple and mango images, with
and without defects, are provided publicly at https://github.com/luischuquim/Healthy-Defective-Fruits.
The reason for generating synthetic images was to increase the volume of input images, given the low availability of public datasets with specific images of the defects selected for classification. The synthetic image dataset was generated using the DALL-E mini text-to-image model. This model, which is based on the BART language encoding model and the VQGAN image decoding model, can generate images from short text, according to (Swords et al., 2022). Using the web application available in the HuggingFace community, it was possible to obtain synthetic images for the defect categories of both fruits.
Given the variety in the quality of the datasets obtained and generated, it was necessary to refine them. The real image datasets were manually inspected to eliminate low-quality images and images that did not correspond to the selected defect classes. In addition, in certain datasets it was necessary to crop the images so that they contained only the region of interest, i.e. the fruits. Due to the lack of real image datasets for the specific defects, it was necessary to apply data augmentation techniques, such as random transformations, contrast changes and rotations, to the images using the OpenCV and Albumentations libraries by means of Python scripts.
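The augmentation operations mentioned above (rotations, contrast changes and other random transformations) can be sketched as follows. This is a simplified stand-in written with plain Python lists, not the OpenCV/Albumentations scripts used in the study; the function names, the restriction to 90-degree rotations, the flip probability and the contrast range are all assumptions made for illustration.

```python
import random


def rotate90(image, times):
    """Rotate a 2D image (list of rows) clockwise by 90 degrees, `times` times."""
    for _ in range(times % 4):
        image = [list(row) for row in zip(*image[::-1])]
    return image


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def adjust_contrast(image, factor):
    """Scale pixel values around mid-gray (128) and clamp to the 8-bit range."""
    return [
        [max(0, min(255, round(128 + factor * (pixel - 128)))) for pixel in row]
        for row in image
    ]


def augment(image, rng):
    """Apply a random rotation, an optional flip and a random contrast change."""
    out = rotate90(image, rng.randrange(4))
    if rng.random() < 0.5:               # assumed flip probability
        out = horizontal_flip(out)
    return adjust_contrast(out, rng.uniform(0.8, 1.2))  # assumed contrast range


rng = random.Random(0)                   # fixed seed for reproducibility
image = [[10, 50], [200, 250]]           # toy 2x2 grayscale image
augmented = [augment(image, rng) for _ in range(3)]
```

Each call to `augment` yields a new variant of the same fruit image, which is how a small set of real images can be expanded to the size required for training.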
In the case of the synthetic dataset, refinement was
performed by manual selection of the images gener-
ated by DALL-E mini due to the variety of images
generated in each batch and their relevance to the
given descriptive text. Finally, the size of the images
was standardized to 256x256 pixels to reduce the size
of the dataset for training. Although the classification
performed during the experiment is binary, i.e., the
images were classified according to the presence or
absence of defects in the fruits, the amounts of input
images were balanced for each type of defect. Table
2 shows the amount of images for each fruit and cate-
gory in the dataset.
3.3 CNN Model Training
For training, the input images were organized in di-
rectories by fruit (apple and mango), each with 10,000
images and a partition of 80% for training, 10% for
testing (only real images), and 10% for validation.
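A minimal sketch of the 80/10/10 partition for one fruit is given below, with the test set drawn only from real images as stated above. The file names are hypothetical, and drawing the validation set from the shuffled mix of the remaining real and synthetic images is an assumption, since the text does not specify its composition.

```python
import random


def split_dataset(real, synthetic, seed=0):
    """Return (train, val, test) with an 80/10/10 split; test uses real images only."""
    rng = random.Random(seed)
    real = list(real)
    synthetic = list(synthetic)
    rng.shuffle(real)
    rng.shuffle(synthetic)

    total = len(real) + len(synthetic)
    n_test = total // 10
    n_val = total // 10

    test = real[:n_test]                   # only real images are used for testing
    remaining = real[n_test:] + synthetic  # assumption: train/val mix both types
    rng.shuffle(remaining)
    val = remaining[:n_val]
    train = remaining[n_val:]
    return train, val, test


# Hypothetical file names for one fruit: 5,000 real and 5,000 synthetic images.
real_images = [f"apple_real_{i:04d}.jpg" for i in range(5000)]
synthetic_images = [f"apple_synthetic_{i:04d}.jpg" for i in range(5000)]

train, val, test = split_dataset(real_images, synthetic_images)
```

For 10,000 images per fruit this yields 8,000 training, 1,000 validation and 1,000 test images.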
The models selected for this study were MobileNetV2, InceptionV3, VGG16 and DenseNet121. The Keras and TensorFlow libraries make it possible to load the models and apply modifications to them. First, the models were loaded using the weights obtained from training with the ImageNet dataset. Then the block of fully connected output layers was removed, since it was designed for the multiclass ImageNet classification and had to be replaced to suit the current binary classification problem.
For the binary classification task, the same block of fully connected output layers was attached to all models: a dense layer with 1024 units and ReLU activation function, followed by a dropout layer, then another dense layer with 512 units and ReLU activation function. Finally, a layer of 2 units with softmax activation function produces a two-element probability vector that is compared with the one-hot encoded class label. It was decided to study the effect of the dropout layer, which is intended to mitigate overfitting, on the performance metrics obtained by the models, so three types of tests were performed: one without the dropout layer and the other two with dropout rates of 0.2 and 0.5.
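To make the role of the dropout and softmax layers concrete, the following plain-Python sketch shows what each does to a vector of activations. It is not the Keras implementation used in the study: the activations, the logits and the class order are hypothetical, and the dropout follows the common "inverted dropout" convention in which surviving units are rescaled during training.

```python
import math
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def softmax(logits):
    """Numerically stable softmax; the outputs are positive and sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


CLASSES = ("defective", "fresh")   # hypothetical label order

rng = random.Random(0)
hidden = [0.5, 1.2, 0.0, 2.0, 0.7]  # hypothetical dense-layer activations

# The three configurations studied: no dropout, rate 0.2 and rate 0.5.
dropped = {rate: dropout(hidden, rate, rng) for rate in (0.0, 0.2, 0.5)}

probs = softmax([2.0, 0.5])         # hypothetical logits of the 2-unit layer
predicted = CLASSES[probs.index(max(probs))]
```

A higher dropout rate silences more units in each training step, which regularizes the dense layers but also removes more information; at inference time (`training=False`) the activations pass through unchanged.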
Table 2: Dataset distribution (number of images).
Fruit   Category     Real images  Synthetic images  Total per category  Total per fruit
Apple   Fresh        2500         2500              5000                10000
        Rot          1000         1000              2000
        Bruise       750          750               1500
        Scab         750          750               1500
Mango   Fresh        2500         2500              5000                10000
        Rot          1000         1000              2000
        Bruise       750          750               1500
        Black spots  750          750               1500
For each fruit, a total of 36 training runs were conducted, combining the four CNN models, the three dropout settings (none, 0.2 and 0.5) and three numbers of epochs (10, 20, 30); the optimizer (RMSprop), learning rate (0.001) and batch size (16) were kept fixed.
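The experimental grid can be enumerated explicitly, which makes the count of 36 runs per fruit easy to verify (4 models x 3 dropout settings x 3 epoch values). The dictionary layout and names below are illustrative assumptions; only the hyper-parameter values come from the text.

```python
from itertools import product

FRUITS = ("apple", "mango")
MODELS = ("MobileNetV2", "InceptionV3", "VGG16", "DenseNet121")
DROPOUT_RATES = (None, 0.2, 0.5)   # None means "no dropout layer"
EPOCHS = (10, 20, 30)
FIXED = {"optimizer": "RMSprop", "learning_rate": 0.001, "batch_size": 16}


def build_runs():
    """Enumerate every training configuration as a dictionary."""
    return [
        {"fruit": fruit, "model": model, "dropout": rate, "epochs": epochs, **FIXED}
        for fruit, model, rate, epochs in product(FRUITS, MODELS, DROPOUT_RATES, EPOCHS)
    ]


runs = build_runs()
runs_per_fruit = {fruit: sum(run["fruit"] == fruit for run in runs) for fruit in FRUITS}
```

Each entry of `runs` could then be passed to a training routine; keeping the fixed hyper-parameters in one place avoids accidental differences between runs.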
4 RESULTS
This section details the results of the CNN models
evaluation based on the previously described method-
ology. For each fruit, a total of 36 training runs were
performed with variations in the CNN models, the
number of epochs and dropout rates used.
4.1 Results with the Apple Image
Dataset
Table 3 shows the results obtained for the dataset of
images of apples with and without defects for the four
selected models and the dropout rate and epoch vari-
ations. The best results for the test accuracy for each
model are highlighted.
Table 3: Comparison of CNN model accuracy with apple image dataset.
CNN Model     Accuracy (%)
InceptionV3   93.40
MobileNetV2   97.50
DenseNet121   94.50
VGG16         76.20
In the results obtained for these training runs, a trend can be found between accuracy and the combination of epochs and dropout rate: accuracy improves when training with more epochs and using a low dropout rate, and the best test accuracy was achieved with 30 epochs and a dropout rate of 0.2. Conversely, not using dropout layers or applying the higher dropout rate of 0.5 causes the accuracy to decrease, in the latter case possibly due to the loss of information in the dense layers.
Figure 2: Defect detection accuracy for apple dataset.
As seen in Figure 2 the MobileNetV2 model
recorded the highest accuracy with the test dataset at
97.50%. The InceptionV3 and DenseNet121 mod-
els presented similar percentages in this metric with
93.40% and 94.50% respectively. The VGG16 model
presented the lowest accuracy with 76.20%.
Another metric considered in this analysis is the loss during each training iteration. This metric indicates how effective the defect detection is during training. With 30 training epochs, 3 of the 4 models (all except InceptionV3) managed to reduce the loss below 0.5.
High values of the loss metric indicate that there is a large difference between the predicted values and the expected values. In the case of the InceptionV3 and MobileNetV2 models, there is a possibility of overfitting, given the high values of the loss function alongside accuracy above 90%. This overfitting phenomenon usually occurs in dense models, since the gradient step must occur throughout the neural network. Another cause of overfitting when using CNNs is the lack of regularization mechanisms that allow the network to learn without memorizing; an alternative to the dropout used in this study is batch normalization, according to (Goutam et al., 2020). Such models are able to recognize the images belonging to the dataset with a high degree of accuracy, but make errors when recognizing new images. To mitigate this problem, it is possible to train these models with fewer epochs and to experiment with more variations in the dropout layer.
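The coexistence of high accuracy and high loss can be illustrated numerically. The sketch below assumes a categorical cross-entropy loss, which is a common choice for a softmax output but is not stated explicitly in this study; the predicted probabilities are hypothetical.

```python
import math


def cross_entropy(prob_true_class):
    """Cross-entropy for one sample, given the probability of its true class."""
    return -math.log(prob_true_class)


def summarize(probs_true_class):
    """Return (accuracy, mean loss) for a binary classifier.

    A sample counts as correct when its true class receives probability > 0.5.
    """
    n = len(probs_true_class)
    acc = sum(p > 0.5 for p in probs_true_class) / n
    loss = sum(cross_entropy(p) for p in probs_true_class) / n
    return acc, loss


# Hypothetical probabilities assigned to the correct class for 10 test images.
moderate = [0.9] * 9 + [0.4]          # one mild mistake
overconfident = [0.99] * 9 + [1e-4]   # one very confident mistake

acc_moderate, loss_moderate = summarize(moderate)
acc_overconfident, loss_overconfident = summarize(overconfident)
```

Both hypothetical classifiers reach 90% accuracy, yet the mean loss of the second is several times higher because a single confidently wrong prediction dominates the average. Under this assumption, a high loss alongside high accuracy points to a few very confident errors rather than to many misclassifications.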
4.2 Results with the Mango Image
Dataset
For the other case study fruit, mango, the results ob-
tained with the selected models resemble those ob-
tained with the apple image dataset. The training runs
were performed using the same hyper-parameters
mentioned before. Table 4 shows the performance of
the models for this dataset. The row corresponding to
the best result obtained in the test accuracy for each
model is highlighted.
Table 4: Comparison of CNN model accuracy with mango image dataset.
CNN Model     Accuracy (%)
InceptionV3   91.90
MobileNetV2   92.90
DenseNet121   92.50
VGG16         63.60
The best results for these training runs were found
with the configuration of 30 epochs and dropout rate
of 0.2 for the MobileNetV2 model, with 92.90%
accuracy reached in testing. For the InceptionV3
and DenseNet121 models, the best performance in
the test dataset was obtained with the 20 and 30
epoch configurations respectively, both without us-
ing dropout layer resulting in accuracy values of
91.90% and 92.50% respectively. For this dataset
and fruit the VGG16 model underperformed reach-
ing only 63.60% accuracy during testing, as shown in
Figure 3.
Figure 3: Defect detection accuracy for mangoes dataset.
As the number of epochs increased, the resulting accuracy remained stable and did not vary abruptly, except with the VGG16 model. The architecture of this model is quite extensive and includes well over one hundred million parameters in its original configuration, so it is possible that the accuracy would stabilize when training with more epochs. In the training with the mango image dataset, all models obtained values higher than 90% from early epochs, indicating a high recognition rate of the input images and a low presence of false positives and false negatives in the results.
For this training dataset, the loss function values
were lower than with the apple dataset. After the 30
epochs, all models managed to reduce this metric to
less than 0.50.
In the case of the InceptionV3 model, it is possible
that overfitting exists given the loss values above 0.50
obtained with 20 and 30 training epochs. For this spe-
cific model, better results can be obtained by training
fewer epochs or applying variations with the dropout
layer.
5 CONCLUSIONS
During the dataset creation stage it was possible to collect a considerable number of images (20,000 in total: 10,000 for apples and 10,000 for mangoes). This dataset contains real and synthetic images: public databases were used for the real images, and the text-to-image system DALL-E mini was used for the synthetic images. The resulting dataset comprises images of healthy and rotten fruits, in addition to images of the other defects defined above: bruises, scabs and black spots.
Since the evaluation of CNN models for defect identification has already been studied, as an improvement we proposed the compilation of a more robust dataset, not only in quantity but also in quality. The use of DALL-E mini made it possible to generate images for training purposes that contain only the fruits with the desired defects, free of background elements and noise. Each defect was represented with an equal number of real and synthetic images in the dataset. As a result, the CNN models were able to identify mangoes and apples with different types of defects.
The generated dataset was used to evaluate the performance of the 4 CNN models, whose purpose was to identify whether the fruits under study in this project (apple and mango) are defective or fresh. The defects considered are rot, bruises, black spots and scabs. The 4 models chosen for this study were MobileNetV2, InceptionV3, DenseNet121 and VGG16, with accuracies in the testing stage of (0.975, 0.929), (0.934, 0.919), (0.945, 0.925) and (0.762, 0.636) respectively; the values in parentheses correspond to the "apple, mango" results. As a result of the training and evaluation of the models, it was possible to determine that MobileNetV2 is the CNN model that best fits the need for binary classification for defect detection in apples and mangoes.
As future work, it is proposed to implement a mul-
ticlass classifier CNN model to classify images by
each defect found. In addition, we will focus on the
implementation of more powerful architectures such
as transformers for the detection of defects in fruits.
ACKNOWLEDGEMENTS
This work has been partially supported by the
ESPOL-CIDIS-11-2022 project.
REFERENCES
Arango, J. D., Staar, B., Baig, A. M., and Freitag, M.
(2021). Quality control of apples by means of convo-
lutional neural networks - comparison of bruise detec-
tion by color images and near-infrared images. Proce-
dia CIRP, 99:290–294.
Charco, J. L., Sappa, A. D., Vintimilla, B. X., and Vele-
saca, H. O. (2021). Camera pose estimation in multi-
view environments: From virtual scenarios to the real
world. Image and Vision Computing, 110:104182.
Crane, J. H. and Gazis, R. (2020). Bacterial black spot (BBS) of mango in Florida: HS1369, 9/2020. EDIS, 2020(5).
Enicks, D. A., Bomberger, R. A., and Amiri, A. (2020). Development of a portable LAMP assay for detection of Neofabraea perennans in commercial apple fruit. Plant Disease, 104(9):2346–2353.
Fu, Y., Nguyen, M., and Yan, W. Q. (2022). Grading meth-
ods for fruit freshness based on deep learning. SN
Computer Science, 3:264.
Goutam, K., Balasubramanian, S., Gera, D., and Sarma,
R. R. (2020). Layerout: Freezing layers in deep neural
networks. SN Computer Science, 1(5):1–9.
Han, K., Wang, Y., Chen, H., Chen, X., Guo, J., Liu, Z.,
Tang, Y., Xiao, A., Xu, C., Xu, Y., et al. (2022). A
survey on vision transformer. IEEE transactions on
pattern analysis and machine intelligence.
Karakaya, D., Ulucan, O., and Turkan, M. (2019). A
comparative analysis on fruit freshness classification.
2019 Innovations in Intelligent Systems and Applica-
tions Conference (ASYU).
Khan, A. I., Quadri, S., and Banday, S. (2021). Deep learn-
ing for apple diseases: classification and identifica-
tion. International Journal of Computational Intelli-
gence Studies, 10(1):1–12.
Koetter, F., Blohm, M., Drawehn, J., Kochanowski, M.,
Goetzer, J., Graziotin, D., and Wagner, S. (2019).
Conversational agents for insurance companies: from
theory to practice. In International Conference on
Agents and Artificial Intelligence, pages 338–362.
Springer.
Kolker, A., Oshchepkova, S., Pershina, Z., Dimitrov, L.,
Ivanov, V., Rashid, A., and Bdiwi, M. The ray tracing
based tool for generation artificial images and neural
network training.
Miah, M. S., Tasnuva, T., Islam, M., Keya, M., Rahman,
M. R., and Hossain, S. A. (2021). An advanced
method of identification fresh and rotten fruits using
different convolutional neural networks. 2021 12th
International Conference on Computing Communica-
tion and Networking Technologies (ICCCNT).
Munesue, Y., Masui, T., and Fushima, T. (2015). The
effects of reducing food losses and food waste on
global food insecurity, natural resources, and green-
house gas emissions. Environmental Economics and
Policy Studies, 17(1):43–77.
Nguyen, T.-V.-L., Nguyen, Q.-D., and Nguyen, P.-B.-D.
(2022). Drying kinetics and changes of total phenolic
content, antioxidant activity and color parameters of
mango and avocado pulp in refractance window dry-
ing. Polish Journal of Food and Nutrition Sciences,
72(1):27–38.
Nturambirwe, J. F. I. and Opara, U. L. (2020). Machine
learning applications to non-destructive defect detec-
tion in horticultural products. Biosystems Engineer-
ing, 189:60–83.
Omar, A. and MatJafri, M. (2013). Principles, method-
ologies and technologies of fresh fruit quality assur-
ance. Quality Assurance and Safety of Crops & Foods,
5:257–271.
Pathak, R. (2021). Classification of fruits using convolutional neural network and transfer learning models. Journal of Management Information and Decision Sciences, 24:1–12.
Plowman, J. (2016). 3D Game Design with Unreal Engine
4 and Blender. Packt Publishing Ltd.
Swords, D. S., Bednarski, B. K., Messick, C. A., Tillman, M. M., Chang, G. J., and You, Y. N. (2022). Quality and location of the surgical episode mediate a large proportion of socioeconomic-based survival disparities in patients with resected stage I–III colon cancer. Annals of Surgical Oncology, 29(1):706–716.
V G, N. and Pinto, A. (2021). Defects Detection in
Fruits and Vegetables Using Image Processing and
Soft Computing Techniques, pages 325–337.
Wu, A., Zhu, J., and Ren, T. (2020). Detection of apple
defect using laser-induced light backscattering imag-
ing and convolutional neural network. Computers &
Electrical Engineering, 81:106454.
Zhang, B., Huang, W., Li, J., Zhao, C., Fan, S., Wu, J., and
Liu, C. (2014). Principles, developments and appli-
cations of computer vision for external quality inspec-
tion of fruits and vegetables: A review. Food Research
International, 62:326–343.
Zheng, B. and Huang, T. (2021). Mango grading sys-
tem based on optimized convolutional neural network.
Mathematical Problems in Engineering, 2021:1–11.