Fruit Defect Detection Using CNN Models with Real and Virtual Data

Renzo Pacheco (1,a), Paula González (1,b), Luis E. Chuquimarca (1,2,c), Boris X. Vintimilla (1,d) and Sergio A. Velastin (3,4,e)

(1) ESPOL Polytechnic University, ESPOL, CIDIS, Guayaquil, Ecuador

(2) UPSE Santa Elena Peninsula State University, UPSE, FACSISTEL, La Libertad, Ecuador

(3) Queen Mary University of London, London, U.K.

(4) University Carlos III, Madrid, Spain

Keywords:

Fruit Defects, Convolutional Neural Networks, Real and Virtual Data.

Abstract:

This study evaluates several CNN models and compares their performance in recognizing a range of defects in apples and mangoes, with the aim of ensuring the quality of the production of these foods. The CNN models InceptionV3, MobileNetV2, VGG16 and DenseNet121 were trained with a dataset of real and synthetic images of apples and mangoes covering fruit in acceptable quality condition and fruit with defects: rot, bruises, scabs and black spots. Training was performed with variations of the hyper-parameters, and accuracy was used as the evaluation metric. The MobileNetV2 model achieved the highest accuracy in training and testing, obtaining 97.50% for apples and 92.90% for mangoes, making it the most suitable model for defect detection in these fruits. The InceptionV3 and DenseNet121 models presented accuracy values above 90%, while the VGG16 model obtained the poorest performance, not exceeding 80% accuracy for either fruit. The trained models, especially MobileNetV2, are capable of recognizing a range of defects in the fruits under study with a high degree of accuracy and are suitable for use in the development of automation applications for the quality assessment process of apples and mangoes.

1 INTRODUCTION

The food industry is subject to strict quality standards

that govern all parts of the food production process. In

addition, ensuring the quality of food is of the utmost

importance for health. Therefore, the production of

high-quality food implies that the products meet ac-

ceptable characteristics (color, odor, texture, shape,

and defects) for inspectors and consumers. Accord-

ing to the Food and Agriculture Organization (FAO),

in Latin American countries between 10% and 20% of

harvested fruits and vegetables are discarded for var-

ious reasons, including non-compliance with quality

standards (Munesue et al., 2015).

The early identification of defects in fruits is important to ensure the quality of these foods, to maintain their nutritional value and the satisfaction of the final consumer, and to avoid financial losses for producers. Currently, manual processes are used to inspect the quality of fruits. These techniques are slow, imprecise and give room for the appearance of defects that cause the fruit to be rejected in quality controls or by consumers. To solve this problem, machine learning techniques have been studied to assess fruit quality more quickly and accurately.

(a) https://orcid.org/0000-0001-8162-4533

(b) https://orcid.org/0000-0003-1302-9550

(c) https://orcid.org/0000-0003-3296-4309

(d) https://orcid.org/0000-0001-8904-0209

(e) https://orcid.org/0000-0001-6775-7137

The present work seeks to perform an evaluation

of CNN models for the detection of defects in apples

and mangoes. The evaluation is performed using a

set of real and synthetic images as input data to train

and validate the models. The defects to be identiﬁed

are: rot, bruises, scabs and black spots, caused by in-

sects, diseases, climatic conditions and post-harvest

handling.

This article is organized as follows: Section 2 con-

tains a review of fruit defect identiﬁcation. Section 3

describes the proposed methodology to develop the

work. Section 4 presents the results of the identiﬁ-

cation of apples and mangoes with defects using ex-

isting CNN models in the state of the art. Finally,

conclusions are given in Section 5.


Pacheco, R., González, P., Chuquimarca, L., Vintimilla, B. and Velastin, S.

Fruit Defect Detection Using CNN Models with Real and Virtual Data.

DOI: 10.5220/0011679800003417

In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 4: VISAPP, pages 272-279

ISBN: 978-989-758-634-7; ISSN: 2184-4321

© 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)

2 LITERATURE REVIEW

This section deﬁnes the different types of defects con-

sidered in this study and their corresponding causes.

In addition, current research and advances in defect

detection techniques applied to apples and mangoes

are outlined.

2.1 Fruit and Vegetable Defects

Within the food industry there are several quality con-

trol processes for fruits and vegetables that determine

the quality of the food, based on parameters such as:

color, odor, ﬂavor, morphology and the presence of

defects (Zhang et al., 2014). Each of these aspects is

analyzed in detail to identify critical points in the ap-

pearance of the fruit or vegetable, and thus determine

whether it is prone to deteriorate or lose its nutritional

value before reaching the ﬁnal consumer.

To determine what type of defect is associated with a fruit or vegetable, imperfections need to be grouped into specific categories that define a possible cause of the defect. According to (Nturambirwe and Opara, 2020), these defects can be divided into the following categories, depending on harvesting practices, subsequent handling and storage conditions:

• Pathological Disorders. These defects are

caused by pathogenic microorganisms such as

viruses, bacteria, and fungi. The presence of these

pathogenic entities mainly accelerates the ripen-

ing of the fruit until it rots (Nturambirwe and

Opara, 2020).

• Mechanical Damage. Environmental factors and post-harvest handling are the main causes of this type of damage. It manifests as lesions and variations in the external coloration of the fruit; these conditions in turn accelerate ripening and facilitate the development of infections (Nturambirwe and Opara, 2020).

• Physiological Disorders. These disorders manifest as changes in fruit flavor, texture and color and are caused by poor plant nutrition and unsuitable temperatures during fruit development (Nturambirwe and Opara, 2020).

• Morphological Disorders. These disorders refer to changes in the normal appearance and shape of the fruit. Defects of this type do not usually alter the intrinsic qualities of the fruit, such as its chemical composition, nutritional value, odor and color; however, they impair its aesthetics (Nturambirwe and Opara, 2020).

• Internal Defects. These defects are not apparent to the naked eye and may be precursors of physiological or morphological disorders or of mechanical damage (Nturambirwe and Opara, 2020).

2.2 Apple and Mango Defects

For the purpose of this project, defects in apples and

mangoes, which can be analyzed visually and are the

most common in quality control tasks for these fruits,

are explored (see Figure 1). The following defects are

considered:

Figure 1: DALL-E mini generated synthetic images of ap-

ple and mango defects: (a) fresh apple, (b) rotten apple, (c)

bruised apple, (d) scabbed apple, (e) fresh mango, (f) rotten

mango, (g) bruised mango, (h) mango with black spots.

• Rot. According to (Enicks et al., 2020), this defect originates from various factors such as physical damage caused by the environment, insects, and bacterial and fungal infections. It is observable to the naked eye and causes changes in the coloration and texture of the fruit skin: the skin turns brown, darkens progressively and loses firmness. When they appear at the post-harvest stage, the infections that cause this defect can continue to spread slowly even during storage under ideal conditions.

• Bruises. According to (Nguyen et al., 2022), bruises on fruits such as mangoes result from improper handling and packing. In general, damage caused by bruising does not go through the skin, but changes the coloration and texture of the affected part. This type of defect is considered mechanical damage.

• Scabs. Scabbing is one of the most common con-

ditions on apples. This type of defect is caused by

fungal infections on both leaves and fruit, as in-

dicated by (Koetter et al., 2019). Unlike a bruise,

which does not affect the nutritional and chemical

properties of the fruit, scab makes the fruit inedi-

ble. This defect is manifested by cracked, dry and

dark colored skin in the affected area.


• Black Spots. According to (Crane and Gazis, 2020), black spots appear on mango as a symptom of infection by bacteria such as Xanthomonas axonopodis pv. mangiferaeindicae. The symptoms of infection include the appearance of star-shaped lesions that expand and darken over time. Even the slightest degree of infection causes the fruit to lose its quality and sales potential.

The defects selected for the study were chosen based on the availability of data for training the CNN models in publicly available databases and on a review of other studies in which these defects are also analyzed. Rot and bruising were selected as defects common to both fruits. In addition, one defect that occurs more frequently in each fruit was selected: scab in apples and black spots in mangoes.

2.3 Fruit Defect Detection Techniques

The grading and inspection of fruit quality is carried

out with the purpose of providing the consumer with

a product free of externally perceptible physical de-

fects and with the assurance that the chemical prop-

erties and nutritional value of the product have not

been altered. The main external characteristics evalu-

ated to assign a quality grade to fruits and vegetables

are color, texture, size, shape and the presence of de-

fects. These indicators are key for the identiﬁcation of

conditions such as diseases that can contaminate en-

tire productions as suggested by (Omar and MatJafri,

2013).

Research by (Fu et al., 2022) indicates that quality control of fruits and vegetables is commonly performed manually, with trained personnel identifying defects. However, human intervention makes this process susceptible to failures caused by omission, by ambiguous grading criteria, and by the number of defects that personnel must be able to recognize. (Omar and MatJafri, 2013) argue that defects such as bruising and rot do not always manifest themselves to a degree that allows differentiation from a healthy area of fruit, which is a challenge for computational evaluation techniques that use criteria such as color, texture and size to perform the analysis.

The use of CNN models is one of the most studied methods for fruit classification and defect detection. (Wu et al., 2020) describe this type of neural network, which implements convolution layers where matrix operations are performed between the input image matrix and a smaller matrix called a kernel in order to extract features from the input images. These networks also employ pooling layers to reduce the size of the output of the convolution layers. Some CNN models apply regularization techniques, e.g., dropout, to avoid overfitting. Finally, CNNs perform classification by means of a fully connected layer that receives the flattened, one-dimensional output of the convolution layers.
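The convolution, pooling and flattening steps described above can be illustrated with a minimal pure-Python sketch. The toy image, the kernel and the function names are assumptions chosen for illustration; real CNN layers learn their kernels and operate on multi-channel tensors.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most CNN libraries, this is technically a cross-correlation:
    the kernel is not flipped before the element-wise products are summed.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def flatten(feature_map):
    """Turn a 2D feature map into the 1D vector fed to a dense layer."""
    return [value for row in feature_map for value in row]


# Hypothetical 5x5 grayscale patch with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to left-to-right intensity increases.
kernel = [[-1, 1], [-1, 1]]

features = conv2d_valid(image, kernel)   # 4x4 feature map
pooled = max_pool2d(features, size=2)    # 2x2 map after pooling
vector = flatten(pooled)                 # 4-element input for a dense layer
```

The feature map responds only at the column where the intensity changes, and pooling halves each spatial dimension while keeping that response.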

The study of (Pathak, 2021) proposes a fruit defect detection method based on CNN models. It used a proprietary model and the pre-trained models AlexNet, LeNet-5, VGG16 and VGG19 to detect rot in apples, bananas and oranges. The results indicate that this method is effective in the classification of defective and non-defective fruits, achieving an accuracy of 98.23% with the proposed model and 90.81% with VGG16, the model with the second-best performance. In that study the transfer learning methods were found to be less effective than the proposed model; however, the models were trained with only 4035 apple images for the two categories "fresh" (1693) and "rotten" (2342).

According to (Wu et al., 2020), CNN models can be used for the detection of fruit defects at the pre-harvest and post-harvest stages. In the case of apples, one of the fruits studied here, an AlexNet model with 11 layers achieved an accuracy of 92.5% in the identification of defective fruits using hyperspectral images. In the study of (Fu et al., 2022), rot detection in apples was also performed with RGB images as input, obtaining the lowest Mean Squared Error (MSE) and standard deviation with the AlexNet model, followed by VGG11.

For defect detection in mangoes, the MobileNetV2 network model achieved an average accuracy of 73.33% in the research of (Zheng and Huang, 2021). The defects detected for the assignment of a quality grade were the presence of black spots and blemishes on the outside of the fruit. In apples, bruising can be evidenced by spots on the outside of the fruit. In the study of (Arango et al., 2021), which analyzed a dataset of RGB images, the DenseNet121 model was the most effective, with a Matthews correlation coefficient (MCC) of 0.978, compared with the AlexNet, ResNet-18, ResNet-50 and VGG19 models.
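Since the MCC is a coefficient in the range [-1, 1] rather than a percentage, a short sketch of how it is computed from a binary confusion matrix may help when comparing it with accuracy figures. The counts below are hypothetical and are not taken from any of the cited studies.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns a value in [-1, 1]: 1 is perfect agreement, 0 is no better
    than chance, and -1 is total disagreement.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is taken as 0 when a row or column is empty.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts: 90 bruised apples detected, 10 missed,
# 85 healthy apples accepted, 15 wrongly flagged as bruised.
mcc = matthews_corrcoef(tp=90, tn=85, fp=15, fn=10)
accuracy = (90 + 85) / (90 + 85 + 15 + 10)
```

With these counts the accuracy is 0.875 while the MCC is close to 0.75, which shows why values of the two metrics should not be compared directly.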

In the experiments of (Miah et al., 2021), one of the most promising results for identifying rot-related defects in both apples and mangoes was obtained: the InceptionV3 model reached an accuracy of 97.34%, the highest among the models evaluated in that study.

Table 1 details the results of the state of the art

review for fruit defect detection using CNN models,

recording the models, metrics and data used in these

studies.

The studies presented differ in the size of their databases, the partitioning of the data for training and validation, the hyper-parameters selected, and the defects they cover in their input images. These studies limit the databases to less than 4000 images on average and less than 2000 images per category. On the other hand, the defects covered in the mentioned studies are limited to categorizing the input images according to the presence or not of defects, which may influence the ability of the models to detect certain types of defects with less presence in the dataset.

VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications

Table 1: Comparison of CNN models performance for defect detection on apples and mangoes.

Fruit (reference)                  Defect        CNN Model      Metric     Performance
Apple (Karakaya et al., 2019)      Rot           ResNet-50      Accuracy   97.00%
Apple/Mango (Miah et al., 2021)    Rot           InceptionV3    Accuracy   97.34%
                                                 Xception                  97.16%
                                                 VGG16                     96.54%
                                                 MobileNet                 95.57%
                                                 NASNetMobile              75.29%
Apple (Pathak, 2021)               Rot           VGG16          Accuracy   90.81%
                                                 VGG19                     76.48%
                                                 LeNet-5                   82.93%
                                                 AlexNet                   83.56%
Apple (Fu et al., 2022)            Rot           GoogleNet      MSE        4.404
                                                 VGG11                     3.934
                                                 AlexNet                   4.099
                                                 ResNet                    4.058
Apple (Arango et al., 2021)        Bruises       VGG19          MCC        0.969
                                                 ResNet-50                 0.970
                                                 DenseNet121               0.978
Mango (Zheng and Huang, 2021)      Black spots   SqueezeNet     Accuracy   96.94%
                                                 ResNet-50                 86.67%
                                                 Enhanced                  96.67%
                                                 MobileNetV2               73.33%
Apple (Khan et al., 2021)          Scabs         VGG16          Accuracy   87.50%

In the following section, the methodology used for

data collection, training and evaluation of the mod-

els is detailed. The selected databases and the pro-

cess of reﬁnement and data augmentation are pre-

sented. Likewise, the generation of a synthetic image

database is detailed. In addition, the architecture of

the selected models and the training process with the

adjusted hyper-parameters will be presented.

3 PROPOSED METHODOLOGY

This section describes the methodology used to obtain the dataset and to train the CNN models selected for the identification of defective fruits. It details the process carried out to reach the necessary amount of real and synthetic images of apples and mangoes. In addition, a brief description of each model chosen for training and subsequent evaluation is given.

3.1 Image Acquisition Techniques

There are several image acquisition methods avail-

able depending on the characteristics and defects of

the fruit to be analyzed.

To capture real images, cameras in the RGB do-

main are used to explore surface defects that manifest

themselves in changes of color and texture of the fruit

skin, as in the study by (V G and Pinto, 2021). It is

also possible to use hyperspectral cameras as used in

(Wu et al., 2020) where images were obtained to de-

tect scabs, bruises, rotting and other defects in apples.

In the case of synthetic images, several generation techniques can be highlighted. The generation of synthetic datasets is not a new practice: in (Charco et al., 2021), 3000 images generated in a virtual environment with the CARLA software were used for the estimation of camera poses using a modified ResNet-50 CNN architecture. Also, by means of ray tracing it is possible to recreate objects in virtual environments using the properties of light rays.

The proposal of (Kolker et al., ) consists of modeling objects in 3D and using software to create a realistic visualization of each object by performing calculations based on the behavior of light directed at it from different angles and sources. This technique pursues photorealism using the natural laws of light propagation, reflection, and refraction, which depend on the materials.


(Plowman, 2016) indicates that other methods for

generating synthetic images include modeling and

rendering the objects in tools such as Blender and Un-

real Engine, allowing greater control over the desired

characteristics and conditions in the dataset. Since 3D

modeling from scratch can be a slow and expensive

process, an alternative for generating synthetic data is

the capture of 3D models using LiDAR sensors and

applications that use the data from these sensors to

reconstruct objects in a virtual environment.

Finally, there is the possibility of using recently published tools that generate synthetic images with machine learning systems, such as the DALL-E and DALL-E mini models. These types of systems are capable of generating images (high-resolution ones in the case of DALL-E) from input text describing the desired conditions and details of objects or scenes (Han et al., 2022).

In this research work, the dataset generation was

performed by collecting real images obtained with

RGB sensors and available in public databases, and

additionally, synthetic images were generated using

DALL-E mini from short descriptions of the fruits

with the defects selected for the study.

3.2 Dataset Generation

The input dataset is composed of real and synthetic

images. A total of 20,000 images make up the dataset

with 10,000 for each type of image. The real images

were collected from various sources online includ-

ing image repositories, databases, and galleries. The

synthetic images were generated using DALL-E mini.

The dataset contains images of the previously selected

defects as well as the fruits in a fresh state. Real and

synthetic datasets of apple and mango images, with

and without defects, are provided publicly at https://github.com/luischuquim/Healthy-Defective-Fruits.

The reason for generating synthetic images was to increase the volume of input images, given the low availability of public datasets with specific images of the defects selected for classification. The synthetic image dataset was generated using the DALL-E mini text-to-image model. According to (Swords et al., 2022), this model, which is based on the BART language model as encoder and the VQGAN model as decoder, can generate images from short text. Using the web application available in the HuggingFace community, it was possible to obtain synthetic images for the defect categories of both fruits.

Given the variety in the quality of the datasets obtained and generated, it was necessary to refine them. The real image datasets were manually inspected to eliminate low-quality images and images that did not correspond to the selected defect classes. In addition, in certain datasets it was necessary to crop the images so that they contained only the region of interest, i.e. the fruits. Due to the lack of real image datasets for the specific defects, it was necessary to apply data augmentation techniques, such as randomly applied transformations, contrast changes and rotations, using the OpenCV and Albumentations libraries by means of Python scripts.
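The kind of augmentation described above can be sketched as follows. This is only an illustration of the idea in pure Python on a grayscale image stored as a list of rows; it is a stand-in for, and not a reproduction of, the OpenCV and Albumentations calls used in the study, and the function names and parameter ranges are assumptions.

```python
import random


def rotate90(img):
    """Rotate a grayscale image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def hflip(img):
    """Mirror the image horizontally."""
    return [row[::-1] for row in img]


def adjust_contrast(img, factor):
    """Scale pixel values around the image mean and clamp them to 0..255."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(255, max(0, round(mean + factor * (p - mean)))) for p in row]
        for row in img
    ]


def random_augment(img, rng):
    """Apply a random rotation, an optional flip and a random contrast change."""
    out = img
    for _ in range(rng.randrange(4)):        # 0 to 3 quarter turns
        out = rotate90(out)
    if rng.random() < 0.5:                   # flip half of the time
        out = hflip(out)
    return adjust_contrast(out, rng.uniform(0.8, 1.2))  # assumed contrast range


# Hypothetical 2x3 image; a seeded generator keeps the result reproducible.
sample = [[10, 200, 30], [40, 50, 60]]
augmented = random_augment(sample, random.Random(42))
```

Each call produces a new variant of the same fruit image, which is how a small set of real images can be expanded to a target size.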

In the case of the synthetic dataset, reﬁnement was

performed by manual selection of the images gener-

ated by DALL-E mini due to the variety of images

generated in each batch and their relevance to the

given descriptive text. Finally, the size of the images

was standardized to 256x256 pixels to reduce the size

of the dataset for training. Although the classiﬁcation

performed during the experiment is binary, i.e., the

images were classiﬁed according to the presence or

absence of defects in the fruits, the amounts of input

images were balanced for each type of defect. Table

2 shows the amount of images for each fruit and cate-

gory in the dataset.

3.3 CNN Model Training

For training, the input images were organized in di-

rectories by fruit (apple and mango), each with 10,000

images and a partition of 80% for training, 10% for

testing (only real images), and 10% for validation.
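A partition of this kind can be sketched as below. The text states only the 80/10/10 proportions and that the test set contains real images exclusively; how real and synthetic images are mixed between training and validation is an assumption of this sketch, as are the function name and the file names.

```python
import random


def split_dataset(real, synthetic, test_frac=0.10, val_frac=0.10, seed=0):
    """Split images into train/validation/test with a real-only test set."""
    rng = random.Random(seed)
    real = list(real)
    synthetic = list(synthetic)
    rng.shuffle(real)

    total = len(real) + len(synthetic)
    n_test = round(total * test_frac)
    n_val = round(total * val_frac)
    if n_test > len(real):
        raise ValueError("not enough real images for a real-only test set")

    test = real[:n_test]                     # test set: real images only
    remaining = real[n_test:] + synthetic    # assumption: mixed for train/val
    rng.shuffle(remaining)
    val = remaining[:n_val]
    train = remaining[n_val:]
    return train, val, test


# Hypothetical file names for one fruit: 5,000 real and 5,000 synthetic images.
real_images = [f"real_{i}.jpg" for i in range(5000)]
synthetic_images = [f"synthetic_{i}.jpg" for i in range(5000)]
train, val, test = split_dataset(real_images, synthetic_images)
```

For 10,000 images per fruit this yields 8,000 training, 1,000 validation and 1,000 test images.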

The models selected for this study were MobileNetV2, InceptionV3, VGG16 and DenseNet121. The Keras and TensorFlow libraries make it possible to load the models and apply modifications to them. First, the models were loaded with the weights obtained from training on the ImageNet dataset. In addition, the block of fully connected output layers was removed: it was designed for multiclass classification and therefore had to be replaced to fit the current binary classification problem.

For the binary classiﬁcation task it is necessary to

adjust the fully connected output layers of the mod-

els. For all models a dense layer with 1024 units

and ReLU activation function was used, followed by a

dropout layer, then another dense layer with 512 units

and ReLU activation function. Finally, a layer of 2

units and softmax activation function is used to obtain

the one hot encoded vector with the classiﬁcation. It

was decided to study the effect of the dropout layer on

the performance metrics obtained by the models, so

three types of tests were performed, one without us-

ing the dropout layer and the other two with dropout

rate of 0.2 and 0.5, in order to mitigate overﬁtting.
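The final step of this head, turning the two output units into a one-hot classification, can be illustrated with a small sketch. The logit values and the class order (index 0 for fresh, index 1 for defective) are assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert raw output-unit values into probabilities that sum to 1."""
    peak = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot_prediction(probs):
    """Return a one-hot vector marking the most probable class."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return [1 if i == best else 0 for i in range(len(probs))]


# Hypothetical raw outputs of the 2-unit layer for one image.
# Assumed class order: index 0 = fresh, index 1 = defective.
probs = softmax([2.0, 0.5])
label = one_hot_prediction(probs)
```

With two classes, the softmax output carries the same information as a single sigmoid unit; the two-unit form simply makes the one-hot encoding explicit.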

Table 2: Dataset distribution.

Fruit   Category      Real images   Synthetic images   Total per category   Total
Apple   Fresh         2500          2500               5000                 10000
        Rot           1000          1000               2000
        Bruise         750           750               1500
        Scab           750           750               1500
Mango   Fresh         2500          2500               5000                 10000
        Rot           1000          1000               2000
        Bruise         750           750               1500
        Black spots    750           750               1500

For each fruit, a total of 36 training runs were conducted, combining the four CNN models, the three dropout settings and the following hyper-parameters: optimizer (RMSprop), learning rate (0.001), batch size (16), and epochs (10, 20, 30).
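The count of 36 runs per fruit follows from 4 models × 3 dropout settings × 3 epoch settings. A minimal sketch of enumerating that grid is shown below; the dictionary keys and the function name are illustrative assumptions, and `None` stands for the configuration without a dropout layer.

```python
from itertools import product

MODELS = ["MobileNetV2", "InceptionV3", "VGG16", "DenseNet121"]
DROPOUT_RATES = [None, 0.2, 0.5]   # None = no dropout layer
EPOCHS = [10, 20, 30]
FIXED = {"optimizer": "RMSprop", "learning_rate": 0.001, "batch_size": 16}


def training_runs(fruit):
    """List every model/dropout/epoch combination for one fruit."""
    return [
        {"fruit": fruit, "model": model, "dropout": rate, "epochs": epochs, **FIXED}
        for model, rate, epochs in product(MODELS, DROPOUT_RATES, EPOCHS)
    ]


apple_runs = training_runs("apple")
mango_runs = training_runs("mango")
```

Each dictionary describes one training configuration, so both fruits together account for 72 runs.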

4 RESULTS

This section details the results of the CNN models

evaluation based on the previously described method-

ology. For each fruit, a total of 36 training runs were

performed with variations in the CNN models, the

number of epochs and dropout rates used.

4.1 Results with the Apple Image Dataset

Table 3 shows the results obtained for the dataset of

images of apples with and without defects for the four

selected models and the dropout rate and epoch vari-

ations. The best results for the test accuracy for each

model are highlighted.

Table 3: Comparison of CNN model accuracy with apple image dataset.

CNN Model      Accuracy (%)
InceptionV3    93.40
MobileNetV2    97.50
DenseNet121    94.50
VGG16          76.20

In the results obtained for these training runs, a trend can be found in terms of accuracy and the combination of epochs and dropout rate. The best test accuracy was achieved when training for 30 epochs with a dropout rate of 0.2.

Accuracy improves when training with more epochs and using a low dropout rate of 0.2. In contrast, not using a dropout layer or applying a dropout rate as high as 0.5 causes the accuracy to decrease; in the latter case, this is possibly due to the loss of information in the dense layers.

Figure 2: Defect detection accuracy for apple dataset.

As seen in Figure 2 the MobileNetV2 model

recorded the highest accuracy with the test dataset at

97.50%. The InceptionV3 and DenseNet121 mod-

els presented similar percentages in this metric with

93.40% and 94.50% respectively. The VGG16 model

presented the lowest accuracy with 76.20%.

Another metric considered in this analysis is the loss during each training iteration. This metric makes it possible to determine how effective the defect detection is during training. When training for 30 epochs, 3 of the 4 models (all except InceptionV3) managed to reduce this metric below 0.5.
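Loss and accuracy measure different things, which a small sketch can make concrete. The loss function used in the study is not stated; categorical cross-entropy, the usual choice for a softmax output, is assumed here, and the predicted probabilities are hypothetical.

```python
import math


def categorical_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean over samples of -sum(t * log(p)) for one-hot targets."""
    total = 0.0
    for targets, probs in zip(y_true, y_prob):
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(targets, probs))
    return total / len(y_true)


def accuracy(y_true, y_prob):
    """Fraction of samples whose most probable class matches the target."""
    hits = sum(t.index(1) == p.index(max(p)) for t, p in zip(y_true, y_prob))
    return hits / len(y_true)


# Hypothetical one-hot labels (assumed order: fresh, defective).
y_true = [[1, 0], [0, 1], [1, 0], [0, 1]]
# Two hypothetical models that classify every image correctly.
confident = [[0.90, 0.10], [0.10, 0.90], [0.95, 0.05], [0.20, 0.80]]
hesitant = [[0.55, 0.45], [0.45, 0.55], [0.60, 0.40], [0.40, 0.60]]

loss_confident = categorical_cross_entropy(y_true, confident)
loss_hesitant = categorical_cross_entropy(y_true, hesitant)
```

Both sets of predictions reach the same accuracy, but the hesitant ones produce a loss above 0.5 because the loss also penalizes low confidence in the correct class.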

High values for the loss metric indicate that there is a large difference between the values of the results and the expected values. In the case of the InceptionV3 and MobileNetV2 models, there is a possibility of overfitting, given the high values of the loss function alongside accuracy above 90%. This overfitting phenomenon usually occurs in dense models since the gradient step must occur throughout the neural network. Another cause of overfitting when using CNNs is the lack of regularization mechanisms that allow the network to learn without memorizing; an alternative to the dropout used in the training of this study is batch normalization, according to (Goutam et al., 2020). Overfitted models are able to recognize with a high degree of efficiency the images belonging to the dataset, but present errors recognizing new images. To mitigate this problem with these models, it is possible to perform training with fewer epochs and experiment with more variations in the dropout layer.

4.2 Results with the Mango Image Dataset

For the other case study fruit, mango, the results ob-

tained with the selected models resemble those ob-

tained with the apple image dataset. The training runs

were performed using the same hyper-parameters

mentioned before. Table 4 shows the performance of

the models for this dataset. The row corresponding to

the best result obtained in the test accuracy for each

model is highlighted.

Table 4: Comparison of CNN model accuracy with mango image dataset.

CNN Model      Accuracy (%)
InceptionV3    91.90
MobileNetV2    92.90
DenseNet121    92.50
VGG16          63.60

The best results for these training runs were found

with the conﬁguration of 30 epochs and dropout rate

of 0.2 for the MobileNetV2 model, with 92.90%

accuracy reached in testing. For the InceptionV3

and DenseNet121 models, the best performance in

the test dataset was obtained with the 20 and 30

epoch conﬁgurations respectively, both without us-

ing dropout layer resulting in accuracy values of

91.90% and 92.50% respectively. For this dataset

and fruit the VGG16 model underperformed reach-

ing only 63.60% accuracy during testing, as shown in

Figure 3.

Figure 3: Defect detection accuracy for mangoes dataset.

As the number of epochs increased, the resulting accuracy remained stable and did not vary abruptly, except with the VGG16 model. The architecture of this model is quite extensive, with more than one hundred million parameters in its original form, so it is possible that the accuracy would stabilize when training with more epochs. In the training with the mango image dataset, all models obtained values higher than 90% from early epochs, indicating a high recognition rate of the input images and a low presence of false positives and false negatives in the results.

For this training dataset, the loss function values

were lower than with the apple dataset. After the 30

epochs, all models managed to reduce this metric to

less than 0.50.

In the case of the InceptionV3 model, it is possible

that overﬁtting exists given the loss values above 0.50

obtained with 20 and 30 training epochs. For this spe-

ciﬁc model, better results can be obtained by training

fewer epochs or applying variations with the dropout

layer.

5 CONCLUSIONS

During the dataset creation stage it was possible to collect a considerable amount of images (20,000 in total: 10,000 images for apples and 10,000 images for mangoes). This dataset contains real and synthetic images: public databases were used for the real images and the text-to-image system DALL-E mini was used for the synthetic images. The resulting dataset comprises images of healthy and rotten fruits, in addition to images of defects based on the presented definitions of bruises, scabs and black spots.

Since the evaluation of CNN models for defect identification has already been studied, as an improvement we proposed the compilation of a more robust dataset, not only in quantity but also in quality. Also, the use of DALL-E mini made it possible to generate images for training purposes; these images contain only the fruits with the desired defects and are free of background elements and noise. The defects were represented with balanced numbers of images in the dataset. Therefore, the CNN models were able to identify the mangoes and apples with different types of defects.

The previously generated dataset was used to evaluate the performance of the 4 CNN models, whose purpose was to identify whether the fruits under study in this project (apple and mango) are defective or fresh. The defects considered are rot, bruises, black spots and scabs. The 4 models chosen for this study were MobileNetV2, InceptionV3, DenseNet121 and VGG16, with accuracies in the testing stage of (0.975, 0.929), (0.934, 0.919), (0.945, 0.925) and (0.762, 0.636) respectively; the values in parentheses correspond to "apple, mango" results. As a result of the training and evaluation of the models it was possible to determine that MobileNetV2 is the CNN model that best fits the need for binary classification for defect detection in apples and mangoes.

As future work, it is proposed to implement a mul-

ticlass classiﬁer CNN model to classify images by

each defect found. In addition, we will focus on the

implementation of more powerful architectures such

as transformers for the detection of defects in fruits.

ACKNOWLEDGEMENTS

This work has been partially supported by the

ESPOL-CIDIS-11-2022 project.

REFERENCES

Arango, J. D., Staar, B., Baig, A. M., and Freitag, M. (2021). Quality control of apples by means of convolutional neural networks - comparison of bruise detection by color images and near-infrared images. Procedia CIRP, 99:290–294.

Charco, J. L., Sappa, A. D., Vintimilla, B. X., and Velesaca, H. O. (2021). Camera pose estimation in multi-view environments: From virtual scenarios to the real world. Image and Vision Computing, 110:104182.

Crane, J. H. and Gazis, R. (2020). Bacterial black spot (BBS) of mango in Florida: HS1369, 9/2020. EDIS, 2020(5).

Enicks, D. A., Bomberger, R. A., and Amiri, A. (2020). Development of a portable LAMP assay for detection of Neofabraea perennans in commercial apple fruit. Plant Disease, 104(9):2346–2353.

Fu, Y., Nguyen, M., and Yan, W. Q. (2022). Grading methods for fruit freshness based on deep learning. SN Computer Science, 3:264.

Goutam, K., Balasubramanian, S., Gera, D., and Sarma, R. R. (2020). Layerout: Freezing layers in deep neural networks. SN Computer Science, 1(5):1–9.

Han, K., Wang, Y., Chen, H., Chen, X., Guo, J., Liu, Z., Tang, Y., Xiao, A., Xu, C., Xu, Y., et al. (2022). A survey on vision transformer. IEEE Transactions on Pattern Analysis and Machine Intelligence.

Karakaya, D., Ulucan, O., and Turkan, M. (2019). A comparative analysis on fruit freshness classification. 2019 Innovations in Intelligent Systems and Applications Conference (ASYU).

Khan, A. I., Quadri, S., and Banday, S. (2021). Deep learning for apple diseases: classification and identification. International Journal of Computational Intelligence Studies, 10(1):1–12.

Koetter, F., Blohm, M., Drawehn, J., Kochanowski, M., Goetzer, J., Graziotin, D., and Wagner, S. (2019). Conversational agents for insurance companies: from theory to practice. In International Conference on Agents and Artificial Intelligence, pages 338–362. Springer.

Kolker, A., Oshchepkova, S., Pershina, Z., Dimitrov, L., Ivanov, V., Rashid, A., and Bdiwi, M. The ray tracing based tool for generation artificial images and neural network training.

Miah, M. S., Tasnuva, T., Islam, M., Keya, M., Rahman,

M. R., and Hossain, S. A. (2021). An advanced

method of identiﬁcation fresh and rotten fruits using

different convolutional neural networks. 2021 12th

International Conference on Computing Communica-

tion and Networking Technologies (ICCCNT).

Munesue, Y., Masui, T., and Fushima, T. (2015). The

effects of reducing food losses and food waste on

global food insecurity, natural resources, and green-

house gas emissions. Environmental Economics and

Policy Studies, 17(1):43–77.

Nguyen, T.-V.-L., Nguyen, Q.-D., and Nguyen, P.-B.-D.

(2022). Drying kinetics and changes of total phenolic

content, antioxidant activity and color parameters of

mango and avocado pulp in refractance window dry-

ing. Polish Journal of Food and Nutrition Sciences,

72(1):27–38.

Nturambirwe, J. F. I. and Opara, U. L. (2020). Machine

learning applications to non-destructive defect detec-

tion in horticultural products. Biosystems Engineer-

ing, 189:60–83.

Omar, A. and MatJafri, M. (2013). Principles, method-

ologies and technologies of fresh fruit quality assur-

ance. Quality Assurance and Safety of Crops & Foods,

5:257–271.

Pathak, R. (2021). Classiﬁcation of fruits using convo-

lutional neural network and transfer learning mod-

els. Journal of Management Information and Deci-

sion Sciences 1 Journal of Management Information

and Decision Sciences, 24:1–12.

Plowman, J. (2016). 3D Game Design with Unreal Engine

4 and Blender. Packt Publishing Ltd.

Swords, D. S., Bednarski, B. K., Messick, C. A., Tillman,

M. M., Chang, G. J., and You, Y. N. (2022). Quality

and location of the surgical episode mediate a large

proportion of socioeconomic-based survival dispari-

ties in patients with resected stage i–iii colon cancer.

Annals of surgical oncology, 29(1):706–716.

V G, N. and Pinto, A. (2021). Defects Detection in

Fruits and Vegetables Using Image Processing and

Soft Computing Techniques, pages 325–337.

Wu, A., Zhu, J., and Ren, T. (2020). Detection of apple

defect using laser-induced light backscattering imag-

ing and convolutional neural network. Computers &

Electrical Engineering, 81:106454.

Zhang, B., Huang, W., Li, J., Zhao, C., Fan, S., Wu, J., and

Liu, C. (2014). Principles, developments and appli-

cations of computer vision for external quality inspec-

tion of fruits and vegetables: A review. Food Research

International, 62:326–343.

Zheng, B. and Huang, T. (2021). Mango grading sys-

tem based on optimized convolutional neural network.

Mathematical Problems in Engineering, 2021:1–11.
