Apple Classification Based on HOG, KNN and SVM

Jiahui Huang

College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, China

Keywords: Apple Classification, HOG, KNN, SVM.

Abstract: Amid the rapid development of deep learning, traditional machine learning retains the advantages of simplicity, ease of understanding, and strong interpretability in classification tasks. Apples are an important global agricultural product with considerable economic value and health benefits, but sorting them manually is time-consuming and labour-intensive. Automating the classification process therefore helps to improve economic efficiency. For an apple dataset with high inter-class similarity, this research adopts two classifiers, k-nearest neighbor (KNN) and support vector machine (SVM), and pairs each with two feature types, Histogram of Oriented Gradients (HOG) features and raw features, giving four models that are compared to find those suitable for apple classification. HOG features are found to perform less well on apple images of similar shape and size, whereas both SVM and KNN using raw features perform well on both the training and test sets. The proposed method is simple to implement, achieves high accuracy, and is suitable for extension to other fruit domains.

1 INTRODUCTION

Apples, as one of the most important agricultural products worldwide, bring substantial economic benefits. Selection and sorting after harvesting are an important part of the commercialization process. Sorting different varieties of apples manually is time-consuming and labor-intensive, and some varieties are very close to each other in shape, color and size, so sorting mistakes ultimately reduce efficiency. Automating the apple classification process and removing manual handling, by extracting features of different apples, can effectively increase speed and save money and time. Building a model with high accuracy on the apple classification task requires combining a feature extraction method with a classifier, so finding the best combination of features and classifier is of great importance.

Many apple classification methods have been proposed. Bhargava and Bansal segmented apple images with the grab-cut method and fuzzy c-means clustering, extracted multiple features, and used principal component analysis (PCA) to select among them. Classification was done with KNN, LR, SRC, and SVM classifiers, and cross-validation with different values of k was used to validate the performance of the system. The method obtained more than 95% accuracy with the SVM model when k equals 10 (Bhargava and Bansal 2021). Bhatt and Pant trained a back-propagation neural network to classify apples, using surface quality parameters as the independent variables and apple quality as the dependent variable; the experimental results agreed well with the true values, with shorter computation time and higher accuracy (Bhatt and Pant 2015). Misigo investigated the applicability and performance of the Naive Bayes algorithm in classifying apple fruit varieties and compared the Naive Bayes technique with principal components, fuzzy logic, and MLP neural network approaches (Misigo 2016). These methods either use relatively expensive instruments, such as X-ray scanners, near-infrared spectrometers, and industrial cameras, or are complex to operate, require specialized hardware modules, and involve cumbersome procedures, which makes them hard to promote in the market.

The combination of HOG feature extraction with machine learning models has been widely studied. Guo et al. addressed the poor recognition ability of traditional algorithms for small fruit targets in natural environments by classifying apples with improved HOG and SVM, adding a Focus plus CSP cascade module for deep feature extraction (Guo et al 2022). The combination of HOG and KNN is also widely used. Putra et al. proposed a vision-based vehicle detection system with HOG feature detection and a KNN classifier (Putra et al 2020). Dino and Abdulrazzaq applied PCA-reduced HOG features and used three classifiers, SVM, KNN and a Multilayer Perceptron Neural Network (MLPNN), to recognize facial expressions (Dino and Abdulrazzaq 2019).

Huang, J.
Apple Classification Based on HOG, KNN and SVM.
DOI: 10.5220/0012814800003885
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Data Analysis and Machine Learning (DAML 2023), pages 451-456
ISBN: 978-989-758-705-4

In his 2019 work, Liu concluded from experiments that, when classifying small samples of data, traditional machine learning is simpler and more efficient than neural networks for image classification and recognition. Building on that work and related research by other scholars, this research further explores the performance of traditional machine learning on apple classification.

2 METHODS

This research consists of three main parts. The first

part is to preprocess the fruit dataset by selecting all

the apple image classes to form the apple dataset. The

second part is to extract HOG features and original

features for each image for subsequent comparison.

The last part is to classify the images based on the

features using two classification tools KNN and SVM.

2.1 Data Visualization

The research is based on the Fruits-360 dataset. At the time of writing, the set contains 61934 images of 90 kinds of fruits taken from hundreds of shooting angles, and each image is 100×100 pixels. The advantage of this dataset is that its images show the object without a noisy background, which avoids the drop in classification accuracy that can occur when the background environment of the images changes.

The research selected all the apple varieties to form

the apple dataset, and all the apple classes in it are

shown in Fig. 1.

After selection, the apple dataset has a total of 8538 images, of which 6404 are training images and 2134 are test images, a training-to-test ratio of 3:1. The number of training and test images for each apple class is shown in Fig. 2.

As shown in Fig. 2, some apple classes are duplicated. For example, Apple Red 1 and Apple Red 2 should be in the same class. After merging all identical classes, the final dataset is shown in Fig. 3.

2.2 HOG Feature Extraction

The HOG feature detection algorithm, an image descriptor originally designed for human detection, was first proposed at CVPR 2005 by Dalal and Triggs (Dalal and Triggs 2005). The method is widely used in computer vision and image processing, mainly for detecting and recognizing objects in an image by extracting feature descriptors. HOG describes an image with histograms of locally oriented gradients, which count the occurrences of specific gradient directions in local parts of the image. The steps of the HOG feature extraction algorithm are shown in Fig. 4.
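The core idea of HOG, magnitude-weighted histograms of gradient orientation computed per local cell, can be sketched in a few lines of NumPy. This is a simplified illustration rather than the implementation used in this research: the function names, the 8-pixel cell size and the 9 orientation bins are assumptions, and block normalization and bin interpolation from the original HOG method are replaced by a per-cell L2 normalization.

```python
import numpy as np

def hog_cell_histograms(image, cell_size=8, n_bins=9):
    """Unsigned gradient-orientation histograms per cell (simplified HOG)."""
    image = np.asarray(image, dtype=np.float64)
    # Gradients along rows (gy) and columns (gx).
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees.
    orientation = np.rad2deg(np.arctan2(gy, gx)) % 180.0

    n_rows = image.shape[0] // cell_size
    n_cols = image.shape[1] // cell_size
    bin_width = 180.0 / n_bins
    hist = np.zeros((n_rows, n_cols, n_bins))
    for r in range(n_rows):
        for c in range(n_cols):
            rows = slice(r * cell_size, (r + 1) * cell_size)
            cols = slice(c * cell_size, (c + 1) * cell_size)
            bins = (orientation[rows, cols] // bin_width).astype(int) % n_bins
            # Each pixel votes for its orientation bin, weighted by magnitude.
            np.add.at(hist[r, c], bins.ravel(), magnitude[rows, cols].ravel())
    return hist

def hog_descriptor(image, cell_size=8, n_bins=9, eps=1e-6):
    """Flattened feature vector built from L2-normalized cell histograms."""
    hist = hog_cell_histograms(image, cell_size, n_bins)
    # Assumption: per-cell normalization stands in for block normalization.
    norms = np.sqrt((hist ** 2).sum(axis=2, keepdims=True) + eps ** 2)
    return (hist / norms).ravel()

# Synthetic 100x100 horizontal intensity ramp standing in for a grayscale image.
image = np.tile(np.arange(100, dtype=float), (100, 1))
features = hog_descriptor(image)
print(features.shape)
```

For a 100×100 image with these assumed settings, the descriptor has 12 × 12 cells × 9 bins entries, far fewer than the 10000 raw pixel values per channel, and it discards colour information.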

2.3 Machine Learning Classifiers

The apple classification task is a supervised learning problem. Because there are multiple apple labels, it is a multi-class problem, for which KNN, SVM and NB are the three most commonly used methods (Binkhonain and Zhao 2019). In this research, the KNN and SVM models are used to classify apples using either HOG features or raw features.

Figure 1: All the apple classes (Photo credit: Original).

Figure 2: Selected apple dataset (Photo credit: Original).


Figure 3: Classes of apples after merging (Photo credit:

Original).

Figure 4: HOG feature extraction process (Photo credit:

Original).

2.3.1 K-Nearest Neighbor

This work uses KNN to classify different kinds of apples. Unlike complex deep learning models, KNN is easy to understand and not time-consuming. For classification problems, the main idea of the KNN algorithm is to decide the class of a sample point based on the classes of the k nearest neighbors around it. For the feature vector of a test image (for example, its HOG feature vector), the distance to the feature vector of each sample in the training set is calculated. Different distance metrics can be used, most commonly the Euclidean distance in Eq. (1), where d is the Euclidean distance, x is the feature vector of the test image, y is the feature vector of a training image, and n is the dimension of the feature vectors.

$$d = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (1)$$

The value of k determines how many neighbors vote on the category of a test point under the Euclidean distance. Different values of k can lead to markedly different results. Smaller values of k give a more complex model that is prone to overfitting, because predictions become very sensitive to the nearest instance points. Larger values of k make the model too general to predict the test points accurately, which is known as underfitting. A common way to find the best k is K-fold cross-validation: the dataset is divided into K groups (folds) of equal size, one fold is kept for validation and the other K-1 folds are used for training. The process is repeated K times, each time with a different fold used for validation. Here K, the number of folds, is distinct from k, the number of neighbors.
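The procedure can be sketched as follows: a brute-force KNN classifier using the Euclidean distance, and a K-fold cross-validation loop that scores several candidate values of k. This is a minimal sketch under stated assumptions, not the code used in this research: the function names, the five folds and the synthetic two-cluster data (standing in for image feature vectors) are all illustrative.

```python
import numpy as np

def knn_predict(train_x, train_y, test_x, k):
    """Classify each test vector by majority vote among its k nearest neighbors."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    diffs = test_x[:, None, :] - train_x[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    nearest = np.argsort(dists, axis=1)[:, :k]
    neighbor_labels = train_y[nearest]
    # Majority vote; ties are resolved in favour of the smallest label.
    return np.array([np.bincount(row).argmax() for row in neighbor_labels])

def cross_val_accuracy(x, y, k_neighbors, n_folds=5, seed=0):
    """Mean validation accuracy of KNN over n_folds shuffled folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    scores = []
    for i, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = knn_predict(x[train_idx], y[train_idx], x[val_idx], k_neighbors)
        scores.append((pred == y[val_idx]).mean())
    return float(np.mean(scores))

# Synthetic stand-in for feature vectors: two well-separated classes.
rng = np.random.default_rng(42)
x = np.vstack([rng.normal(0.0, 1.0, (60, 4)), rng.normal(5.0, 1.0, (60, 4))])
y = np.array([0] * 60 + [1] * 60)

scores = {k: cross_val_accuracy(x, y, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The brute-force distance matrix is fine for a sketch but grows with the product of training size, test size and feature dimension, so a library implementation would normally be preferred for full-size image features.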

2.3.2 Support Vector Machine

SVM is a popular machine learning method owing to its strong learning ability and good results. Developed by Vapnik (Vapnik 2013), the model seeks the optimal balance between learning capacity and complexity on small and medium-sized datasets, which gives it a strong ability to generalize. The main idea is to find a hyperplane that correctly separates data points from different categories (Çakir et al 2023). In high-dimensional space, the hyperplane is expressed by Eq. (2), where w is the weight vector, x is the input feature vector, and b is the bias term that offsets the hyperplane from the origin.

$$w^T x + b = 0 \qquad (2)$$

The w and b need to fulfil the following inequalities, seen in Eq. (3):

$$\begin{cases} w^T x_i + b \ge 1, & \text{if } y_i = +1 \\ w^T x_i + b \le -1, & \text{if } y_i = -1 \end{cases} \qquad (3)$$

Then, the distance from any point (x1, x2) in space

to the target hyperplane can be expressed according to

Eq. (4):

$$r = \frac{|w^T x + b|}{\|w\|} \qquad (4)$$

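If the functional margin $|w^T x + b|$ of the points closest to the hyperplane is set to 1, the margin N between the two classes is given by Eq. (5):

$$N = \frac{2}{\|w\|} \qquad (5)$$

Therefore, the margin maximization objective is given by Eq. (6):

$$f = \max_{w, b} \frac{2}{\|w\|} \qquad (6)$$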
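As a small numerical check of the point-to-hyperplane distance |w^T x + b| / ||w|| and the standard maximum-margin width 2 / ||w||, the sketch below uses an illustrative two-dimensional w and b. These values and the function name are assumptions chosen for easy arithmetic; they are not taken from the models trained in this research.

```python
import numpy as np

def distance_to_hyperplane(w, b, x):
    """Distance from each row of x to the hyperplane w^T x + b = 0."""
    return np.abs(x @ w + b) / np.linalg.norm(w)

# Illustrative hyperplane with ||w|| = 5 (assumed values).
w = np.array([3.0, 4.0])
b = -5.0

# Two points lying exactly on the margin boundaries:
# w^T x + b = +1 for the first and -1 for the second.
support_points = np.array([[2.0, 0.0],
                           [0.0, 1.0]])

distances = distance_to_hyperplane(w, b, support_points)
margin = 2.0 / np.linalg.norm(w)
print(distances, margin)
```

Each boundary point sits at distance 1 / ||w|| from the hyperplane, so the two distances add up to the margin width; making ||w|| smaller widens the margin, which is why maximizing 2 / ||w|| is the training objective.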

2.4 Evaluation Criteria

To assess the performance of the model, four

evaluation metrics are used in this research: Accuracy,

Recall, Precision and F1-score. The assessment

metrics are determined by the True Positive (TP), True

Negative (TN), False Positive (FP), and False

Negative (FN) values, as shown in Table 1.


Table 1: Confusion Matrix of classification.

                      Actual positive   Actual negative
Predicted positive    TP                FP
Predicted negative    FN                TN

Since this research is a multi-category problem,

Macro Average rule is used for calculating Recall,

Precision and F1-score. By using the confusion

matrix, they are calculated separately for each

category and then averaged (Zheng 2022).

2.4.1 Accuracy

Accuracy is the proportion of correctly categorized samples to total samples. It is equivalent to multiplying the accuracy of each category by the proportion of that category in the total sample and then summing. The formula is given in Eq. (7), where n is the number of classes and TP_i and FN_i are counted for class i:

$$Accuracy = \frac{\sum_{i=1}^{n} TP_i}{\sum_{i=1}^{n} (TP_i + FN_i)} \qquad (7)$$

2.4.2 Recall

Recall is the proportion of actually positive samples that are correctly predicted as positive, macro-averaged over the n classes as shown in Eq. (8):

$$Recall = \frac{1}{n} \sum_{i=1}^{n} \frac{TP_i}{TP_i + FN_i} \qquad (8)$$

2.4.3 Precision

Precision is the proportion of samples predicted as positive that are actually positive, macro-averaged over the n classes as shown in Eq. (9):

$$Precision = \frac{1}{n} \sum_{i=1}^{n} \frac{TP_i}{TP_i + FP_i} \qquad (9)$$

2.4.4 F1-Score

As defined in Eq. (10), the F1 score is the harmonic mean of precision and recall, with a maximum value of 1 and a minimum value of 0. Under macro averaging it is computed for each of the n classes and then averaged:

$$F1 = \frac{1}{n} \sum_{i=1}^{n} \frac{2 \cdot Precision_i \cdot Recall_i}{Precision_i + Recall_i} \qquad (10)$$
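The macro-averaged metrics can be computed directly from a multi-class confusion matrix, as in the sketch below. The function name and the 3-class matrix are illustrative assumptions, not results from this research, and the matrix here uses rows for the true class and columns for the predicted class.

```python
import numpy as np

def macro_metrics(confusion):
    """Accuracy and macro-averaged precision, recall and F1.

    confusion[i, j] = number of samples of true class i predicted as class j.
    Assumes every class has at least one true and one predicted sample,
    so no denominator is zero.
    """
    confusion = np.asarray(confusion, dtype=np.float64)
    tp = np.diag(confusion)
    fp = confusion.sum(axis=0) - tp  # predicted as class i, but another class
    fn = confusion.sum(axis=1) - tp  # class i, but predicted as another class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": tp.sum() / confusion.sum(),
        "precision": precision.mean(),
        "recall": recall.mean(),
        "f1": f1.mean(),
    }

# Made-up confusion matrix for three classes with 10 samples each.
confusion = np.array([[8, 2, 0],
                      [1, 9, 0],
                      [0, 0, 10]])
metrics = macro_metrics(confusion)
print(metrics)
```

Because macro averaging gives every class the same weight, a single difficult class lowers the averaged precision, recall and F1 even when overall accuracy stays high.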

3 RESULT AND DISCUSSION

3.1 HOG Feature Extraction Result

In accordance with the HOG feature extraction

process in Fig. 4, HOG features are extracted from all

the images, and Fig. 5 is an example of a comparison

image before and after extraction.

3.2 Predict Result

For the KNN classification model, k is varied from 1 to 20, and the score for each value of k with raw features and with HOG features is shown in Fig. 6. The two plots show that the model scores highest when k equals 1 for both raw features and HOG features. Because k = 1 is the setting most prone to overfitting, k = 3 is chosen to build the KNN model as a compromise between accuracy and overfitting.

Figure 5: Comparison of images before and after HOG feature extraction (Photo credit: Original).


Figure 6: The score of KNN for different k. (a) KNN with

raw features (b) KNN with HOG features (Photo credit:

Original).


3.3 Evaluation

To evaluate the performance of each classification model, the research compares the predicted results with the actual labels of the test set. The confusion matrices are shown in Fig. 7; KNN and SVM both predict the test set better when trained with raw features than with HOG features. For all models, Apple Red Delicious is correctly classified, while Apple Braeburn has the worst classification result, with a relatively high probability of being classified as Apple Red or Apple Red Yellow. According to Fig. 7, Apple Braeburn is the biggest challenge for these models, with the lowest F1-score, precision and recall, even for the SVM model with raw features, which has the fewest misclassifications. The evaluation scores of each model are shown in Fig. 8. SVM with raw features performs best on the apple classification task, with the highest Precision, Recall, F1-score and Accuracy, followed by KNN with raw features. KNN and SVM with HOG features both perform less well.


Figure 7: The confusion matrix of model. (a) SVM with

HOG features, (b) SVM with raw features, (c) KNN with

HOG features, (d) KNN with raw features. (Photo credit:

Original).

Figure 8: Evaluation score of each model (Photo credit:

Original).

In this research, a total of four models are constructed by combining HOG features and raw features with KNN and SVM. Each model is trained in a short time and achieves more than 90% accuracy. The results show that the value of k has a significant effect on the prediction outcome. Based on the cross-validation result for each k and the risk of overfitting, k = 3 is chosen to train the KNN.

Since some of the different apple varieties are similar in shape and size, the models using HOG features do not perform as well as the models that use the raw features directly. SVM with raw features performs best on the apple classification task, exceeding 97 per cent on all evaluation indicators.

This research uses KNN and SVM separately. In future work, further improvement in classification accuracy can be attempted with a hybrid of KNN and SVM. Such a model was proposed by Zhang et al. and has reasonable computational complexity in training and excellent results in practice (Zhang et al 2006). Given the high similarity of apple images and the weaker performance of HOG features, another idea to improve the model is to use the wavelet transform to extract image texture features, following the research of Jiang et al. on recognizing and classifying high-similarity images in a specific domain (Jiang et al 2018).

It should be noted that this research is based on a

small sample dataset, with many images of an apple

species coming from a rotational shot of a single

apple. Not enough training samples may lead to

overfitting of the model. In addition, the classification

of apple species in this research is not so rigorous, and

there may be cases where different classes are treated

as the same class or the same class is treated as

different classes.

4 CONCLUSION

In this research, SVM with raw features achieved the best results in both training and testing. Surprisingly, the KNN model also performed well on this task. The study found that HOG features work less well than raw features on high-similarity datasets such as the apple image set. This is probably because the individual apple types are similar in shape and size, differing mainly in color. This finding could help other researchers avoid HOG when classifying images with a high degree of similarity in shape and size. Furthermore, the success of SVM with raw features in apple classification provides value for the application of traditional machine learning in agriculture.

REFERENCES

A. Bhargava, and A. Bansal, “Classification and grading of

multiple varieties of apple fruit,” Food. Anal. Method,

vol. 14, pp. 1359-1368, 2021.

A. K. Bhatt, and D. Pant, “Automatic apple grading model

development based on back propagation neural network

and machine vision, and its performance

evaluation,” Ai & Society, vol. 30, no. 1, pp. 45-56,

2015.

R. Misigo, “Classification of selected apple fruit varieties

using Naive Bayes,” Univ. Nairobi, vol. 3, no. 1, pp. 56,

2016.

X. Guo, Q. Hao and F. Yang, “Apple multi-target detection

method based on improved HOG and SVM,” Foreign

Electronic Measurement Technology, vol. 11, pp. 154-

159, 2022.

F. A. I. A. Putra, F. Utaminingrum, and W. F. Mahmudy,

“HOG feature extraction and KNN classification for

detecting vehicle in the highway,” IJCCS (Indonesian

Journal of Computing and Cybernetics Systems), vol.

14, no. 3, pp. 231-242, 2020.

H. I. Dino and M. B. Abdulrazzaq, “Facial expression
classification based on SVM, KNN and MLP
classifiers,” in 2019 International Conference on
Advanced Science and Engineering (ICOASE), April
2019, pp. 70-75.

L. H. Ci, “Comparative analysis of image classification

algorithms based on traditional machine learning and

deep learning,” Computer and Information Technology,

vol. 05, pp. 12-15, 2019.

N. Dalal and B. Triggs, “Histograms of oriented gradients

for human detection,” in Proc. IEEE Conf. Comput.

Vis. Pattern Recognit., vol. 1, 2005, pp. 886–893.

M. Binkhonain and L. Zhao, “A review of machine learning

algorithms for identification and classification of non-

functional requirements,” Expert Syst. With Appl.: X,

vol. 1, 2019, Art. no. 100001.

V. Vapnik, The Nature of Statistical Learning Theory. New

York, NY, USA: Springer-Verlag, 2013.

M. Çakir, M. Yilmaz, M. A. Oral, H. O. Kazanci, and O.

Oral, “Accuracy assessment of RFerns, NB, SVM, and

kNN machine learning classifiers in

aquaculture,” Journal of King Saud University-Science,

vol. 35, no. 5, Aug. 2023.

X. Y. Zheng, “Research on assisted diagnosis model for

Alzheimer's disease based on multi-classifier fusion,”

PhD dissertation, Harbin Institute of Technology,

Harbin, Heilongjiang, China, 2022.

H. Zhang, A. Berg, M. Maire, and J. Malik, “SVM-KNN:

Discriminative nearest neighbor classification for

visual category recognition,” Proc. IEEE Conf.

Computer Vision and Pattern Recognition, 2006.

W. C. Jiang, H. B. Liu, Y. J. Yang, J. F. Chen, and A. B.

Sun, “A high similarity image recognition and

classification algorithm combining wavelet transform

and convolutional neural network,” Comp. Eng. Sci,

vol. 09, pp. 1646-1652, 2018.
