This research adopts a transfer learning strategy. A new model was constructed by reusing the weights of the convolutional layers of VGG16 pre-trained on the ImageNet dataset for feature extraction, and by customizing the classification layers to fit the cat breed classification task. The architecture is shown in Figure 1 (Zhuang, 2020).
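To make the setup concrete, a minimal sketch of this transfer-learning construction is given below, assuming a Keras/TensorFlow implementation; the size of the dense layer in the head is an illustrative assumption, not the authors' exact configuration.

```python
# Minimal sketch of the transfer-learning setup (assumed Keras/TensorFlow);
# the 256-unit head is an illustrative choice, not the paper's exact one.
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

# Convolutional base of VGG16 pre-trained on ImageNet, without the FC head.
base = VGG16(weights="imagenet", include_top=False,
             input_shape=(224, 224, 3))
base.trainable = False  # freeze all convolutional layers initially

# Custom classification head adapted to the 5 cat breeds.
model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(5, activation="softmax"),
])
```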
Unfreezing different convolutional layers can significantly affect the model's performance on new tasks, since different convolutional layers extract image features at varying levels of complexity and abstraction. This systematic approach to unfreezing during training offers deeper insight into difficult problems such as fine-grained classification, helping to identify more effective transfer learning techniques and practical guidelines. The training methodology itself also affects the outcome. For instance, distinct approaches, such as unfreezing the Fully Connected (FC) and convolutional layers at once versus unfreezing the FC and convolutional layers in stages, may produce different results. By refining the model structure and increasing sensitivity to fine-grained characteristics, this work may advance the state of the art in image classification and contribute to more accurate and effective classification technology.
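For illustration, one way to unfreeze the deepest n convolutional layers of the base sketched above is shown below; `unfreeze_last_n` is a hypothetical helper written for this sketch, not a function from the paper.

```python
from tensorflow.keras import layers

def unfreeze_last_n(base, n):
    # Make only the deepest n Conv2D layers of the base trainable;
    # VGG16 without its top has 13 Conv2D layers in total.
    base.trainable = True
    for layer in base.layers:
        layer.trainable = False
    conv_layers = [l for l in base.layers if isinstance(l, layers.Conv2D)]
    if n > 0:
        for layer in conv_layers[-n:]:
            layer.trainable = True
```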
2.3 Evaluation Metrics
This study uses the confusion matrix, F1-score, accuracy, recall, and precision, together with the ROC curve, as validation indices for measuring the performance of the model.
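As an illustration, these indices can be computed with scikit-learn as sketched below; the labels and probabilities are toy placeholders, and macro averaging is an assumption for the five-class setting.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, roc_auc_score)

# Toy placeholders: true labels and predicted class probabilities for 5 breeds.
y_true = np.array([0, 1, 2, 3, 4, 2, 1, 0])
rng = np.random.default_rng(0)
y_score = rng.random((8, 5))
y_score /= y_score.sum(axis=1, keepdims=True)

y_pred = np.argmax(y_score, axis=1)          # predicted class indices
acc = accuracy_score(y_true, y_pred)
prec = precision_score(y_true, y_pred, average="macro", zero_division=0)
rec = recall_score(y_true, y_pred, average="macro", zero_division=0)
f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
cm = confusion_matrix(y_true, y_pred)
# Multi-class ROC AUC, one-vs-rest, from the predicted probabilities.
auc = roc_auc_score(y_true, y_score, multi_class="ovr")
```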
3 EXPERIMENTS AND RESULTS
3.1 Training Details
Several hyperparameters were selected for the experiment. First, the batch size is set to 32, meaning that each training batch contains 32 samples; this balances training speed against memory usage. Next, the target size is set to (224, 224), so each input image is resized to 224 x 224 pixels to satisfy the neural network's requirement for a fixed-size input. Furthermore, the class mode is set to 'categorical', so the labels are returned as one-hot encoded arrays suitable for multi-class classification: each category is represented by a binary vector in which exactly one element is 1 and the remaining elements are 0, indicating which category the sample belongs to. To help the model converge to a better solution, this work uses Adam as the optimizer, categorical cross-entropy as the loss function, and accuracy as the metric. In addition, early stopping, a callback function, is used to halt training early and prevent overfitting. Training was performed on Colab with a GPU, and the learning rate and epoch count were adjusted repeatedly to improve the model's convergence speed and accuracy.
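A hedged sketch of this training configuration is given below, assuming the Keras ImageDataGenerator pipeline; the directory paths, learning rate, patience, and epoch count are placeholders, and `model` is the network from the earlier sketch.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam

# Generators resize images to 224 x 224 and return one-hot labels.
datagen = ImageDataGenerator(rescale=1.0 / 255)
train_gen = datagen.flow_from_directory(
    "data/train",              # placeholder path
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical")
val_gen = datagen.flow_from_directory(
    "data/val",                # placeholder path
    target_size=(224, 224),
    batch_size=32,
    class_mode="categorical")

model.compile(optimizer=Adam(learning_rate=1e-4),   # illustrative rate
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# Early stopping halts training when validation loss stops improving.
early_stop = EarlyStopping(monitor="val_loss", patience=3,  # illustrative
                           restore_best_weights=True)
model.fit(train_gen, validation_data=val_gen,
          epochs=30, callbacks=[early_stop])
```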
3.2 Quantitative Performance
There are 499 photos are chosen for each of the five
cat breeds, for a total of 2495 photos, in order to
assess the model's performance. The impact of
varying the number of convolutional layers from deep
to shallow unfreezing on the experimental outcomes
is examined in Table 1 and Table 2.
Table 1 examines the effect of varying the number of unfrozen layers, from 0 to 13, on model performance. Table 2 compares two layer-unfreezing methods: sequential unfreezing (SEQ) and simultaneous unfreezing (SIM). The SEQ strategy unfreezes the fully connected (FC) layers first and then unfreezes the last six convolutional layers, whereas the SIM strategy unfreezes the six convolutional layers and the FC layers at once. Several performance metrics were employed to assess the results of these two experiments.
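The two schedules can be sketched as follows, reusing the `model`, `base`, data generators, and `unfreeze_last_n` helper from the earlier sketches; the epoch counts and learning rates are illustrative assumptions, not the paper's reported settings.

```python
from tensorflow.keras.optimizers import Adam

# SIM: unfreeze the last 6 convolutional layers together with the FC head
# (the head layers outside `base` are always trainable) in a single phase.
unfreeze_last_n(base, 6)
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_gen, validation_data=val_gen, epochs=20)

# SEQ, phase 1: train only the FC head on the fully frozen base.
unfreeze_last_n(base, 0)
model.compile(optimizer=Adam(learning_rate=1e-4),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_gen, validation_data=val_gen, epochs=10)

# SEQ, phase 2: additionally unfreeze the last 6 convolutional layers.
unfreeze_last_n(base, 6)
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(train_gen, validation_data=val_gen, epochs=10)
```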
This study further validated the performance of the best trained model, with 6 unfrozen convolutional layers, using the confusion matrix; Figure 2 and Figure 3 further evaluate and visualize the simultaneous unfreezing (SIM) and sequential unfreezing (SEQ) results reported in Table 2.
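A confusion matrix like those in Figure 2 and Figure 3 can be rendered as sketched below; the breed names are placeholders, and `y_true`/`y_pred` are the hypothetical labels and predictions from the metrics sketch above.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

# Placeholder names for the five cat breeds evaluated in the paper.
breeds = ["breed_1", "breed_2", "breed_3", "breed_4", "breed_5"]
ConfusionMatrixDisplay.from_predictions(y_true, y_pred,
                                        display_labels=breeds)
plt.show()
```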
Table 1: Performance using different numbers of unfrozen layers.
ACC recall