Explainability-Driven Leaf Disease Classification Using Adversarial Training and Knowledge Distillation

Sebastian-Vasile Echim (1,*), Iulian-Marius Tăiatu (1,*), Dumitru-Clementin Cercel (1,a) and Florin Pop (1,2,3)

1 Computer Science and Engineering Department, National University of Science and Technology POLITEHNICA Bucharest, Romania
2 National Institute for Research and Development in Informatics, ICI Bucharest, Romania
3 Academy of Romanian Scientists, Bucharest, Romania

* Equal contributions.
a Corresponding author.
Keywords:
Adversarial Attacks, Adversarial Defense, Explainable AI, Knowledge Distillation.
Abstract:
This work focuses on plant leaf disease classification and explores three crucial aspects: adversarial training,
model explainability, and model compression. The models’ robustness against adversarial attacks is enhanced
through adversarial training, ensuring accurate classification even in the presence of threats. Leveraging
explainability techniques, we gain insights into the model’s decision-making process, improving trust and
transparency. Additionally, we explore model compression techniques to optimize computational efficiency
while maintaining classification performance. Through our experiments on a benchmark dataset, we determine that robustness can come at the price of classification accuracy, with performance reductions of 3%-20% on regular test data and gains of 50%-70% under adversarial attacks. We also demonstrate that, by distilling the knowledge of more complex models, a student model can be 15-25 times more computationally efficient at the cost of only a slight performance reduction.
1 INTRODUCTION
Artificial intelligence (AI) has provided the theoreti-
cal knowledge to develop real-world applications for
decades. The amalgamation of enhanced computa-
tional capacities, refined learning algorithms, and in-
creased data accessibility has propelled significant ad-
vances in machine learning, resulting in its adoption across many domains (Linardatos et al., 2021). Furthermore,
the advancement of deep learning techniques, demon-
strated by their ongoing refinement, has reinforced
their pivotal role in the AI landscape (He et al., 2016;
Khan et al., 2022).
Nevertheless, although deep neural networks (DNNs) can attain state-of-the-art performance for various tasks in domains such as legal (Smădu et al., 2022), energy (Năstăsescu and Cercel, 2022), and medicine (Lungu-Stan et al., 2023), it has been demonstrated that these networks are prone to classification errors for data with imperceptible noise (Szegedy et al., 2014; Goodfellow et al., 2015). Thus, the network sensitivity to perturbations is explored using adversarial algorithms (Goodfellow et al., 2015; Kurakin et al., 2017). They are used in network attacks, robustness improvements, or data augmentations to achieve better performance with adversarial training (Miyato et al., 2016).
High prediction accuracy is frequently attained by elevating DNN architecture complexity (Linardatos et al., 2021; Meske et al., 2022). How-
ever, boosting a model’s predictive power reduces its
ability to explain its inner workings and mechanisms.
Consequently, the rationale behind their decisions be-
comes quite challenging to understand, and, therefore, their predictions are difficult to interpret. Considerable
emphasis has been placed on elucidating the explain-
ability of deep learning, particularly in image analysis,
specifically focusing on the concept of saliency maps
(Ramirez-Moreno et al., 2013; Simonyan et al., 2014).
We conduct a series of experiments on a publicly
available dataset (Geetharamani and Pandian, 2019) to
better understand DNNs and their performance in the
context of adversarial attacks and defenses (Szegedy
et al., 2014), explainability (Simonyan et al., 2014),
and knowledge distillation (Buciluǎ et al., 2006). We
bring in this work the following main contributions:
- We demonstrate the performance impact on DNNs when extra training data is generated with adversarial algorithms for achieving better robustness.
- We indicate the effectiveness of adversarial attacks on models not trained with adversarial data and
the defense capability when adversarial training is also performed.
- We show that different shapes and orientations of a plant leaf to be classified can determine different results when using adversarial attacks.
- We offer a closer look into the attentional patterns, demonstrating that models can have significant focus shifts away from the main classification points of interest when attacked.
- We find a good trade-off solution between accuracy and efficiency using knowledge distillation, obtaining only 13-15% classification accuracy reduction for models 2-14 times smaller and 15-25 times more efficient.
The paper is structured in six sections, starting with
the introduction and the related work. The third sec-
tion describes the dataset characteristics. The fourth
section covers the methods used in our experiments,
indicating the network architectures employed, the ad-
versarial algorithms applied, the performance metrics
used to analyze the results, and the main research ques-
tions on which we based our experiments. Finally, the
work covers the result analysis in Section 5 and closes
with a conclusion section.
2 RELATED WORK
Geetharamani and Pandian (Geetharamani and Pan-
dian, 2019) proposed a leaf disease identification
model based on DNNs. The authors train their model
using a dataset with 39 different classes of plant leaves.
This predictive model achieved impressive results,
boasting a 96.46% accuracy in classification. This per-
formance surpasses classical machine learning base-
lines. They anticipate even better results with a more
complex model incorporating adversarial training. In
this experimental area, You et al. (You et al., 2023)
evaluated a set of adversarial algorithms to attack neu-
ral networks such as ResNet50 (He et al., 2016) and
VGG16. Moreover, using the adversarial examples as
data augmentation, the authors achieved performance
improvements.
Recently, knowledge distillation has become a sig-
nificant strategy in plant leaf disease recognition, fa-
cilitating the creation of compact and practical mod-
els. For example, Ghofrani et al. (Ghofrani and Mah-
dian Toroghi, 2022) introduced a client-server system
based on the knowledge distillation technique to en-
hance the client’s accuracy. In their approach, a robust
deep convolutional neural network (CNN) (Kim, 2014)
architecture was responsible for classifying diseases
on the server side. In contrast, on the client side, a
lightweight deep CNN architecture facilitated classifi-
cation on mobile devices.
Musa et al. (Musa et al., 2022) explored effi-
cient mobile solutions for AI in agriculture, imple-
menting a low-power DNN for plant disease diagnosis
using knowledge distillation. Their teacher and stu-
dent models are based on the CNN architecture, fea-
turing around 52M parameters for the teacher model
and around 1.2M for the student. The system was
deployed on a Raspberry Pi device, achieving a 90%
power reduction, with a maximum of 6.22W, and a top plant disease detection performance of 99.4% classification accuracy. Likewise, Huang et al. (Huang
et al., 2023) tackled the issue of creating small plant
disease classification systems based on knowledge dis-
tillation, applicable across various crops. Their solu-
tion, called multistage knowledge distillation, upholds
performance and reduces the model size. Furthermore,
the authors underscored the possibility of extending
this technique to diverse computer vision tasks, such
as image classification and segmentation, enabling
the development of automated plant disease detection
models with broader utility in precision agriculture.
3 DATASET
In this paper, we employ a leaf disease dataset
(Geetharamani and Pandian, 2019) consisting of im-
ages of the leaves of different plants such as apple,
cherry, corn, grape, pepper bell, and potato (38 dif-
ferent classes). Moreover, the dataset introduces an
extra "false" class, covering several landmark photos
without leaves.
The dataset contains 55,448 RGB samples of 256x256 pixels each. As shown in Figure 1, the dataset is highly imbalanced, with class sizes ranging from 100 to over 5,500 samples, thus imposing a classification chal-
lenge due to various leaf disease types. The classes
with the most significant training footprint are Orange
with Huanglongbing, Healthy soybean, and Tomato
with yellow leaf curl virus. We expect a better clas-
sification accuracy for these categories. In contrast,
classes such as Healthy potato, Healthy raspberry, and Apple with cedar apple rust will most likely be disadvantaged in terms of classification accuracy.

Figure 1: Class distribution of the leaf disease dataset.
4 METHODOLOGY
4.1 Model
The architecture consists of a regular CNN model as a feature extractor built from convolutional blocks with (3, 32), (32, 64), (64, 128), and (128, 256) input-output channels, with ReLU activations and max-pooling, as well as a classifier represented by two fully-connected layers with 128 inputs and 39 outputs, i.e., the number of classes in our dataset. The optimizer is Adam (Kingma and Ba, 2014) with a learning rate of 5e-4 and linear decay, while the loss function is cross-entropy.
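A minimal PyTorch sketch of this architecture is given below. The channel progression, the two-layer classifier, the Adam optimizer, and the learning rate follow the description above; the kernel sizes, the global pooling step, and the 256-to-128 projection are assumptions we make for illustration, since they are not specified here.

```python
import torch
import torch.nn as nn


class LeafCNN(nn.Module):
    """Sketch of the CNN described above: four conv blocks with channel pairs
    (3, 32), (32, 64), (64, 128), (128, 256), ReLU and max-pooling, followed by
    a two-layer fully-connected classifier ending in 39 classes."""

    def __init__(self, num_classes: int = 39):
        super().__init__()
        channels = [3, 32, 64, 128, 256]
        blocks = []
        for c_in, c_out in zip(channels[:-1], channels[1:]):
            blocks += [
                nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),  # kernel size assumed
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            ]
        self.features = nn.Sequential(*blocks)
        self.pool = nn.AdaptiveAvgPool2d(1)  # assumed: collapse spatial dimensions
        self.classifier = nn.Sequential(     # "two fully-connected layers, 128 inputs to 39 outputs"
            nn.Linear(256, 128),
            nn.ReLU(inplace=True),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.pool(self.features(x)).flatten(1)
        return self.classifier(feats)


model = LeafCNN()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4)  # linear decay applied via a scheduler
criterion = nn.CrossEntropyLoss()
```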
4.2 Adversarial Training and Attack
Regarding adversarial attacks and training, several algorithms are employed in our experiments (see Table 1). We use these gradient-based algorithms to test the performance of the CNN model under several adversarial configurations. The experiments performed in this paper include regular and adversarial training and testing, as well as an analysis of adversarial attack success (Su et al., 2018). For the classification experiments, we consider four scenarios: a model trained only on regular data tested on regular examples, the same model tested on adversarial examples, and a model trained with adversarial examples tested on either regular or adversarial examples.
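To make the attack and adversarial-training setup concrete, the sketch below implements the single-step FGSM perturbation from Table 1 and a training step on a batch augmented with adversarial examples. The ε value, the pixel range, and the clean/adversarial mixing strategy are illustrative assumptions rather than our exact settings; the full set of algorithms in Table 1 is available in open-source attack libraries such as torchattacks.

```python
import torch
import torch.nn.functional as F


def fgsm_attack(model, x, y, epsilon=8 / 255):
    """FGSM from Table 1: x' = x + eps * sgn(grad_x loss(f(x), y)).
    The epsilon value and the [0, 1] pixel range are assumptions."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = x_adv.detach() + epsilon * grad.sign()
    return x_adv.clamp(0.0, 1.0)


def adversarial_training_step(model, optimizer, x, y, epsilon=8 / 255):
    """One training step on a batch augmented with FGSM adversarial examples
    (the clean/adversarial mixing strategy is an assumption)."""
    model.train()
    x_adv = fgsm_attack(model, x, y, epsilon)
    optimizer.zero_grad()
    logits = model(torch.cat([x, x_adv]))
    loss = F.cross_entropy(logits, torch.cat([y, y]))
    loss.backward()
    optimizer.step()
    return loss.item()
```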
4.3 Model Explainability
Introduced by Simonyan et al. (Simonyan et al., 2014),
the saliency explanation mechanism is an attribution
method based on gradients. In this method, each gra-
dient denotes the impact of a change in each input
dimension on predictions within a small neighborhood
around the input. As a result, the technique calculates a class-level saliency map for an image based on the gradient of an output neuron with respect to the input. This map highlights the areas in the input image that represent the focus regions for the specified class.
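A minimal sketch of this computation is shown below; reducing the per-channel gradients with a maximum and normalizing to [0, 1] are common conventions that we assume here.

```python
import torch


def saliency_map(model, x, target_class):
    """Vanilla gradient saliency (Simonyan et al., 2014): gradient of the
    target-class score with respect to the input pixels, for x of shape (1, 3, H, W)."""
    model.eval()
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    # One heat value per pixel: maximum absolute gradient over the color channels.
    sal = x.grad.detach().abs().amax(dim=1).squeeze(0)
    return (sal - sal.min()) / (sal.max() - sal.min() + 1e-8)  # normalize to [0, 1]
```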
Guided backpropagation (Springenberg et al.,
2015), alternatively referred to as guided saliency, em-
ploys a novel deconvolution variant (Zeiler and Fergus,
2014) to visualize features learned by various DNN
architectures. When applied to small images, using
max-pooling in CNNs becomes questionable. Replac-
ing the max-pooling layers with a convolutional layer
featuring an increased stride demonstrates no loss of
performance on several image processing tasks, un-
derscoring the versatility of guided backpropagation
in challenging the conventional use of max-pooling
within CNNs for specific image sizes.
Gradient-weighted class activation mapping (Grad-
CAM), introduced by Selvaraju et al. (Selvaraju et al.,
2017), addresses a fundamental limitation of class acti-
vation mapping (CAM) (Zhou et al., 2015) by offering
visual explanations for any CNN, irrespective of its
architecture. This gradient-based method utilizes class-
specific gradient information from the convolutional
layer of a CNN to generate a localization map, high-
lighting the visual attention regions in an image for
classification and enhancing the transparency of CNN-
based models. Despite its advancements over CAM,
GradCAM has certain limitations, such as not being
able to localize multiple occurrences of an entity in an
image effectively (Chattopadhay et al., 2018).
We interpret the visual performance of the mod-
els trained in different setups with visualization meth-
ods such as t-SNE (Hinton and Roweis, 2002) for
highlighting the classifier capabilities through two-
dimensional clustering, GradCAM, Guided GradCAM
(Selvaraju et al., 2017), and Guided GradCAM HiRes
(Draelos and Carin, 2021). Thus, we also determine
the saliency maps linked to the output of the feature
extractor.
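As an illustration of how such a localization map can be computed, the sketch below implements plain GradCAM with forward and backward hooks on a chosen convolutional layer (typically the last one of the feature extractor). This is a generic sketch, not our exact implementation; libraries such as Captum provide ready-made GradCAM and Guided GradCAM attributions.

```python
import torch
import torch.nn.functional as F


def grad_cam(model, conv_layer, x, target_class):
    """GradCAM (Selvaraju et al., 2017): weight the activations of conv_layer by
    the spatially averaged gradients of the target-class score, keep positive parts."""
    activations, gradients = [], []
    fwd = conv_layer.register_forward_hook(lambda m, i, o: activations.append(o))
    bwd = conv_layer.register_full_backward_hook(lambda m, gi, go: gradients.append(go[0]))
    try:
        model.eval()
        score = model(x)[0, target_class]   # x has shape (1, 3, H, W)
        model.zero_grad()
        score.backward()
    finally:
        fwd.remove()
        bwd.remove()
    act, grad = activations[0], gradients[0]           # both of shape (1, C, h, w)
    weights = grad.mean(dim=(2, 3), keepdim=True)      # global-average-pooled gradients
    cam = F.relu((weights * act).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=x.shape[-2:], mode="bilinear", align_corners=False)
    cam = cam.squeeze().detach()
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```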
4.4 Knowledge Distillation
A practical method for compressing models is knowl-
edge distillation, which facilitates information transfer
from a large teacher network to a small student net-
work. Initially introduced by Buciluǎ et al. (Buciluǎ et al., 2006) and later generalized by Hinton et al. (Hinton et al., 2015), this method has gained popularity across multiple machine learning applications. Unlike conventional training, knowledge distillation trains the student network to emulate the outputs of the teacher model, typically represented as
probability distributions over classes. By minimizing
the similarity loss between the student’s predictions
and the teacher’s probabilities, the student network can
assimilate the knowledge from the teacher, resulting
in superior performance compared to training from scratch without a teacher (Avram et al., 2022).
Table 1: Adversarial algorithms used for model defense and attack experiments.

FGSM (Goodfellow et al., 2015):
$x' = x + \varepsilon \cdot \mathrm{sgn}(\nabla_x \ell(f(x), y))$

RFGSM (Tramèr et al., 2018):
$x'_0 = x + \alpha \cdot \mathrm{sgn}(\mathcal{N}(\mathbf{0}_n, \mathbf{I}_n))$
$x'_{t+1} = \mathrm{clip}_{(x,\varepsilon)}\left\{ x'_t + (\varepsilon - \alpha) \cdot \mathrm{sgn}\left(\nabla_{x'_t} \ell(f(x'_t), y)\right) \right\}$

FFGSM (Wong et al., 2020):
$x'_0 = x + \mathcal{U}(-\varepsilon, \varepsilon)$
$x' = \Pi_{B(x,\varepsilon)}\left\{ x'_0 + \alpha \cdot \mathrm{sgn}\left(\nabla_{x'_0} \ell(f(x'_0), y)\right) \right\}$

MIFGSM (Dong et al., 2017):
$g_{t+1} = \mu \cdot g_t + \nabla_{x'_t} \ell(f(x'_t), y) \,/\, \left\| \nabla_{x'_t} \ell(f(x'_t), y) \right\|_1$
$x'_{t+1} = \Pi_{B(x,\varepsilon)}\left( x'_t + \alpha \cdot \mathrm{sgn}(g_{t+1}) \right)$

BIM (Kurakin et al., 2017):
$x'_{t+1} = \mathrm{clip}_{(x,\varepsilon)}\left\{ x'_t + \alpha \cdot \mathrm{sgn}\left(\nabla_{x'_t} \ell(f(x'_t), y)\right) \right\}$

PGD (Madry et al., 2018):
$x'_0 = x + \mathcal{U}(-\varepsilon, \varepsilon)$
$x'_{t+1} = \Pi_{B(x,\varepsilon)}\left\{ x'_t + \alpha \cdot \mathrm{sgn}\left(\nabla_{x'_t} \ell(f(x'_t), y)\right) \right\}$

TPGD (Zhang et al., 2019):
$x'_0 = x + 0.001 \cdot \mathcal{N}(\mathbf{0}_n, \mathbf{I}_n)$
$x'_{t+1} = \Pi_{B(x,\varepsilon)}\left\{ x'_t + \alpha \cdot \mathrm{sgn}\left(\nabla_{x'_t} \ell_{\mathrm{KL}}\left(f_\theta(x), f_\theta(x'_t)\right)\right) \right\}$

EOTPGD (Liu et al., 2018):
$x'_{t+1} = \Pi_{B(x,\varepsilon)}\left\{ x'_t + \alpha \cdot \mathrm{sgn}\left(\mathbb{E}\left[\nabla_{x'_t} \ell(f(x'_t), y)\right]\right) \right\}$
The architecture of our student consists of mul-
tiple stacked convolutional layers with interleaved
pooling layers and ReLU activation functions. The
classification head of the student model comprises a
fully-connected layer followed by a dropout layer with
a dropout rate of 50%. The utilization of dropout
aids in enhancing the generalization capabilities of
the model. The Adam optimizer is adopted for con-
ducting the knowledge distillation experiments. Three
teacher models were selected for our goals: ResNet50,
DenseNet121 (Huang et al., 2017), and Xception
(Chollet, 2017). Additionally, an ensemble approach
is employed to leverage the knowledge acquired by
these pre-trained teachers on the ImageNet benchmark
(Deng et al., 2009). Before the actual distillation process begins, each teacher model is fine-tuned on the leaf disease classification task.
For the distillation process, the hyperparameters
are crucial in controlling the knowledge transfer from
the teacher models to the student model. Their values
are chosen as follows: the alpha parameter is set to 0.1,
and the temperature parameter is set to 3.
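A minimal sketch of this objective, assuming the standard Hinton-style combination of a hard-label cross-entropy term and a temperature-softened KL term, is shown below; the exact weighting convention (alpha on the hard term, 1 - alpha on the soft term) is our assumption for illustration, since only the two hyperparameter values are reported.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels,
                      alpha: float = 0.1, temperature: float = 3.0):
    """Knowledge distillation objective (Hinton et al., 2015): weighted sum of the
    hard-label cross-entropy and the KL divergence between softened distributions."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2  # gradient-scale correction from Hinton et al.
    return alpha * hard + (1.0 - alpha) * soft


def distillation_step(student, teacher, optimizer, x, y):
    """One training step: the teacher is frozen and the student mimics its outputs."""
    teacher.eval()
    with torch.no_grad():
        teacher_logits = teacher(x)
    loss = distillation_loss(student(x), teacher_logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```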
4.5 Performance Metrics
We use accuracy and macro-F1 as performance met-
rics for adversarial training and attack experiments.
Knowledge distillation experiments are evaluated using accuracy and macro-F1 for classification, the number of model parameters for model size comparison, and the number of floating-point operations (FLOPs) for model efficiency evaluation.
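For reference, a short sketch of how these metrics can be computed is given below: accuracy and macro-F1 via scikit-learn and the parameter count directly from the model. The FLOP count is typically obtained with a profiling utility; since no specific tool is named here, we only mention common options in a comment.

```python
import torch
from sklearn.metrics import accuracy_score, f1_score


def classification_metrics(model, loader, device="cpu"):
    """Accuracy and macro-F1 over a data loader (model assumed to be on `device`)."""
    model.eval()
    preds, targets = [], []
    with torch.no_grad():
        for x, y in loader:
            preds.append(model(x.to(device)).argmax(dim=1).cpu())
            targets.append(y)
    y_pred, y_true = torch.cat(preds), torch.cat(targets)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "macro_f1": f1_score(y_true, y_pred, average="macro"),
    }


def parameter_count_millions(model):
    """Number of trainable parameters, in millions (as reported in Table 6)."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

# FLOPs per forward pass are measured with a profiling utility such as fvcore's
# FlopCountAnalysis or thop's profile; the choice of tool is not specified here.
```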
4.6 Research Questions
[RQ1] Is robustness the cost of accuracy?
[RQ2] How is the classification affected when we use
adversarial examples?
[RQ3] Does the leaf shape matter?
[RQ4] How do adversarial attacks change focus?
[RQ5] Is knowledge distillation effective for leaf dis-
ease classification?
5 RESULTS
5.1 Results for RQ1
Table 2 shows outstanding performance for the regular scenario, without adversarial training and adversarial attack, with 97.75% accuracy. In contrast, when we attack the network, the actual label of adversarial examples is hardly identified, and the accuracy drops sharply, ranging from 0.11% to 18.58% depending on the attack algorithm.
When we train the network with adversarial exam-
ples, we observe the trade-off between robustness and
accuracy (Su et al., 2018). Table 3 shows that for a
network trained on adversarial examples and tested on
the initial data, we determine lower accuracy for all the
models, ranging from 77.17% to 94.70%. In the attack
scenario for adversarially trained models, we gain a
robustness improvement for all algorithms. BIM is the
best adversarial algorithm in both types of tests, with
94.70% accuracy for initial test data and 83.46% for
adversarial test data.
Table 2: Accuracy (Acc) and macro-F1 (F1) for our CNN
model without adversarial training.
Attack Algorithm Acc (%) F1 (%)
No - 97.75 97.03
Yes FGSM 5.64 4.69
RFGSM 0.78 1.17
FFGSM 6.02 4.99
MIFGSM 0.11 0.17
BIM 10.95 9.24
PGD 0.51 0.76
TPGD 18.58 17.81
EOTPGD 0.81 1.20
Table 3: Accuracy (Acc) and macro-F1 (F1) for our CNN
model with adversarial training.
Attack Algorithm Acc (%) F1 (%)
No FGSM 79.98 70.06
RFGSM 83.99 75.52
FFGSM 83.46 74.89
MIFGSM 77.17 66.84
BIM 94.70 91.86
PGD 83.52 74.09
TPGD 83.37 77.36
EOTPGD 84.36 76.53
Yes FGSM 56.12 41.93
RFGSM 60.79 47.53
FFGSM 61.06 47.32
MIFGSM 51.29 36.92
BIM 83.46 76.58
PGD 51.31 39.79
TPGD 64.76 57.78
EOTPGD 60.99 47.93
5.2 Results for RQ2
Figure 2 depicts the classification performance for the
models as a 2D projection. We see a clear separation
for the model trained and tested on initial data. More-
over, in contrast with the other adversarial models, the
t-SNE plot also highlights improved performance for
the BIM algorithm. Classes Orange with Huanglong-
bing (brown), Healthy soybean (lavender), and Tomato
with yellow leaf curl virus (blue) have the most con-
siderable volume in this dataset. Despite the correct
classification, we can observe that their distribution
is highly affected by adversarial training, even in the
case of BIM, where the Tomato with yellow leaf curl
virus is split into two separate clusters. We can still
determine a slight separation for this class in regular
training and testing scenarios. Thus, we conclude that
adversarial training significantly affects class distribu-
tions.
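The 2D projections in Figure 2 are obtained with t-SNE; a sketch of this procedure is given below, assuming the projection is computed on the outputs of the feature extractor. The `features`/`pool` attribute names follow the CNN sketch in Section 4.1, and the t-SNE settings (perplexity, initialization) are assumptions, since they are not reported here.

```python
import numpy as np
import torch
from sklearn.manifold import TSNE


def tsne_projection(model, loader, device="cpu"):
    """Project feature-extractor outputs to 2D with t-SNE for class-distribution plots."""
    model.eval()
    feats, labels = [], []
    with torch.no_grad():
        for x, y in loader:
            f = model.pool(model.features(x.to(device))).flatten(1)  # feature-extractor output
            feats.append(f.cpu().numpy())
            labels.append(y.numpy())
    coords = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(np.concatenate(feats))
    return coords, np.concatenate(labels)
```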
5.3 Results for RQ3
Figures 3, 4, 5, and 6 support an analysis focused on the classification of one image in all adversarial scenarios, providing a view of the saliency map, the test image (either regular or adversarial), and their superimposition for better visualization. An initial
observation is that maps for adversarial examples often
describe the shape of the leaves. However, in addition
to highlighting the shape and potential focus points,
they feature additional heat in the regions outside the
leaves due to the introduced noise. Figure 4 depicts an
example where we get four misclassifications out of
nine despite the similarities in shapes.
Figure 6a attempts to explain why an example with the expected label Squash with powdery mildew is classified as Strawberry with leaf scorch. The shapes of the saliency maps match to a significant extent; therefore, the rounded features might have caused the confusion. This
impact is even more evident in Figure 5, with only
three of nine correctly classified examples. A Peach
with bacterial spot example is shown in Figure 6b,
featuring a similar shape and orientation to the Tomato
with septoria leaf spot. We conclude that the shape and orientation of the leaves in the dataset can make them prone to misclassification when they are subject to adversarial attacks.
5.4 Results for RQ4
The analysis of the model’s cognitive processes dur-
ing classification encompasses a comprehensive explo-
ration of three distinct visualization techniques (i.e.,
GradCAM, Guided GradCAM, and Guided GradCAM
HiRes), each designed to shed light on the intricate
mechanisms governing its decision-making. Figure 7
presents these visualization methods aimed at elucidat-
ing the underlying reasoning employed by the model
when faced with a given instance for classification.
Figure 2: Class t-SNE distribution for the models trained with adversarial training. The first subfigure, labeled "CNN", depicts the classifications of the model trained without adversarial examples, while subfigures 2-9 show the classification clustering for the models using the FGSM, RFGSM, FFGSM, MIFGSM, BIM, PGD, TPGD, and EOTPGD algorithms to create more training data.

In particular, the second row of visualizations offers valuable insights into the model's attentional patterns when confronted with an un-attacked image. This investigation unveils crucial information regarding the areas of interest within the input image that the model prioritizes when making its predictions.
Furthermore, the last row of visualizations is a
pivotal component of this analysis, seeking to reveal
the precise focal points that capture the model’s atten-
tion during the classification process. In this example,
the FGSM attack demonstrates its effectiveness by skill-
fully diverting the model’s focus away from significant
points within the leaf where the disease is present.
The ability of the FGSM attack to redirect the
model’s focus towards less critical regions of the image
highlights potential vulnerabilities in the model’s ro-
bustness and susceptibility to adversarial perturbations.
Such insights are essential contributions to model ex-
plainability and adversarial machine learning, facilitat-
ing the development of more robust and dependable
AI systems for critical applications.
5.5 Results for RQ5
Before Knowledge Distillation.
Table 4 presents the
performance of the models before the knowledge dis-
tillation process. The student model achieved an ac-
curacy of 85.38% and a macro-F1 score of 79.17%.
In comparison, the teacher models, namely ResNet50,
DenseNet121, Xception, and Ensemble, displayed sig-
nificantly higher accuracies and macro-F1 scores, rang-
ing from 97.16% to 99.04%.
After Knowledge Distillation.
Table 5 depicts the
results after the knowledge distillation process. The
student model demonstrates accuracy and macro-F1
score improvements when distilled with each teacher
model. When distilled with ResNet50, the student
model achieved an accuracy of 89.43% (+4.05) and
a macro-F1 score of 85.09% (+5.92). Similarly, dis-
tillation with DenseNet121 and Xception resulted in
accuracies of 90.11% (+4.73) and 88.88% (+3.50),
as well as macro-F1 scores of 85.23% (+6.06) and
83.42% (+4.25). Furthermore, distillation with the En-
semble model led to an accuracy of 89.72% (+4.34)
and a macro-F1 score of 84.95% (+5.78).
Number of Parameters.
Table 6 outlines the num-
ber of parameters for each model, both the student
and teacher models. The student model contains only
3.7M parameters. In comparison, the teacher models
have significantly more parameters, ranging from 7M
for DenseNet121 to 51.6M for the Ensemble (comprising the parameters from ResNet50, DenseNet121, and Xception). Despite the knowledge transfer from the teacher models during distillation, the student model maintains a substantially lower number of parameters than the teacher models. Surprisingly, the Ensemble does not obtain the best enhancement despite the notable improvements achieved through the knowledge distillation process with various teacher models.

Figure 3: Squash GradCAM saliency maps without adversarial attack.
Figure 4: Squash GradCAM saliency maps with adversarial attack.
Figure 5: Tomato GradCAM saliency maps with adversarial attack.
(a) Strawberry classification GradCAM. (b) Peach classification GradCAM.
Figure 6: Misclassification GradCAM for different leaf diseases.
Figure 7: Model's attentional focus after the FGSM attack.
Model Computational Efficiency.
The computa-
tional efficiency of the models, as indicated by the
FLOPs for each one, is shown in Table 7. The student
model achieves 0.4B FLOPs. In contrast, the teacher
models perform significantly more FLOPs, ranging
from 7.4B for DenseNet121 to 11.9B for Xception.
Even with the knowledge transfer from the teacher
models during the distillation process, the student
model retains its computational efficiency by operating
at fewer FLOPs than the teacher models.
The results demonstrate the effectiveness of the
knowledge distillation process, where the student
model exhibits improved performance while acquiring
knowledge from the teacher models. Moreover, the
student model achieves these performance gains while
keeping its parameters and computational complexity
low, making it an efficient and effective solution for
the leaf disease classification.
Student’s Performance.
To gain insights into the
student’s learning progress from the teacher, we uti-
lize t-SNE visualizations for both the teacher and the
student after the knowledge distillation process. The
2D visualizations, showcased in Figure 8, allow us to
compare their abilities in understanding the data and
clustering the leaves effectively.
The teacher demonstrates a higher capacity to com-
prehend the data and adeptly cluster the leaves. On
the other hand, the student’s performance is commend-
able, having learned to capture a substantial amount
of knowledge from the teacher, enabling it to group the leaves accurately. However, due to its limited capacity, the student slightly overlaps some clusters.
(a) Teacher (b) Student
Figure 8: t-SNE visualizations after knowledge distillation.
Table 4: Performance before knowledge distillation.
Model Accuracy (%) Macro-F1 (%)
Student 85.38 79.17
ResNet50 99.04 98.50
DenseNet121 98.55 97.99
Xception 97.16 96.34
Ensemble 98.73 98.11
Table 5: Student's accuracy (Acc) and macro-F1 (F1) after knowledge distillation. The Δ columns show the gain over the student trained without distillation.
Teacher Acc (%) ΔAcc F1 (%) ΔF1
ResNet50 89.43 +4.05 85.09 +5.92
DenseNet121 90.11 +4.73 85.23 +6.06
Xception 88.88 +3.50 83.42 +4.25
Ensemble 89.72 +4.34 84.95 +5.78
Table 6: Comparison of student size with teacher models.
The number of parameters is written in millions. The En-
semble model parameters are represented as the sum of the
ResNet50, DenseNet121, and Xception number of parame-
ters.
Model #Params Student Params as % of Teacher
Student 3.71 -
ResNet50 23.66 15.68
DenseNet121 7.06 52.54
Xception 20.93 17.72
Ensemble 51.65 7.18
6 CONCLUSIONS
In this work, we addressed three vital pillars in ma-
chine learning: adversarial attacks, knowledge distil-
lation, and explainability, explicitly focusing on plant
leaf disease classification.

Table 7: Comparison of student computational efficiency with teacher models. The FLOPs values are given in billions of floating-point operations. The Ensemble model FLOPs is the maximum value of the ResNet50, DenseNet121, and Xception FLOPs.
Model FLOPs (B) Student FLOPs as % of Teacher
Student 0.48 -
ResNet50 10.09 4.75
DenseNet121 7.45 6.44
Xception 11.90 4.03
Ensemble 11.90 4.03

Thus, we explored adver-
sarial attacks and their impact on the robustness of
CNNs. Through rigorous experimentation, we suc-
cessfully employed adversarial training to enhance the
resilience of the CNNs against adversarial perturba-
tions, reinforcing their ability to perform accurate leaf
disease classification.
In machine learning explainability, we recognized
the critical importance of understanding how models
arrive at their decisions, especially in the context of
leaf disease classification. Additionally, we ventured
into the domain of model compression, aiming to re-
tain the performance of the leaf disease classification
models while reducing their computational complexity.
Our findings demonstrated promising results, paving
the way for deploying efficient and resource-friendly
models, particularly crucial for applications with lim-
ited computing resources.
The next steps in our experiments include evaluating how successful the attacks of one adversarial algorithm are against a model previously trained on examples generated with a different one. The quest for a bet-
ter understanding of the inner workings of knowledge
distillation and its interplay with other regularization
techniques also opens up intriguing directions for fu-
ture exploration.
ACKNOWLEDGEMENTS
This research has been funded by the National Uni-
versity of Science and Technology POLITEHNICA
Bucharest through the PubArt program.
REFERENCES
Avram, A.-M., Catrina, D., Cercel, D.-C., Dascalu, M., Rebedea, T., Păiș, V., and Tufiș, D. (2022). Distilling the knowledge of Romanian BERTs using multiple teachers. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 374–384.
Buciluǎ, C., Caruana, R., and Niculescu-Mizil, A. (2006). Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 535–541.
Chattopadhay, A., Sarkar, A., Howlader, P., and Balasub-
ramanian, V. N. (2018). Grad-cam++: Generalized
gradient-based visual explanations for deep convolu-
tional networks. In 2018 IEEE Winter Conference on
Applications of Computer Vision, pages 839–847.
Chollet, F. (2017). Xception: Deep learning with depthwise
separable convolutions. In Proceedings of the IEEE
conference on computer vision and pattern recognition,
pages 1251–1258.
Meske, C., Bunde, E., Schneider, J., and Gersch, M. (2022). Explainable artificial intelligence: Objectives, stakeholders, and future research opportunities. Information Systems Management, 39(1):53–63.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei,
L. (2009). Imagenet: A large-scale hierarchical image
database. In 2009 IEEE conference on computer vision
and pattern recognition, pages 248–255. Ieee.
Dong, Y., Liao, F., Pang, T., Hu, X., and Zhu, J. (2017). Dis-
covering adversarial examples with momentum. CoRR,
abs/1710.06081.
Draelos, R. L. and Carin, L. (2021). Use hirescam instead
of grad-cam for faithful explanations of convolutional
neural networks.
Geetharamani, G. and Pandian, A. (2019). Identification
of plant leaf diseases using a nine-layer deep convolu-
tional neural network. Computers & Electrical Engi-
neering, 76:323–338.
Ghofrani, A. and Mahdian Toroghi, R. (2022). Knowledge
distillation in plant disease recognition. Neural Com-
puting and Applications, 34(17):14287–14296.
Goodfellow, I. J., Shlens, J., and Szegedy, C. (2015). Explain-
ing and harnessing adversarial examples. In Bengio,
Y. and LeCun, Y., editors, 3rd International Confer-
ence on Learning Representations, ICLR 2015, San
Diego, CA, USA, May 7-9, 2015, Conference Track
Proceedings.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep Resid-
ual Learning for Image Recognition. In Proceedings
of 2016 IEEE Conference on Computer Vision and Pat-
tern Recognition, CVPR ’16, pages 770–778. IEEE.
Hinton, G. E. and Roweis, S. (2002). Stochastic neighbor
embedding. In Becker, S., Thrun, S., and Obermayer,
K., editors, Advances in Neural Information Process-
ing Systems, volume 15. MIT Press.
Hinton, G. E., Vinyals, O., and Dean, J. (2015). Dis-
tilling the knowledge in a neural network. CoRR,
abs/1503.02531.
Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger,
K. Q. (2017). Densely connected convolutional net-
works. In Proceedings of the IEEE conference on
computer vision and pattern recognition, pages 4700–
4708.
Huang, Q., Wu, X., Wang, Q., Dong, X., Qin, Y., Wu, X.,
Gao, Y., and Hao, G. (2023). Knowledge distillation
facilitates the lightweight and efficient plant diseases
detection model. Plant Phenomics, 5:0062.
Khan, A., Vibhute, A. D., Mali, S., and Patil, C. (2022). A
systematic review on hyperspectral imaging technol-
ogy with a machine and deep learning methodology
for agricultural applications. Ecological Informatics,
69:101678.
Kim, Y. (2014). Convolutional neural networks for sentence
classification. In Proceedings of the 2014 Conference
on Empirical Methods in Natural Language Processing
(EMNLP). Association for Computational Linguistics.
Kingma, D. P. and Ba, J. (2014). Adam: A
method for stochastic optimization. arXiv preprint
arXiv:1412.6980.
Kurakin, A., Goodfellow, I. J., and Bengio, S. (2017). Adver-
sarial examples in the physical world. In 5th Interna-
tional Conference on Learning Representations, ICLR
2017, Toulon, France, April 24-26, 2017, Workshop
Track Proceedings. OpenReview.net.
Linardatos, P., Papastefanopoulos, V., and Kotsiantis, S.
(2021). Explainable ai: A review of machine learning
interpretability methods. Entropy, 23(1).
Liu, X., Li, Y., Wu, C., and Hsieh, C.-J. (2018). Adv-bnn:
Improved adversarial defense through robust bayesian
neural network. In International Conference on Learn-
ing Representations.
Lungu-Stan, V.-C., Cercel, D.-C., and Pop, F. (2023).
Skindistilvit: Lightweight vision transformer for skin
lesion classification. In International Conference on
Artificial Neural Networks, pages 268–280. Springer.
Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu,
A. (2018). Towards deep learning models resistant to
adversarial attacks. In International Conference on
Learning Representations.
Miyato, T., Dai, A. M., and Goodfellow, I. (2016). Adver-
sarial training methods for semi-supervised text classi-
fication. arXiv preprint arXiv:1605.07725.
Musa, A., Hassan, M., Hamada, M., and Aliyu, F. (2022).
Low-power deep learning model for plant disease de-
tection for smart-hydroponics using knowledge distil-
lation techniques. Journal of Low Power Electronics
and Applications, 12(2).
Năstăsescu, G.-S. and Cercel, D.-C. (2022). Conditional Wasserstein GAN for energy load forecasting in large buildings. In 2022 International Joint Conference on Neural Networks (IJCNN), pages 1–8. IEEE.
Ramirez-Moreno, D., Schwartz, O., and Ramirez-Villegas,
J. (2013). A saliency-based bottom-up visual atten-
tion model for dynamic scenes analysis. Biological
cybernetics, 107.
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R.,
Parikh, D., and Batra, D. (2017). Grad-cam: Visual
explanations from deep networks via gradient-based
localization. In 2017 IEEE International Conference
on Computer Vision (ICCV), pages 618–626.
Simonyan, K., Vedaldi, A., and Zisserman, A. (2014). Deep inside convolutional networks: Visualising image classification models and saliency maps. In 2nd International Conference on Learning Representations, ICLR 2014, Workshop Track Proceedings.
Smădu, R.-A., Dinică, I.-R., Avram, A.-M., Cercel, D.-C., Pop, F., and Cercel, M.-C. (2022). Legal named entity recognition with multi-task domain adaptation. In Proceedings of the Natural Legal Language Processing Workshop 2022, pages 305–321.
Springenberg, J. T., Dosovitskiy, A., Brox, T., and Ried-
miller, M. A. (2015). Striving for simplicity: The all
convolutional net. In Bengio, Y. and LeCun, Y., editors,
3rd International Conference on Learning Representa-
tions, ICLR 2015, San Diego, CA, USA, May 7-9, 2015,
Workshop Track Proceedings.
Su, D., Zhang, H., Chen, H., Yi, J., Chen, P.-Y., and Gao,
Y. (2018). Is robustness the cost of accuracy? a
comprehensive study on the robustness of 18 deep
image classification models. In Ferrari, V., Hebert, M.,
Sminchisescu, C., and Weiss, Y., editors, Computer
Vision – ECCV 2018, pages 644–661, Cham. Springer
International Publishing.
Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Er-
han, D., Goodfellow, I. J., and Fergus, R. (2014). In-
triguing properties of neural networks. In Bengio, Y.
and LeCun, Y., editors, 2nd International Conference
on Learning Representations, ICLR 2014, Banff, AB,
Canada, April 14-16, 2014, Conference Track Proceed-
ings.
Tramèr, F., Kurakin, A., Papernot, N., Goodfellow, I. J.,
Boneh, D., and McDaniel, P. D. (2018). Ensemble
adversarial training: Attacks and defenses. In 6th In-
ternational Conference on Learning Representations,
ICLR 2018, Vancouver, BC, Canada, April 30 - May 3,
2018, Conference Track Proceedings. OpenReview.net.
Wong, E., Rice, L., and Kolter, J. Z. (2020). Fast is better
than free: Revisiting adversarial training. In 8th In-
ternational Conference on Learning Representations,
ICLR 2020, Addis Ababa, Ethiopia, April 26-30, 2020.
OpenReview.net.
You, H., Lu, Y., and Tang, H. (2023). Plant disease classifi-
cation and adversarial attack using simam-efficientnet
and gp-mi-fgsm. Sustainability, 15(2).
Zeiler, M. D. and Fergus, R. (2014). Visualizing and under-
standing convolutional networks. In Fleet, D. J., Pajdla,
T., Schiele, B., and Tuytelaars, T., editors, Computer Vi-
sion - ECCV 2014 - 13th European Conference, Zurich,
Switzerland, September 6-12, 2014, Proceedings, Part
I, volume 8689 of Lecture Notes in Computer Science,
pages 818–833. Springer.
Zhang, H., Yu, Y., Jiao, J., Xing, E. P., Ghaoui, L. E., and
Jordan, M. I. (2019). Theoretically principled trade-
off between robustness and accuracy. In Chaudhuri,
K. and Salakhutdinov, R., editors, Proceedings of the
36th International Conference on Machine Learning,
ICML 2019, 9-15 June 2019, Long Beach, California,
USA, volume 97 of Proceedings of Machine Learning
Research, pages 7472–7482. PMLR.
Zhou, B., Khosla, A., Lapedriza, À., Oliva, A., and Torralba,
A. (2015). Learning deep features for discriminative lo-
calization. 2016 IEEE Conference on Computer Vision
and Pattern Recognition (CVPR), pages 2921–2929.