When comparing ViT to EfficientNet, we
observe that EfficientNet performs better on Tomato
Early blight, achieving 159 out of 198 correct
classifications (80.3%), compared to ViT’s 85 out of
198 (42.9%). Similarly, EfficientNet classifies
Tomato Late blight correctly in 366 out of 379 cases
(96.6%), whereas ViT achieves 296 out of 379
(78.1%), highlighting a noticeable drop in ViT’s
performance for these categories.
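As a sanity check, the per-class figures above can be recomputed directly from the reported counts; a minimal sketch (the counts are copied from the text, and per-class accuracy here means per-class recall):

```python
# Counts reported in the text: (correct, total) per disease class and model.
counts = {
    "Tomato Early blight": {"EfficientNet": (159, 198), "ViT": (85, 198)},
    "Tomato Late blight":  {"EfficientNet": (366, 379), "ViT": (296, 379)},
}

def per_class_accuracy(correct: int, total: int) -> float:
    """Fraction of test samples of one class that were classified correctly."""
    return correct / total

for disease, models in counts.items():
    for model, (correct, total) in models.items():
        print(f"{disease} / {model}: {per_class_accuracy(correct, total):.1%}")
```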
Despite EfficientNet’s superior numerical
accuracy, ViT provides better interpretability,
making it well suited to explainable AI applications
in agriculture. Additionally, ViT’s self-attention
mechanism allows it to focus on salient disease
regions, which can be leveraged for further
optimization, including hybrid CNN-Transformer
architectures. Future research will explore fine-tuning
strategies and data augmentation techniques to
improve ViT’s classification performance.
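One standard way to surface the salient regions that ViT attends to is attention rollout (Abnar and Zuidema, 2020). The text does not specify which visualization method is used, so the following is a generic NumPy sketch of that technique, not the paper's own pipeline:

```python
import numpy as np

def attention_rollout(attentions):
    """Attention rollout: propagate attention through the layers by multiplying
    head-averaged attention matrices, adding the identity to model residual
    connections (Abnar & Zuidema, 2020)."""
    n_tokens = attentions[0].shape[-1]
    result = np.eye(n_tokens)
    for attn in attentions:                  # attn: (heads, tokens, tokens)
        a = attn.mean(axis=0)                # average over heads
        a = a + np.eye(n_tokens)             # residual connection
        a = a / a.sum(axis=-1, keepdims=True)
        result = a @ result
    return result

# Toy example: 2 layers, 2 heads, 5 tokens (1 [CLS] + 4 image patches)
rng = np.random.default_rng(0)
attns = [rng.random((2, 5, 5)) for _ in range(2)]
attns = [a / a.sum(axis=-1, keepdims=True) for a in attns]  # row-stochastic, like softmax output
rollout = attention_rollout(attns)
cls_to_patches = rollout[0, 1:]  # overall attention from [CLS] to each patch
```

In practice the per-layer attention tensors come from the trained ViT; `cls_to_patches` can then be reshaped to the patch grid and upsampled to overlay a saliency map on the leaf image.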
5 CONCLUSIONS
In this study, we compared ViT and EfficientNet for
tomato disease classification. EfficientNet achieved
95% accuracy, outperforming ViT, which attained
84% accuracy. Despite its lower accuracy, however,
ViT demonstrated certain advantages over CNN-
based models such as EfficientNet. Transformers can
capture long-range dependencies in images, making
ViT more robust to spatial variations and complex
patterns.
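The mechanism behind this long-range modeling can be sketched as single-head scaled dot-product self-attention; the real ViT block additionally uses multi-head projections, residual connections, and layer normalization, which this simplified sketch omits:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a token sequence.
    Every token attends to every other token in one step, so dependencies
    between distant image patches are modeled directly."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = (q @ k.T) / np.sqrt(k.shape[-1])      # (tokens, tokens)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ v, weights

# Toy example: 4 patch tokens with 8-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w_q, w_k, w_v = (rng.standard_normal((8, 8)) for _ in range(3))
out, attn = self_attention(x, w_q, w_k, w_v)
```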
ViT particularly excelled at distinguishing
diseases such as Tomato Yellow Leaf Curl Virus
(95.5%) and Tomato Spider mites (Two-spotted
spider mite, 98.8%), showing its potential in cases
where fine-grained features matter. CNNs such as
EfficientNet, however, leverage hierarchical feature
extraction, making them more effective for general
classification tasks, which accounts for their superior
overall accuracy.
One key limitation of ViT is its dependence on
large-scale, diverse datasets for effective learning.
Unlike CNNs, which generalize well even on
moderately sized datasets, ViTs require significantly
more data to learn meaningful representations. By
increasing the dataset size and incorporating diverse
samples covering different lighting conditions,
angles, and disease severities, ViT’s performance
could surpass that of CNNs, since transformers scale
better with data. Future work can explore pretraining
ViT on larger agricultural datasets, hybrid CNN-
Transformer architectures, and advanced
augmentation techniques to improve its
generalization for real-world plant disease diagnosis.
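A minimal NumPy sketch of the kind of augmentation pipeline such future work might use; this is an assumed pipeline covering the viewpoint and lighting variation mentioned above, not one described in the paper:

```python
import numpy as np

def augment(image, rng):
    """Toy augmentation pipeline (assumed, for illustration): random horizontal
    flip, random right-angle rotation, and brightness jitter, approximating
    variation in camera angle and lighting conditions."""
    if rng.random() < 0.5:
        image = image[:, ::-1]                      # horizontal flip
    image = np.rot90(image, k=int(rng.integers(0, 4)))  # 0/90/180/270-degree rotation
    image = np.clip(image * rng.uniform(0.8, 1.2), 0, 255)  # brightness jitter
    return image

rng = np.random.default_rng(42)
img = rng.integers(0, 256, size=(224, 224, 3)).astype(np.float32)
aug = augment(img, rng)
```

In a real training loop one would instead use a library pipeline (e.g. torchvision or Albumentations) with sub-right-angle rotations, color jitter, and random crops applied on the fly.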