Therefore, exploring vision Transformer models
based on zero-shot learning and unsupervised
learning has important research value. In addition,
developing efficient and lightweight fine-grained
recognition algorithms is particularly crucial,
especially in situations where
computing resources are limited. Building flexible
and powerful recognition systems within
few-shot and unsupervised learning frameworks
is an important direction for future work.