tern recognition (CVPR’05), volume 1, pages 886–
893. IEEE.
Ge, Y., Xiong, Y., Tenorio, G. L., and From, P. J.
(2019). Fruit localization and environment percep-
tion for strawberry harvesting robots. IEEE Access,
7:147642–147652.
Grimstad, L. and From, P. J. (2017). The thorvald ii agri-
cultural robotic system. Robotics, 6(4):24.
Gu, J., Wang, Z., Kuen, J., Ma, L., Shahroudy, A., Shuai, B.,
Liu, T., Wang, X., Wang, G., Cai, J., et al. (2018). Re-
cent advances in convolutional neural networks. Pat-
tern Recognition, 77:354–377.
Handa, A., Patraucean, V., Badrinarayanan, V., Stent, S.,
and Cipolla, R. (2015). Scenenet: understanding real
world indoor scenes with synthetic data. arXiv preprint
arXiv:1511.07041.
Harini, S., Deshpande, P., Dutta, J., and Rai, B. (2021). A
deep learning-based fruit quality assessment system.
In International Conference on Water Energy Food
and Sustainability, pages 187–192. Springer.
He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017).
Mask r-cnn. In Proceedings of the IEEE international
conference on computer vision, pages 2961–2969.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-
ual learning for image recognition. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 770–778.
Huang, Z., Huang, L., Gong, Y., Huang, C., and Wang, X.
(2019). Mask scoring r-cnn. In Proceedings of the
IEEE/CVF conference on computer vision and pattern
recognition, pages 6409–6418.
Ilyas, T., Umraiz, M., Khan, A., and Kim, H. (2021). Dam:
Hierarchical adaptive feature selection using convolu-
tion encoder decoder network for strawberry segmen-
tation. Frontiers in plant science, 12:591333.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Im-
agenet classification with deep convolutional neural
networks. Advances in neural information processing
systems, 25.
Le Louëdec, J. and Cielniak, G. (2021). 3d shape sensing
and deep learning-based segmentation of strawberries.
Computers and Electronics in Agriculture,
190:106374.
Lee, S., Arora, A. S., and Yun, C. M. (2022). Detect-
ing strawberry diseases and pest infections in the very
early stage with an ensemble deep-learning model.
Frontiers in Plant Science, 13:991134.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P.,
Ramanan, D., Dollár, P., and Zitnick, C. L. (2014).
Microsoft coco: Common objects in context. In Com-
puter Vision–ECCV 2014: 13th European Confer-
ence, Zurich, Switzerland, September 6-12, 2014, Pro-
ceedings, Part V 13, pages 740–755. Springer.
Lins Tenorio, G. and Caarls, W. (2021). Automatic visual
estimation of tomato cluster maturity in plant rows.
Machine Vision and Applications, 32(4):78.
Long, J., Shelhamer, E., and Darrell, T. (2015). Fully con-
volutional networks for semantic segmentation. In
Proceedings of the IEEE conference on computer vi-
sion and pattern recognition, pages 3431–3440.
Lowe, D. G. (2004). Distinctive image features from scale-
invariant keypoints. International journal of computer
vision, 60:91–110.
Nickolls, J. and Kirk, D. (2009). Graphics and computing
gpus. Computer Organization and Design: The Hard-
ware/Software Interface, DA Patterson and JL Hen-
nessy, 4th ed., Morgan Kaufmann, pages A2–A77.
Reis, D., Kupec, J., Hong, J., and Daoudi, A. (2023).
Real-time flying object detection with yolov8. arXiv
preprint arXiv:2305.09972.
Ren, G., Wu, H., Bao, A., Lin, T., Ting, K.-C., and Ying,
Y. (2023). Mobile robotics platform for strawberry
temporal–spatial yield monitoring within precision in-
door farming systems. Frontiers in Plant Science,
14:1162435.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-
net: Convolutional networks for biomedical image
segmentation. In Medical Image Computing and
Computer-Assisted Intervention–MICCAI 2015: 18th
International Conference, Munich, Germany, October
5-9, 2015, Proceedings, Part III 18, pages 234–241.
Springer.
Russell, B. C., Torralba, A., Murphy, K. P., and Freeman,
W. T. (2008). Labelme: a database and web-based
tool for image annotation. International journal of
computer vision, 77:157–173.
Sather, J. (2019). Viewpoint Optimization for Au-
tonomous Strawberry Harvesting with Deep Rein-
forcement Learning. PhD thesis, California Polytech-
nic State University.
Simonyan, K. and Zisserman, A. (2014). Very deep con-
volutional networks for large-scale image recognition.
arXiv preprint arXiv:1409.1556.
Tan, M. and Le, Q. (2019). Efficientnet: Rethinking model
scaling for convolutional neural networks. In Interna-
tional conference on machine learning, pages 6105–
6114. PMLR.
Viola, P. and Jones, M. (2001). Rapid object detection us-
ing a boosted cascade of simple features. In Proceed-
ings of the 2001 IEEE computer society conference on
computer vision and pattern recognition. CVPR 2001,
volume 1, pages I–I. IEEE.
Wold, J. P., O’Farrell, M., Andersen, P. V., and Tschudi, J.
(2021). Optimization of instrument design for in-line
monitoring of dry matter content in single potatoes by
nir interaction spectroscopy. Foods, 10(4):828.
Depth-Enhanced 3D Deep Learning for Strawberry Detection and Widest Region Identification in Polytunnels