Abstract
Accurate detection of young fruits in complex scenes is essential for automatic fruit-growth monitoring systems. Images captured in open orchards suffer from interference such as strong illumination, blur and occlusion, so their quality is low. To improve the detection accuracy of young apples in low-quality images, a novel young apple detection algorithm that fuses the YOLOv4 network model with a visual attention mechanism was proposed. A non-local attention module (NLAM) and convolutional block attention modules (CBAMs) were added to the YOLOv4 baseline, and the resulting model was named YOLOv4–NLAM–CBAM. The NLAM extracts long-range dependency information from high-level visual features, while the CBAMs further enhance the model's perception of regions of interest (ROIs). To verify the effectiveness of the proposed algorithm, 3000 young apple images were used for training and testing. The results showed that the precision, recall, average precision and F1 score of the YOLOv4–NLAM–CBAM model were 85.8%, 97.3%, 97.2% and 91.2%, respectively, with an average run time of 35.1 ms. On the highlight/shadow, blur, severe-occlusion and other subsets of the test set, the average precision of the proposed algorithm was 98.0%, 96.2%, 97.0% and 96.9%, respectively. These results show that the method achieves efficient detection on low-quality images and can serve as a reference for research on automatic monitoring of young fruit growth.
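As a quick sanity check on the reported metrics, the F1 score is the harmonic mean of precision and recall; plugging in the reported precision (85.8%) and recall (97.3%) reproduces the reported 91.2%. This is a minimal illustration in plain Python, not part of the authors' code:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall: F1 = 2PR / (P + R)."""
    return 2 * precision * recall / (precision + recall)

# Reported values for YOLOv4-NLAM-CBAM: precision 85.8%, recall 97.3%.
f1 = f1_score(0.858, 0.973)
print(round(f1 * 100, 1))  # → 91.2, matching the reported F1 score
```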
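The non-local attention described above computes each output feature as an attention-weighted sum over the features at all spatial positions, which is how long-range dependencies are captured. The following toy sketch of the embedded-Gaussian form is a deliberate simplification of the actual NLAM (the real block also applies learned 1×1-convolution projections and a residual connection inside the CNN):

```python
import math

def nonlocal_block(features):
    """Toy embedded-Gaussian non-local operation over a list of feature
    vectors: y_i = sum_j softmax_j(x_i . x_j) * x_j.  Learned projections
    and the residual connection of the real module are omitted."""
    out = []
    for xi in features:
        # pairwise similarity of position i to every position j
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        m = max(scores)                      # subtract max for stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]      # softmax over all positions
        # attention-weighted sum: every position contributes to y_i,
        # which is how long-range dependencies enter the feature map
        out.append([sum(w * xj[k] for w, xj in zip(weights, features))
                    for k in range(len(xi))])
    return out

# Two orthogonal feature vectors: each output mixes in the other position
print(nonlocal_block([[1.0, 0.0], [0.0, 1.0]]))
```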
Acknowledgements
This work was supported by the National Key R&D Program of China (Grant No. 2019YFD1002401), the National High Technology Research and Development Program of China (863 Program) (Grant No. 2013AA10230402) and the National Natural Science Foundation of China (Grant No. 31701326). The authors would like to thank all of the authors cited in this article and the anonymous referees for their helpful comments and suggestions.
Cite this article
Jiang, M., Song, L., Wang, Y. et al. Fusion of the YOLOv4 network model and visual attention mechanism to detect low-quality young apples in a complex environment. Precision Agric 23, 559–577 (2022). https://doi.org/10.1007/s11119-021-09849-0