Abstract
Fruit detection and segmentation will be essential for future agronomic management, with applications in yield estimation, growth monitoring, intelligent picking and disease detection. To recognize and segment apples in natural orchards more accurately and efficiently, a robust segmentation net (RS-Net) framework developed specifically for fruit production is proposed. The model targets the more challenging problem of segmenting overlapped apples from a monochromatic background regardless of various image corruptions. The method extends Mask R-CNN by embedding an attention mechanism that focuses on informative pixels while suppressing the noise caused by adverse factors (occlusions, overlaps, etc.), making it more suitable and robust for operation in complex natural environments. Specifically, a Gaussian non-local attention mechanism is embedded into Mask R-CNN to refine the semantic features generated successively by the residual network and the feature pyramid network; the model then performs forward processing on the balanced feature levels and finally segments the regions where the apples are located. Experimental results support the hypothesis of the current work and show that the proposed method outperforms other state-of-the-art detection and segmentation models: the AP box and AP mask values reach 85.6% and 86.2%, respectively, in a reasonable run time, which can meet the precision and robustness requirements of vision systems in agronomic management.
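The Gaussian non-local operation mentioned in the abstract can be illustrated with a minimal sketch. It follows the general Gaussian form of non-local attention (pairwise weights proportional to the exponential of a dot product, normalized over all positions, followed by a residual connection) and is not the authors' implementation. The function name `gaussian_non_local`, the flattened list-of-positions input, and the omission of the learned projection layers are all assumptions made for illustration.

```python
import math


def gaussian_non_local(features):
    """Refine features with a Gaussian non-local operation (illustrative sketch).

    features: list of N positions, each a list of C channel values
              (a feature map flattened over its spatial dimensions).
    Returns a list of the same shape where z_i = x_i + y_i and
    y_i = sum_j softmax_j(x_i . x_j) * x_j.
    Assumption: the learned projections of a real network are left out.
    """
    refined = []
    for x_i in features:
        # Pairwise similarity between position i and every position j.
        scores = [sum(a * b for a, b in zip(x_i, x_j)) for x_j in features]
        # Subtract the maximum before exponentiating for numerical stability.
        max_score = max(scores)
        weights = [math.exp(s - max_score) for s in scores]
        total = sum(weights)
        # Weighted average of all positions, channel by channel.
        y_i = [
            sum(w * x_j[c] for w, x_j in zip(weights, features)) / total
            for c in range(len(x_i))
        ]
        # Residual connection keeps the original signal.
        refined.append([x + y for x, y in zip(x_i, y_i)])
    return refined


if __name__ == "__main__":
    toy_map = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in gaussian_non_local(toy_map):
        print([round(v, 4) for v in row])
```

Because every output position aggregates information from all other positions, features belonging to an occluded or overlapped fruit can draw on evidence from elsewhere in the image, which is the intuition behind using this kind of attention for feature refinement.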
Abbreviations
- AP: Average precision (%)
- AR: Average recall (%)
- BFP: Balanced feature pyramid
- CHT: Circular Hough transform
- CNN: Convolutional neural network
- FCN: Fully convolutional network
- FN: False negative
- FPN: Feature pyramid network
- IoU: Intersection over union
- MLP: Multiscale multilayered perceptron
- NMS: Non-maximum suppression
- R-CNN: Region-based convolutional network
- ResNet: Residual network
- RoI: Region of interest
- RPN: Region proposal network
- RS-Net: Robust segmentation net
- TP: True positive
- WS: Watershed segmentation
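As a brief illustration of how IoU relates to the mask metrics, the sketch below computes the intersection over union of two binary masks. The function name `mask_iou` and the nested-list mask representation are assumptions made for this example; they are not taken from the paper or from any particular evaluation toolkit.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks (illustrative sketch).

    mask_a, mask_b: equally sized 2-D lists of 0/1 values.
    Returns 0.0 when both masks are empty.
    """
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection += 1  # pixel set in both masks
            if a or b:
                union += 1  # pixel set in at least one mask
    return intersection / union if union else 0.0


if __name__ == "__main__":
    predicted = [[1, 1, 0], [0, 1, 0]]
    ground_truth = [[1, 0, 0], [0, 1, 1]]
    print(mask_iou(predicted, ground_truth))
```

A predicted mask is typically counted as a true positive when its IoU with a ground-truth mask exceeds a chosen threshold; the threshold itself is an evaluation setting and is not specified here.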
Acknowledgements
This work is supported by the Natural Science Foundation of Shandong Province in China (No.: ZR2020MF076); the Focus on Research and Development Plan in Shandong Province (No.: 2019GNC106115); the National Natural Science Foundation of China (No.: 62072289); the Shandong Province Higher Educational Science and Technology Program (No.: J18KA308); and the Taishan Scholar Program of Shandong Province of China (No.: TSHW201502038).
Ethics declarations
Conflict of interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Jia, W., Zhang, Z., Shao, W. et al. RS-Net: robust segmentation of green overlapped apples. Precision Agric 23, 492–513 (2022). https://doi.org/10.1007/s11119-021-09846-3
DOI: https://doi.org/10.1007/s11119-021-09846-3