Abstract
Accurate and reliable fruit detection in orchards is crucial for supporting higher-level agricultural tasks such as yield mapping and robotic harvesting. However, detecting and counting small fruit is very challenging under variable lighting, low resolution and heavy occlusion by neighboring fruits or foliage. To detect small fruits robustly, an improved method is proposed based on a multiple-scale faster region-based convolutional neural network (MS-FRCNN), using color and depth images acquired with an RGB-D camera. The MS-FRCNN architecture is modified to capture lower-level features by incorporating shallower convolution feature maps into region-of-interest (ROI) pooling. The detection framework consists of three phases. First, multiple-scale feature extractors extract low- and high-level features from the RGB and depth images, respectively. Then, an RGB detector and a depth detector are trained separately using MS-FRCNN. Finally, late-fusion methods are explored for combining the RGB and depth detections. The framework was demonstrated and evaluated on two datasets of passion fruit images under variable illumination and occlusion. Compared with a faster R-CNN detector on the same RGB-D images, the recall, precision and F1-score of the MS-FRCNN method increased from 0.922 to 0.962, 0.850 to 0.931 and 0.885 to 0.946, respectively. Furthermore, the MS-FRCNN method effectively improved small passion fruit detection, achieving an F1-score of 0.909. It is concluded that an MS-FRCNN-based detector can be applied practically in real orchard environments.
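The core idea of the abstract, pooling the same ROI from a shallower and a deeper convolution feature map and concatenating the results, can be illustrated with a toy sketch. This is not the authors' implementation: the feature-map strides, pooling size and NumPy max-pooling below are illustrative assumptions, standing in for the ROI pooling layer inside a Faster R-CNN backbone.

```python
import numpy as np

def roi_pool(fmap, roi, out=2):
    """Max-pool a region of a (C, H, W) feature map to a fixed (C, out, out) grid."""
    x1, y1, x2, y2 = roi
    region = fmap[:, y1:y2, x1:x2]
    c, h, w = region.shape
    # bin edges that partition the region into out x out cells
    ys = np.linspace(0, h, out + 1).astype(int)
    xs = np.linspace(0, w, out + 1).astype(int)
    pooled = np.empty((c, out, out))
    for i in range(out):
        for j in range(out):
            pooled[:, i, j] = region[:, ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max(axis=(1, 2))
    return pooled

def multiscale_roi_features(shallow, deep, roi_img, strides=(4, 16)):
    """Project an image-space ROI onto a shallow and a deep feature map,
    pool each, and concatenate into one multi-scale descriptor."""
    feats = []
    for fmap, s in ((shallow, strides[0]), (deep, strides[1])):
        x1, y1, x2, y2 = (v // s for v in roi_img)
        # keep the projected ROI at least 2 px wide so pooling bins are non-empty
        roi = (x1, y1, max(x2, x1 + 2), max(y2, y1 + 2))
        feats.append(roi_pool(fmap, roi).ravel())
    return np.concatenate(feats)
```

Keeping features from a shallow map (small stride, fine spatial detail) alongside the deep map is what lets the detector retain enough resolution to localize small fruit; the late fusion of separately trained RGB and depth detectors described in the abstract then operates on the resulting detections, not on these features.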
Acknowledgements
This work was supported by the Science and Technology Planning Project of Guangdong Province (2015A020224038 and 2015A020209148), and the National Natural Science Foundation of China (31600591 and 61772209).
Cite this article
Tu, S., Pang, J., Liu, H. et al. Passion fruit detection and counting based on multiple scale faster R-CNN using RGB-D images. Precision Agric 21, 1072–1091 (2020). https://doi.org/10.1007/s11119-020-09709-3