Elsevier

Biosystems Engineering

Volume 193, May 2020, Pages 206-215
Research Paper
A method of green citrus detection based on a deep bounding box regression forest

https://doi.org/10.1016/j.biosystemseng.2020.03.001

Highlights

  • Citrus detection and bounding box regression were achieved both in a model.

  • Probabilistic class label of the image patches was designed for model training.

  • Multi-scale features of colour, shape and texture were fused to describe citrus.

  • The features and the output of each layer were concatenated to train a deep model.

The visual recognition of green fruits in natural environments is always a difficult problem for agricultural robots due to the colour similarity between fruits and background. This study proposed a green fruit detection method named deep bounding box regression forest (DBBRF) for detecting green citrus in natural environments. First, probabilistic class labels were designed for image patches used as training samples for the forest model. Then, objective functions were used alternately while constructing the single-layer regression forest to reduce the class uncertainty and bounding box uncertainty of the object. Furthermore, the output of the model, taken as the new features, was concatenated with the input features to train multiple regression forests. With regard to the feature extraction, a multi-scale fusion feature was designed to describe the citrus in different scales with three aspects of features, including shape, texture, and colour. By testing 800 randomly selected citrus images, the experimental results showed that the average execution time of the method was 0.759 s and that the mAP was 87.6%. This study provides technical support for green fruit detection in natural environments.

Introduction

The tasty and nutritious citrus is an important economic crop. Yield estimation and fruit picking are important for the citrus industry. Manual work with high labour costs and time consumption has forced the agriculture industry to utilise automated robots. Accurate and efficient visual detection of fruit objects is one of the key requirements for agricultural harvest robots. In recent years, dozens of studies have focused on fruit detection using machine vision.

Because many fruits have distinctive colours, segmenting fruit from images using colour features is a commonly used and easily implemented fruit recognition method. Some methods (Arefi and Motlagh, 2013; Goel and Sehgal, 2015; Luo et al., 2016) used colour information to classify each pixel of a fruit image. These methods were easy to implement but difficult to adopt for green citrus detection because of the colour similarity between leaves and fruits.

Because many fruits are roughly spherical, such as apples, pomegranates and citrus, many fruit recognition studies used Hough transform circle detection, for example He et al. (2017), Lu and Hu (2017) and Nasiri, Glozarian, and Akhani (2015). The advantage of the Hough transform is its ability to integrate information from a large number of parts (Leibe, Leonardis, & Bernt, 2008); its disadvantages are weak robustness to deformation and the need for a large parameter space and heavy computation.
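To make the voting idea concrete, the following is a minimal sketch of Hough circle detection for a single, known radius. It is an illustration only and is not taken from any of the cited studies; the function names, the fixed radius and the synthetic test circle are all assumptions. Each edge point votes for every candidate centre lying one radius away from it, and the accumulator cell with the most votes is taken as the circle centre.

```python
import math
from collections import Counter


def hough_circle_votes(edge_points, radius, n_angles=72):
    """Accumulate centre votes for circles of one fixed radius.

    edge_points: iterable of integer (x, y) edge pixels.
    Returns a Counter mapping candidate centre (cx, cy) -> vote count.
    """
    votes = Counter()
    for x, y in edge_points:
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            # A centre must lie exactly `radius` away from the edge point.
            cx = round(x - radius * math.cos(theta))
            cy = round(y - radius * math.sin(theta))
            votes[(cx, cy)] += 1
    return votes


def synthetic_circle(cx, cy, radius, n_points=72):
    """Hypothetical helper: integer edge pixels sampled on a circle."""
    points = []
    for j in range(n_points):
        phi = 2.0 * math.pi * j / n_points
        points.append((round(cx + radius * math.cos(phi)),
                       round(cy + radius * math.sin(phi))))
    return points


# Usage: recover the centre of a synthetic circle.
edges = synthetic_circle(50, 40, 20)
best_centre, best_count = hough_circle_votes(edges, 20).most_common(1)[0]
print(best_centre, best_count)
```

Searching over unknown radii adds a third accumulator dimension, which is one way to see the large parameter space mentioned above.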

Additionally, classification methods based on image patches are a very popular solution for fruit detection. This category of methods has the distinct advantage of flexibly extracting abundant and robust features using different feature descriptors. For example, Zhao, Lee, and He (2016) designed a method named sum of absolute transformed difference (SATD) that calculated the absolute difference between some pixels in an image patch for citrus recognition. Kurtulmus, Lee, and Vardar (2011) developed an algorithm using circular Gabor texture and ‘eigenfruit’ similarity for locating green citrus. Linker (2018) used image patches to extract SURF features for green apple detection. However, these methods processed each image patch separately, so they could not integrate information from each part as the Hough transform does.

The Hough transform and patch classification methods each have their own advantages and disadvantages. Therefore, a question explored in this paper is whether a method can combine the advantages and compensate for the disadvantages of these two approaches to achieve better performance. The Hough Forest (Gall & Lempitsky, 2009) is a typical method that elegantly combines the two. In this study, an improved Hough Forest method named deep bounding box regression forest (DBBRF) was proposed for green citrus detection with better performance.

The main contributions of DBBRF are summarised as follows:

  • (1) The ability to combine the Hough transform and image patch classification. DBBRF can integrate information from each part, as the Hough transform does, and can extract rich features using image patches.

  • (2) A model with better performance. Compared with the Hough forest, the random forest is replaced by the deep forest to help construct an improved model.

  • (3) Classification and location in a single model. Compared with the Hough forest, DBBRF adds bounding box regression to the model, greatly enhancing the adaptability to variations of object size and aspect ratio.

  • (4) Probabilistic class labels designed for training samples. This can improve the performance of the model when samples with approximate Intersection-over-Union (IOU) (Everingham et al., 2006) are allotted different class labels. At the same time, scale jittering was used for data augmentation to improve the generalisation ability of the model with scale.

  • (5) Multi-scale and multi-class feature extraction. Three types of features, including colour, shape and texture, were extracted. Additionally, an easy-to-implement multi-scale feature extraction method was designed.
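The motivation for probabilistic class labels in contribution (4) can be illustrated with a short sketch. The IOU computation below is the standard definition; the mapping from IOU to a class probability is a hypothetical piecewise-linear ramp chosen only for illustration, and the thresholds `low` and `high` are assumptions, since the snippets here do not give the mapping actually used in the paper. With a hard threshold, two patches with nearly equal IOU on either side of the cut-off would receive opposite labels; a soft label gives them nearly equal targets.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def soft_label(iou_value, low=0.3, high=0.7):
    """Hypothetical mapping from IOU to P(citrus): 0 below `low`,
    1 above `high`, and a linear ramp in between (assumed, not from the paper)."""
    if iou_value <= low:
        return 0.0
    if iou_value >= high:
        return 1.0
    return (iou_value - low) / (high - low)


# Usage: two patches with similar overlap receive similar soft labels.
truth = (0, 0, 10, 10)
for patch in [(4, 0, 14, 10), (5, 0, 15, 10)]:
    overlap = iou(patch, truth)
    print(patch, round(overlap, 3), round(soft_label(overlap), 3))
```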

Section snippets

Related work

The Hough forest is a method that performs the classification and location simultaneously. It can be regarded as a combination of the random forest and the Hough transform. However, it is necessary to assume that the aspect ratio (ratio of width to height) of the object is constant when using the Hough forest. This assumption greatly limits the application scenarios of the Hough forest.
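As a rough illustration of how a Hough forest localises an object, the sketch below accumulates weighted centre votes from image patches. It is a simplified stand-in and not the implementation of Gall and Lempitsky or of this paper: the tuple layout, the offset convention (displacement from patch centre to object centre) and the example numbers are assumptions. Each patch is assumed to have reached a leaf that stores an object probability and a list of centre offsets seen during training.

```python
from collections import defaultdict


def accumulate_centre_votes(patches):
    """Sum weighted votes for the object centre.

    patches: iterable of (patch_x, patch_y, p_object, offsets), where
    `offsets` is the list of (dx, dy) displacements stored at the leaf
    the patch fell into (hypothetical layout for this sketch).
    """
    accumulator = defaultdict(float)
    for patch_x, patch_y, p_object, offsets in patches:
        if not offsets:
            continue  # background-only leaves cast no votes
        weight = p_object / len(offsets)  # spread the leaf's confidence evenly
        for dx, dy in offsets:
            accumulator[(patch_x + dx, patch_y + dy)] += weight
    return accumulator


# Usage: three patches, two of which agree on the centre (30, 30).
example_patches = [
    (20, 30, 0.9, [(10, 0), (12, 1)]),
    (30, 40, 0.8, [(0, -10)]),
    (50, 50, 0.1, [(5, 5)]),
]
votes = accumulate_centre_votes(example_patches)
print(max(votes, key=votes.get))
```

Because the votes carry only a centre position, the extent of the object has to be inferred from a fixed template, which is where the constant aspect ratio assumption enters.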

In terms of location problems, bounding box regression (Felzenszwalb, McAllester & Ramanan, 2008) is a method

Materials

The vision system in this study consisted of a CCD camera and a tripod, as shown in Fig. 1. The camera model was MV-VDM200SM/SC with a field of view of 43.60°. The citrus varieties included the Emperor Citrus and Tangerine. The diameters of the citrus fruits were between 40 mm and 90 mm. The image acquisition dates were July 15, 2017 and July 16, 2017, and the weather was sunny. The shooting location was Zengcheng Orchard, Guangdong, China. During image acquisition, the average shooting distance

Deep bounding box regression forest

DBBRF is a model for classification and location similar to the Hough forest. Although the Hough forest can achieve classification and location at the same time, it can only predict the centres of citrus objects and has poor adaptability to the changes of object size and aspect ratio. To solve this problem, DBBRF adds the prediction of the bounding box of objects based on bounding box regression. Meanwhile, instead of the random forest used by the Hough forest, this paper employed the deep

Feature extraction of citrus

A citrus image was described by the features of shape, texture and colour in this study. The detection image was divided into multiple cells, and different types of features were extracted by setting different cell sizes. This operation produced features at different image granularities and reduced the correlation between different types of features.
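The cell-based, multi-scale idea can be sketched as follows. This is an assumption-laden illustration rather than the paper's feature extractor: the per-cell mean intensity is used as a stand-in for the real shape, texture and colour descriptors, and the function names and cell sizes are hypothetical. The image is tiled into non-overlapping square cells, one value is computed per cell, and the results for several cell sizes are concatenated into a single feature vector.

```python
def cell_means(image, cell):
    """Mean intensity of each non-overlapping cell x cell block.

    image: 2-D list of numbers (rows of equal length).
    The mean is a placeholder for a real per-cell descriptor.
    """
    height, width = len(image), len(image[0])
    features = []
    for top in range(0, height - cell + 1, cell):
        for left in range(0, width - cell + 1, cell):
            total = sum(image[r][c]
                        for r in range(top, top + cell)
                        for c in range(left, left + cell))
            features.append(total / (cell * cell))
    return features


def multi_scale_features(image, cell_sizes=(2, 4)):
    """Concatenate per-cell features computed at several cell sizes."""
    features = []
    for cell in cell_sizes:
        features.extend(cell_means(image, cell))
    return features


# Usage: a 4 x 4 image gives four fine cells plus one coarse cell.
image = [[row * 4 + col for col in range(4)] for row in range(4)]
print(multi_scale_features(image))
```

Small cells capture local detail while large cells summarise coarser structure, so using a different cell size per feature type yields descriptors at different granularities.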

The shape features were extracted by the histogram of oriented gradients (HOG). Some hyperparameters were preset referring to the

Experiments

The algorithm in this study was implemented in MATLAB 2015b (https://uk.mathworks.com/products/matlab.html) on a computer with a 3.40-GHz CPU, 8 GB of memory, and the 64-bit Windows 7 operating system. The experimental images used in the training model were obtained by cutting the image patches from 200 randomly selected images, including 100 Emperor Citrus images and 100 Tangerine images. A total of 9000 image patches were cropped, including 4000 positive samples and 5000 negative samples. The

Conclusion

For green citrus detection in natural environments, this study proposed a deep bounding box regression forest model. With 800 tested images, the achieved mAP was 87.6%, and the average running time for a single image was 0.759 s. The proposed method can perform visual detection of green citrus and provides technical support for detection of similar fruits, such as apples, kiwi fruit, pears and star fruit. There are three novelties in this paper. (1) In contrast to the Hough forest,

Author contributions

J.T. Xiong contributed to the design of data acquisition, all experiments, results interpretation and manuscript preparation. Z.L. He contributed to the implementation of the experiments, data acquisition, experimental work, data analysis, and manuscript writing. Z.G. Yang, Z. Zhong, S.M. Chen, S.F. Chen and Z.X. Li contributed to the writing of the manuscript.

Declaration of Competing Interest

The authors declare no conflict of interest.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (No. 31201135), the Natural Science Foundation of Guangdong (No. 2018A030313330), the Science and Technology Plan Project of Guangzhou (201802020032), and the Special Funds for the Cultivation of Guangdong College Students' Scientific and Technological Innovation (“Climbing Program” Special Funds) (pdjh2020a0082). The authors wish to thank the anonymous reviewers for their useful comments on this paper. The authors also appreciate

References (27)

  • P. Felzenszwalb et al. A discriminatively trained, multiscale, deformable part model.

  • J. Gall et al. Class-specific Hough forests for object detection.

  • P. Geurts et al. Extremely randomized trees. Machine Learning (2006).