Research Paper
A method of green citrus detection based on a deep bounding box regression forest
Introduction
The tasty and nutritious citrus is an important economic crop. Yield estimation and fruit picking are important for the citrus industry. Manual work with high labour costs and time consumption has forced the agriculture industry to utilise automated robots. Accurate and efficient visual detection of fruit objects is one of the key requirements for agricultural harvest robots. In recent years, dozens of studies have focused on fruit detection using machine vision.
Because many fruits in nature have obvious colour characteristics, segmenting fruit from images using colour features is a commonly used and easily implemented fruit recognition method. Some methods (Arefi and Motlagh, 2013; Goel and Sehgal, 2015; Luo et al., 2016) utilised colour information to classify each pixel of fruit images. These methods were easy to implement but difficult to adopt for green citrus detection because of the colour similarity between leaves and fruits.
Because many fruits in nature are approximately spherical, such as apples, pomegranates and citrus, many fruit recognition studies used Hough transform circle detection, for example He et al. (2017), Lu and Hu (2017) and Nasiri, Glozarian, and Akhani (2015). The advantage of the Hough transform is its ability to integrate information from a large number of parts (Leibe, Leonardis, & Bernt, 2008), but its disadvantages are weak robustness to deformation and the need for a huge parameter space and heavy computation.
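To make the parameter-space cost concrete, the following is a minimal pure-Python sketch of circular Hough voting. It is an illustration only and is not taken from any of the cited studies: the synthetic edge map, the function names `make_circle_edges` and `hough_circles`, the image size and the radius range are all assumptions chosen for the example.

```python
import math

def make_circle_edges(cx, cy, r, n_angles=72):
    """Synthetic edge pixels on a circle (a stand-in for a real edge detector)."""
    pts = set()
    for k in range(n_angles):
        t = 2.0 * math.pi * k / n_angles
        pts.add((int(round(cx + r * math.cos(t))),
                 int(round(cy + r * math.sin(t)))))
    return sorted(pts)

def hough_circles(edge_points, width, height, radii, n_angles=72):
    """Vote in a sparse 3-D accumulator indexed by (centre_x, centre_y, radius)."""
    acc = {}
    for x, y in edge_points:
        for r in radii:
            seen = set()  # each edge pixel votes at most once per cell
            for k in range(n_angles):
                t = 2.0 * math.pi * k / n_angles
                cx = int(round(x - r * math.cos(t)))
                cy = int(round(y - r * math.sin(t)))
                if 0 <= cx < width and 0 <= cy < height and (cx, cy) not in seen:
                    seen.add((cx, cy))
                    acc[(cx, cy, r)] = acc.get((cx, cy, r), 0) + 1
    return acc

# Hypothetical 40 x 30 image containing one circle of radius 8 centred at (20, 15).
edges = make_circle_edges(20, 15, 8)
radii = range(5, 12)
acc = hough_circles(edges, width=40, height=30, radii=radii)

print("votes at the true circle:", acc[(20, 15, 8)], "of", len(edges), "edge pixels")
print("dense accumulator cells :", 40 * 30 * len(radii))
```

Every edge pixel contributes a vote to the true (centre, radius) cell, which is how the transform integrates evidence from many parts. The dense accumulator grows with width × height × number of radii, and a deformed (non-circular) fruit would spread its votes over several cells, which illustrates both disadvantages named above.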
Additionally, classification methods based on image patches are a very popular solution for fruit detection. This category of methods has the distinct advantage of flexibly extracting abundant and robust features using different feature descriptors. For example, Zhao, Lee, and He (2016) designed a method named sum of absolute transformed difference (SATD) that calculated the absolute difference between some pixels in an image patch for citrus recognition. Kurtulmus, Lee, and Vardar (2011) developed an algorithm using circular Gabor texture and ‘eigenfruit’ similarity for locating green citrus. Linker (2018) used image patches to extract SURF features for green apple detection. However, these methods processed each image patch separately and could not integrate information from each part as the Hough transform does.
The Hough transform and image patch classification methods each have their own advantages and disadvantages. Therefore, a question explored in this paper is whether a method can combine the advantages and compensate for the disadvantages of these two approaches to achieve better performance. The Hough Forest (Gall & Lempitsky, 2009) is a typical method that elegantly combines the two. In this study, an improved Hough Forest method named deep bounding box regression forest (DBBRF) was proposed for green citrus detection with better performance.
The main contributions of DBBRF are summarised as follows:
- (1) The ability to combine the Hough transform and image patch classification. DBBRF can integrate information from each part, as the Hough transform does, and extract rich features using image patches.
- (2) A model with better performance. Compared with the Hough forest, the random forest is replaced by the deep forest to help construct an improved model.
- (3) Both classification and location achieved in one model. Compared with the Hough forest, DBBRF adds bounding box regression to the model, greatly enhancing the adaptability to variations in object size and aspect ratio.
- (4) Probabilistic class labels designed for training samples. This can improve the performance of the model when samples with similar Intersection-over-Union (IOU) (Everingham et al., 2006) are allotted different class labels. At the same time, scale jittering was used for data augmentation to improve the generalisation ability of the model across scales.
- (5) Multi-scale and multi-class feature extraction. Three types of features, including colour, shape and texture, were extracted. Additionally, an easy-to-implement multi-scale feature extraction method was designed.
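Contribution (4) can be illustrated with a short sketch. The IOU computation below is standard; the piecewise-linear mapping in `soft_label`, its thresholds `low` and `high`, and the helper `jitter_scale` are hypothetical choices made for this example and are not the label design or augmentation settings reported in the paper.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def hard_label(iou_value, threshold=0.5):
    """Conventional binary label: samples on either side of the threshold differ by 1."""
    return 1.0 if iou_value >= threshold else 0.0

def soft_label(iou_value, low=0.3, high=0.7):
    """Hypothetical probabilistic label: linear ramp from 0 at `low` to 1 at `high`."""
    if iou_value <= low:
        return 0.0
    if iou_value >= high:
        return 1.0
    return (iou_value - low) / (high - low)

def jitter_scale(box, scale):
    """Scale a box about its centre (a simple form of scale jittering)."""
    cx, cy = (box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0
    half_w = (box[2] - box[0]) * scale / 2.0
    half_h = (box[3] - box[1]) * scale / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)

# Two samples with nearly identical overlap receive opposite hard labels
# but almost the same soft label.
for value in (0.49, 0.51):
    print(value, hard_label(value), round(soft_label(value), 3))
```

With a hard threshold, two patches whose IOU differs only slightly can end up in opposite classes; a probabilistic label removes that discontinuity, which is the situation the contribution describes.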
Section snippets
Related work
The Hough forest is a method that performs classification and location simultaneously. It can be regarded as a combination of the random forest and the Hough transform. However, when using the Hough forest it is necessary to assume that the aspect ratio (ratio of width to height) of the object is constant. This assumption greatly limits the application scenarios of the Hough forest.
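The voting step and the fixed-shape limitation can be sketched as follows. This is a deliberately simplified illustration, not the Hough forest of Gall and Lempitsky (2009): the trees are omitted, each patch is assumed to have already been assigned a single centre offset and an object probability, and all names and numbers are hypothetical.

```python
def vote_centres(patch_votes, width, height):
    """Accumulate weighted votes for the object centre.

    Each vote is (patch_x, patch_y, offset_x, offset_y, weight), where the offset
    and weight stand in for what a forest leaf would supply for that patch.
    """
    acc = [[0.0] * width for _ in range(height)]
    for px, py, dx, dy, weight in patch_votes:
        cx, cy = px + dx, py + dy
        if 0 <= cx < width and 0 <= cy < height:
            acc[cy][cx] += weight
    return acc

def peak(acc):
    """Return (x, y, score) of the strongest accumulator cell."""
    score, x, y = max((v, x, y) for y, row in enumerate(acc) for x, v in enumerate(row))
    return x, y, score

def fixed_box(cx, cy, box_w, box_h):
    """A box of preset size around the centre: its shape cannot adapt to the object."""
    return (cx - box_w // 2, cy - box_h // 2, cx + box_w // 2, cy + box_h // 2)

# Three patches agree on a centre at (15, 15); one weak patch votes elsewhere.
votes = [(10, 10, 5, 5, 0.9), (20, 10, -5, 5, 0.8), (15, 20, 0, -5, 0.7), (3, 3, 1, 1, 0.2)]
acc = vote_centres(votes, width=32, height=32)
cx, cy, score = peak(acc)
print("centre:", (cx, cy), "score:", round(score, 2))
print("box   :", fixed_box(cx, cy, box_w=10, box_h=10))
```

Because only the centre is voted for, the box dimensions must be supplied from outside the voting step, which is why a constant aspect ratio has to be assumed.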
In terms of location problems, bounding box regression (Felzenszwalb, McAllester & Ramanan, 2008) is a method
Materials
The vision system in this study consisted of a CCD camera and a tripod, as shown in Fig. 1. The camera model was MV-VDM200SM/SC with a field of view of 43.60°. The citrus varieties included the Emperor Citrus and Tangerine. The diameters of the citrus fruits were between 40 mm and 90 mm. The image acquisition dates were July 15, 2017 and July 16, 2017, and the weather was sunny. The shooting location was Zengcheng Orchard, Guangdong, China. During image acquisition, the average shooting distance
Deep bounding box regression forest
DBBRF is a model for classification and location similar to the Hough forest. Although the Hough forest can achieve classification and location at the same time, it can only predict the centres of citrus objects and has poor adaptability to the changes of object size and aspect ratio. To solve this problem, DBBRF adds the prediction of the bounding box of objects based on bounding box regression. Meanwhile, instead of the random forest used by the Hough forest, this paper employed the deep
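As an illustration of how a regressor can predict a box rather than only a centre, the sketch below uses the centre-and-log-size parameterisation that is widely used for bounding box regression in the detection literature. Whether DBBRF uses exactly this encoding is not stated in the text above, so the encoding and the function names `encode` and `decode` should be read as assumptions.

```python
import math

def encode(reference, target):
    """Regression targets that map a reference box onto a target box.

    Boxes are (centre_x, centre_y, width, height). Centre shifts are expressed
    relative to the reference size, and sizes as log ratios.
    """
    rx, ry, rw, rh = reference
    gx, gy, gw, gh = target
    return ((gx - rx) / rw, (gy - ry) / rh, math.log(gw / rw), math.log(gh / rh))

def decode(reference, deltas):
    """Apply predicted deltas to a reference box to obtain the predicted box."""
    rx, ry, rw, rh = reference
    tx, ty, tw, th = deltas
    return (rx + tx * rw, ry + ty * rh, rw * math.exp(tw), rh * math.exp(th))

# A square 40 x 40 reference patch and a wider, flatter 60 x 30 fruit box.
reference = (50.0, 50.0, 40.0, 40.0)
target = (54.0, 48.0, 60.0, 30.0)

deltas = encode(reference, target)
recovered = decode(reference, deltas)
print("deltas   :", [round(d, 3) for d in deltas])
print("recovered:", [round(v, 3) for v in recovered])
```

Because width and height are regressed independently, the predicted box can change aspect ratio, which a centre-only vote cannot express.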
Feature extraction of citrus
A citrus image was described by the features of shape, texture and colour in this study. The detection image was divided into multiple cells, and different types of features were extracted by setting different sizes of cells. This operation obtained features of different image granularities, as well as reduced correlations between different types of features.
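The cell-based scheme can be sketched in a few lines. The example below is an assumption-laden illustration rather than the paper's feature pipeline: it uses a grey-level image stored as nested lists, a mean-intensity feature as a stand-in for colour, an unnormalised gradient-orientation histogram as a stand-in for HOG, and arbitrarily chosen cell sizes of 8 and 4 pixels.

```python
import math

def cell_grid(height, width, cell):
    """Yield the top-left (y, x) corner of each non-overlapping cell that fits."""
    for y0 in range(0, height - cell + 1, cell):
        for x0 in range(0, width - cell + 1, cell):
            yield y0, x0

def cell_means(img, cell):
    """Coarse intensity feature: one mean value per cell."""
    h, w = len(img), len(img[0])
    feats = []
    for y0, x0 in cell_grid(h, w, cell):
        total = sum(img[y][x] for y in range(y0, y0 + cell) for x in range(x0, x0 + cell))
        feats.append(total / (cell * cell))
    return feats

def cell_orientation_hist(img, cell, bins=9):
    """Shape feature: magnitude-weighted histogram of unsigned gradient angles per cell."""
    h, w = len(img), len(img[0])
    feats = []
    for y0, x0 in cell_grid(h, w, cell):
        hist = [0.0] * bins
        for y in range(y0, y0 + cell):
            for x in range(x0, x0 + cell):
                gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
                gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
                angle = math.degrees(math.atan2(gy, gx)) % 180.0
                index = min(int(angle / (180.0 / bins)), bins - 1)
                hist[index] += math.hypot(gx, gy)
        feats.extend(hist)
    return feats

# Hypothetical 16 x 16 test image with a vertical edge between columns 7 and 8.
img = [[0 if x < 8 else 1 for x in range(16)] for y in range(16)]

coarse = cell_means(img, cell=8)           # 4 cells, 1 value each
fine = cell_orientation_hist(img, cell=4)  # 16 cells, 9 bins each
features = coarse + fine
print(len(coarse), len(fine), len(features))
```

Using a different cell size for each feature type yields descriptors at different image granularities from the same detection window, in the spirit of the operation described above.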
The shape features were extracted by the histogram of oriented gradients (HOG). Some hyperparameters were preset referring to the
Experiments
The algorithm in this study was implemented in MATLAB 2015b (https://uk.mathworks.com/products/matlab.html) on a computer with a 3.40-GHz CPU, 8 GB of memory, and the 64-bit Windows 7 operating system. The image patches used to train the model were cropped from 200 randomly selected images, including 100 Emperor Citrus images and 100 Tangerine images. A total of 9000 image patches were cropped, including 4000 positive samples and 5000 negative samples. The
Conclusion
In view of green citrus detection in natural environments, this study proposed a deep bounding box regression forest model. With 800 tested images, the achieved mAP was 87.6%, and the average running time for a single image was 0.759 s. The proposed method can perform visual detection of green citrus and provides technical support for detection of similar fruits, such as apples, kiwi fruit, pears, star fruit, etc. There are three novelties in this paper. (1) In contrast to the Hough forest,
Author contributions
J.T. Xiong contributed to the design of data acquisition, all experiments, results interpretation and manuscript preparation. Z.L. He contributed to the implementation of the experiments, data acquisition, experimental work, data analysis, and manuscript writing. Z.G. Yang, Z. Zhong, S.M. Chen, S.F. Chen and Z.X. Li contributed to the writing of the manuscript.
Declaration of Competing Interest
The authors declare no conflict of interest.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (No. 31201135), the Natural Science Foundation of Guangdong (No. 2018A030313330), the Science and Technology Plan Project of Guangzhou (201802020032), and the Special Funds for the Cultivation of Guangdong College Students' Scientific and Technological Innovation (“Climbing Program” Special Funds) (pdjh2020a0082). The authors thank the anonymous reviewers for their useful comments on this paper. The authors also appreciate
References (27)
- et al. Fuzzy classification of pre-harvest tomatoes for ripeness estimation–An approach based on automatic rule learning using decision tree. Applied Soft Computing (2015)
- et al. A method of green litchi recognition in natural environment based on improved LDA classifier. Computers and Electronics in Agriculture (2017)
- et al. Green citrus detection using ‘eigenfruit’, color and circular Gabor texture features under natural outdoor conditions. Computers and Electronics in Agriculture (2011)
- Machine learning based analysis of night-time images for yield prediction in apple orchard. Biosystems Engineering (2018)
- et al. Immature green citrus detection based on colour feature and sum of absolute transformed difference (SATD) using colour images in the citrus grove. Computers and Electronics in Agriculture (2016)
- et al. Development of an expert system based on wavelet transform and artificial neural networks for the ripe tomato harvesting robot. Australian Journal of Crop Science (2013)
- Random forests. Machine Learning (2001)
- et al. Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning. Foundations and Trends® in Computer Graphics and Vision (2012)
- et al. Histograms of oriented gradients for human detection
- et al. The 2005 pascal visual object classes challenge
- A discriminatively trained, multiscale, deformable part model
- Class-specific Hough forests for object detection
- Extremely randomized trees. Machine Learning