Abstract
To accurately and efficiently distinguish the stem end and the blossom end of a navel orange from its black spots, we propose a feature skyscraper detector (FSD) with low computational cost, a compact architecture and high detection accuracy. The design of the detector is motivated by the characteristics of these small objects: the stem (blossom) end has a complex appearance, and black spots are densely distributed. We therefore design the feature skyscraper networks (FSN) based on dense connectivity; unlike a regular feature pyramid, the FSN provides more intensive detection on high-level features. We then design the backbone of the FSD with an attention mechanism and dense blocks to supply better features to the FSN, and we incorporate the Swish activation into the architecture to further improve accuracy. In addition, we create a dataset in Pascal VOC format annotated with three types of detection targets: the stem end, the blossom end and the black spot. Experimental results on our orange dataset confirm that the FSD is competitive with state-of-the-art one-stage detectors such as SSD, DSOD, YOLOv2, YOLOv3, RFB and FSSD, achieving 87.479% mAP at 131 FPS with only 5.812M parameters.
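Two ingredients the abstract names, dense connectivity and the Swish activation, can be illustrated with a small self-contained sketch. This is our own toy NumPy code with made-up layer shapes, not the authors' implementation; it only shows the connectivity pattern, not the full FSN.

```python
import numpy as np

def swish(x):
    # Swish activation (Ramachandran et al., 2017): x * sigmoid(x).
    # Smooth and non-monotonic; negative inputs are damped, not zeroed.
    return x * (1.0 / (1.0 + np.exp(-x)))

def dense_block(x, weights):
    """Toy densely connected block (DenseNet-style).

    Each layer receives the channel-wise concatenation of the block
    input and all earlier layer outputs. `weights` is a list of
    (in_dim, growth) matrices, where in_dim grows by `growth` per layer.
    """
    features = [x]
    for w in weights:
        inp = np.concatenate(features, axis=-1)  # dense connectivity
        features.append(swish(inp @ w))
    # The block output concatenates everything, so later stages see
    # every intermediate feature map directly.
    return np.concatenate(features, axis=-1)
```

For a batch of 2 vectors with 4 input channels and a growth rate of 3, the two-layer block above produces 4 + 3 + 3 = 10 output channels, since each layer's output is appended to the running feature list rather than replacing it.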
Notes
This machine is jointly developed by Jiangxi Reemoon Sorting Equipment Co., Ltd. and Institute of Microelectronics of the Chinese Academy of Sciences.
References
Korf, H.J.: Survival of Phyllosticta citricarpa, Anamorph of the Citrus Black Spot Pathogen. University of Pretoria, Pretoria (1998)
Kamilaris, A., Prenafeta-Boldu, F.: Deep learning in agriculture: a survey. Comput. Electron. Agric. 147, 70–90 (2018). https://doi.org/10.1016/j.compag.2018.02.016
Shen, Z., Liu, Z., Li, J., Jiang, Y.-G., Chen, Y., Xue, X.: DSOD: learning deeply supervised object detectors from scratch. In: Proceedings of the IEEE International Conference on Computer Vision 2017, pp. 1919–1927
Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. In: 30th IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI (2017)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. ArXiv e-prints (2017)
Ramachandran, P., Zoph, B., Le, Q.V.: Searching for activation functions. arXiv:1710.05941 (2017)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., Berg, A.C.: SSD: Single shot multibox detector. In: European Conference on Computer Vision 2016, pp. 21–37. Springer
Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. arXiv preprint (2017)
Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv:1804.02767 (2018)
Liu, S., Huang, D.: Receptive field block net for accurate and fast object detection. In: Proceedings of the European Conference on Computer Vision (ECCV) 2018, pp. 385–400
Li, Z., Zhou, F.: FSSD: feature fusion single shot multibox detector. arXiv:1712.00960 (2017)
Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: MobileNets: efficient convolutional neural networks for mobile vision applications. arXiv:1704.04861 (2017)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2018, pp. 4510–4520
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV (2016)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv:1409.1556 (2014)
Behera, S.K., Jena, L., Rath, A.K., Sethy, P.K.: Disease classification and grading of orange using machine learning and fuzzy logic. In: 2018 International Conference on Communication and Signal Processing (ICCSP) 2018, pp. 0678–0682. IEEE (2018)
Rong, D., Ying, Y., Rao, X.: Embedded vision detection of defective orange by fast adaptive lightness correction algorithm. Comput. Electron. Agric. 138, 48–59 (2017)
Zhang, D., Lillywhite, K.D., Lee, D.-J., Tippetts, B.J.: Automated apple stem end and calyx detection using evolution-constructed features. J. Food Eng. 119(3), 411–418 (2013)
Liu, L., Ouyang, W., Wang, X., Fieguth, P., Chen, J., Liu, X., Pietikainen, M.: Deep learning for generic object detection: a survey. arXiv:1809.02165 (2018)
Agarwal, S., Terrail, J.O.D., Jurie, F.: Recent advances in object detection in the age of deep convolutional neural networks. arXiv:1809.03193 (2018)
Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2014, pp. 580–587
Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision 2015, pp. 1440–1448
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems 2015, pp. 91–99
Dai, J., Li, Y., He, K., Sun, J.: R-FCN: object detection via region-based fully convolutional networks. In: Advances in Neural Information Processing Systems 2016, pp. 379–387
Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2016, pp. 779–788
Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., Fischer, I., Wojna, Z., Song, Y., Guadarrama, S.: Speed/accuracy trade-offs for modern convolutional object detectors. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017, pp. 7310–7311
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Commun. ACM 60(6), 84–90 (2017). https://doi.org/10.1145/3065386
Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. ArXiv e-prints (2015)
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. ArXiv e-prints (2015)
Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. ArXiv e-prints (2016)
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 115(3), 211–252 (2015)
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., Belongie, S.: Feature pyramid networks for object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017, pp. 2117–2125
Zhao, Q., Sheng, T., Wang, Y., Tang, Z., Chen, Y., Cai, L., Ling, H.: M2Det: a single-shot object detector based on multi-level feature pyramid network. arXiv:1811.04533 (2018)
Yang, Y., Zhong, Z., Shen, T., Lin, Z.: Convolutional neural networks with alternately updated clique. ArXiv e-prints (2018)
Glorot, X., Bordes, A., Bengio, Y.: Deep sparse rectifier neural networks. In: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics 2011, pp. 315–323
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: surpassing human-level performance on imagenet classification. In: Proceedings of the IEEE International Conference on Computer Vision 2015, pp. 1026–1034
Clevert, D.-A., Unterthiner, T., Hochreiter, S.: Fast and accurate deep network learning by exponential linear units (elus). arXiv:1511.07289 (2015)
Klambauer, G., Unterthiner, T., Mayr, A., Hochreiter, S.: Self-normalizing neural networks. In: Advances in Neural Information Processing Systems 2017, pp. 971–980
Chen, L., Zhang, H., Xiao, J., Nie, L., Shao, J., Liu, W., Chua, T.-S.: Sca-cnn: spatial and channel-wise attention in convolutional networks for image captioning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition 2017, pp. 5659–5667
Wang, F., Jiang, M., Qian, C., Yang, S., Li, C., Zhang, H., Wang, X., Tang, X.: Residual attention network for image classification. In: 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6450–6458 (2017)
Zeng, Y., van der Lubbe, J., Loog, M.: Multi-scale convolutional neural network for pixel-wise reconstruction of Van Gogh's drawings. Mach. Vis. Appl. 30(7–8), 1229–1241 (2019)
Thendral, R., Suhasini, A.: Genetic algorithm based feature selection for detection of surface defects on oranges. J. Sci. Ind. Res. 75(9), 540–546 (2016)
Acknowledgements
This work was supported by National Key R&D Program of China (No. 2018YFD0700300).
Cite this article
Sun, X., Li, G. & Xu, S. FSD: feature skyscraper detector for stem end and blossom end of navel orange. Machine Vision and Applications 32, 11 (2021). https://doi.org/10.1007/s00138-020-01139-5