Abstract
Recently, deep learning based models have demonstrated superiority in a variety of visual tasks such as object detection and instance segmentation. In practice, however, deploying advanced networks in real-time applications such as autonomous driving remains challenging due to their expensive computational cost and memory footprint. In this paper, to reduce the size of a deep convolutional neural network (CNN) and accelerate its inference, we propose a cross-layer manifold invariance based pruning method, named CLMIP, for network compression, enabling real-time road type recognition in low-cost vision systems. Manifolds are higher-dimensional analogues of curves and surfaces; they can self-organize to reflect the data distribution and characterize the relationships between data points. We therefore seek to preserve the generalization ability of the deep CNN by maintaining the consistency of the data manifolds of each layer in the network, and then remove the parameters that have the least influence on the manifold structure. In this way, CLMIP can also be regarded as a tool for further investigating the dependence of model structure on network optimization and generalization. To the best of our knowledge, this is the first time a deep CNN has been pruned based on the invariance of data manifolds. For the experiments, we used a Python-based keyword crawler to collect 102 first-person-view videos from car cameras, comprising 137,200 images (320 × 240) of four road scenes (urban road, off-road, trunk road and motorway). The classification results demonstrate that CLMIP achieves state-of-the-art performance at 26 FPS on an NVIDIA Jetson Nano.
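The core idea above, scoring parameters by how much their removal perturbs a layer's data-manifold structure, can be sketched in a minimal form. This is an illustrative NumPy sketch, not the paper's actual implementation: it assumes the manifold structure of a layer is summarized by the pairwise-distance matrix of its feature vectors, and the function names (`manifold_structure`, `channel_scores`, `prune_mask`) are hypothetical.

```python
import numpy as np

def manifold_structure(features):
    # Pairwise Euclidean distance matrix over samples: a simple
    # proxy for the geometry of the layer's data manifold.
    sq = np.sum(features ** 2, axis=1, keepdims=True)
    d2 = sq + sq.T - 2.0 * features @ features.T
    return np.sqrt(np.maximum(d2, 0.0))

def channel_scores(features):
    # Score each channel by how much ablating it changes the
    # manifold structure (Frobenius norm of the distance-matrix shift).
    base = manifold_structure(features)
    scores = []
    for c in range(features.shape[1]):
        ablated = features.copy()
        ablated[:, c] = 0.0
        scores.append(np.linalg.norm(base - manifold_structure(ablated)))
    return np.array(scores)

def prune_mask(features, keep_ratio=0.5):
    # Keep the channels whose removal distorts the manifold most;
    # channels with low scores are candidates for pruning.
    scores = channel_scores(features)
    k = max(1, int(round(keep_ratio * len(scores))))
    keep = np.argsort(scores)[::-1][:k]
    mask = np.zeros(len(scores), dtype=bool)
    mask[keep] = True
    return mask
```

In a full pipeline, such a mask would be computed per layer on a held-out batch and applied to the corresponding filters, followed by fine-tuning to recover accuracy; enforcing consistency of the distance matrices across layers is what the cross-layer invariance constraint would add on top of this single-layer view.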
Acknowledgements
This work was supported by National Key R&D Program of China (Grant No. 2018YFC0831503), National Natural Science Foundation of China (Grant No. 61571275), China Computer Program for Education and Scientific Research (Grant No. NGII20161001), Shandong Provincial Natural Science Foundation (Grant No. ZR2014FM010 and No. ZR2014FM030), and Fundamental Research Funds of Shandong University (Grant No. 2018JC040).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Appendix
For readers' convenience in following this paper, all variables are annotated in Table 6.
About this article
Cite this article
Zheng, Q., Tian, X., Yang, M. et al. CLMIP: cross-layer manifold invariance based pruning method of deep convolutional neural network for real-time road type recognition. Multidim Syst Sign Process 32, 239–262 (2021). https://doi.org/10.1007/s11045-020-00736-x