Deep Learning Based Application for Indoor Scene Recognition

Abstract

Recognizing indoor scenes and objects and estimating their poses has a wide range of applications in robotics. The task is especially challenging in cluttered environments such as indoor scenes. Scaling up convolutional networks is a key factor in improving the accuracy of deep convolutional neural networks. In this paper, we fine-tune EfficientNet models, which rethink model scaling for convolutional networks, to develop a new application for indoor object and scene recognition. The application is especially intended to help blind and visually impaired persons explore new indoor environments and integrate fully into daily life. The proposed indoor object and scene recognition system achieves new state-of-the-art results on the MIT Indoor 67 and Scene 15 datasets, with recognition rates of 95.60% and 97%, respectively.
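
For illustration, a minimal sketch of such a fine-tuning setup is given below, written in TensorFlow/Keras. The framework, the EfficientNet-B0 variant, the input resolution, the classification head, and the training hyperparameters are assumptions made for this example and are not details reported in the paper.

# Hedged sketch: fine-tuning an ImageNet-pretrained EfficientNet-B0 for
# 67-way indoor scene classification (MIT Indoor 67). All values below are
# illustrative assumptions, not settings taken from the paper.
import tensorflow as tf

NUM_CLASSES = 67          # MIT Indoor 67; would be 15 for the Scene 15 dataset
IMAGE_SIZE = (224, 224)   # assumed input resolution

# ImageNet-pretrained backbone without its original 1000-way classifier.
backbone = tf.keras.applications.EfficientNetB0(
    include_top=False, weights="imagenet", input_shape=IMAGE_SIZE + (3,))
backbone.trainable = True  # fine-tune all layers instead of freezing the backbone

# New classification head for the scene-recognition classes.
inputs = tf.keras.Input(shape=IMAGE_SIZE + (3,))
features = backbone(inputs)
pooled = tf.keras.layers.GlobalAveragePooling2D()(features)
pooled = tf.keras.layers.Dropout(0.2)(pooled)
outputs = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(pooled)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="categorical_crossentropy",
    metrics=["accuracy"])

# train_ds and val_ds are assumed tf.data.Dataset objects yielding batches of
# (image, one-hot label) pairs from the indoor scene dataset.
# model.fit(train_ds, validation_data=val_ds, epochs=30)

A small learning rate is used so that fine-tuning adjusts the pretrained weights gradually rather than overwriting them.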


Author information

Corresponding author

Correspondence to Mouna Afif.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Afif, M., Ayachi, R., Said, Y. et al. Deep Learning Based Application for Indoor Scene Recognition. Neural Process Lett 51, 2827–2837 (2020). https://doi.org/10.1007/s11063-020-10231-w

