Semantic scene synthesis: application to assistive systems

Original article · The Visual Computer

Abstract

The aim of this work is to provide semantic scene synthesis from a single depth image. It is intended for assistive systems that allow visually impaired and blind people to understand their surroundings through the sense of touch. The work is motivated by the fact that blind people use touch to recognize objects and rely on hearing to replace sight. First, the acquired depth image is segmented and each segment is classified, in the context of assistive systems, using a deep learning network. Second, inspired by the Braille system and the Japanese writing system Kanji, the obtained classes are coded with semantic labels. The scene is then synthesized using these labels and the extracted geometric features. Our system is able to predict more than 17 classes, which are conveyed solely through the provided illustrative labels; for the remaining objects, their geometric features are transmitted. The labels and geometric features are mapped onto a synthesis area to be perceived by touch. Experiments are conducted on noisy and incomplete data, including acquired depth images of indoor scenes and public datasets, and the obtained results are reported and discussed.
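The pipeline sketched in the abstract (segment the depth image, classify each segment, code the classes as tactile labels, and map labels plus geometric features onto a synthesis area) can be outlined as below. This is a minimal illustrative sketch only: the function names, the toy depth-layer segmentation, the placeholder classifier, and the numeric label codes are assumptions made for illustration and do not reproduce the paper's actual network or its Braille/Kanji-inspired label set.

```python
# Hypothetical outline of the described pipeline; all names and heuristics
# here are stand-ins, not the authors' implementation.
import numpy as np

def segment_depth(depth):
    """Toy segmentation: bucket depth values into three coarse layers.
    Stands in for the paper's depth-image segmentation step."""
    bins = np.quantile(depth, [0.33, 0.66])
    layers = np.digitize(depth, bins=bins)
    return [depth * (layers == k) for k in range(3)]

def classify_segment(segment):
    """Placeholder classifier: in the paper this is a deep network trained
    on classes relevant to assistive systems."""
    area = np.count_nonzero(segment)
    return "floor" if area > segment.size // 2 else "object"

# Hypothetical label codes, in the spirit of Braille/Kanji-like tactile symbols.
SEMANTIC_LABELS = {"floor": 0x01, "object": 0x02}

def synthesize_scene(depth, canvas_shape=(64, 64)):
    """Map each classified segment's label and centroid onto a synthesis area."""
    canvas = np.zeros(canvas_shape, dtype=np.uint8)
    for seg in segment_depth(depth):
        ys, xs = np.nonzero(seg)
        if ys.size == 0:
            continue
        label = SEMANTIC_LABELS[classify_segment(seg)]
        cy = int(ys.mean() * canvas_shape[0] / depth.shape[0])
        cx = int(xs.mean() * canvas_shape[1] / depth.shape[1])
        canvas[cy, cx] = label  # geometric feature kept here: segment centroid
    return canvas

if __name__ == "__main__":
    fake_depth = np.random.rand(120, 160)  # stand-in for an acquired depth image
    print(synthesize_scene(fake_depth).sum())
```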

Funding

This research was not funded.

Author information

Corresponding author

Correspondence to Slimane Larabi.

Ethics declarations

Conflict of interest

The authors, Chayma Zatout and Slimane Larabi, declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Zatout, C., Larabi, S. Semantic scene synthesis: application to assistive systems. Vis Comput 38, 2691–2705 (2022). https://doi.org/10.1007/s00371-021-02147-w

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00371-021-02147-w

Keywords

Navigation