Semantic scene synthesis: application to assistive systems
The Visual Computer (IF 3.5), Pub Date: 2021-05-03, DOI: 10.1007/s00371-021-02147-w
Chayma Zatout, Slimane Larabi

The aim of this work is to provide semantic scene synthesis from a single depth image, for use in assistive systems that allow visually impaired and blind people to understand their surroundings through the sense of touch. The fact that blind people use touch to recognize objects and rely on hearing in place of sight motivated this work. First, the acquired depth image is segmented, and each segment is classified in the context of assistive systems using a deep learning network. Second, inspired by the Braille system and the Japanese Kanji writing system, the obtained classes are encoded with semantic labels. The scene is then synthesized using these labels and the extracted geometric features. Our system allows more than 17 classes to be recognized solely from the provided illustrative labels; for the remaining objects, their geometric features are transmitted instead. The labels and geometric features are mapped onto a synthesis area to be explored by touch. Experiments are conducted on noisy and incomplete data, including acquired depth images of indoor scenes and public datasets, and the obtained results are reported and discussed.
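The pipeline described in the abstract can be sketched in miniature as follows. This is a hypothetical illustration only, not the authors' implementation: the segmentation is a toy depth quantization, the "classifier" is a depth-threshold stand-in for the paper's deep network, and the label codes and class names (`floor`, `obstacle`) are invented for the example.

```python
# Hypothetical sketch of the described pipeline: segment a depth image,
# classify each segment, encode classes as semantic labels, and map the
# labels plus a geometric feature (the centroid) onto a synthesis area.
# All names and data here are illustrative, not from the paper.
from dataclasses import dataclass

@dataclass
class Segment:
    pixels: list     # (row, col) coordinates belonging to the segment
    depth: float     # quantized depth used as the segment's representative depth

def segment_depth_image(depth_image):
    """Toy segmentation: group pixels by quantized depth value
    (a stand-in for the paper's depth-image segmentation step)."""
    groups = {}
    for r, row in enumerate(depth_image):
        for c, d in enumerate(row):
            groups.setdefault(round(d), []).append((r, c))
    return [Segment(pix, key) for key, pix in groups.items()]

def classify(segment):
    """Stand-in classifier: the paper uses a deep learning network;
    here a depth threshold assigns one of two example classes."""
    return "floor" if segment.depth >= 2 else "obstacle"

# Braille/Kanji-inspired label codes (illustrative, not the paper's coding).
LABEL_CODES = {"floor": "⠋", "obstacle": "⠕"}

def synthesize(depth_image):
    """Map each segment's label code and centroid onto a synthesis
    area intended to be explored by touch."""
    area = []
    for seg in segment_depth_image(depth_image):
        rows = [p[0] for p in seg.pixels]
        cols = [p[1] for p in seg.pixels]
        centroid = (sum(rows) / len(rows), sum(cols) / len(cols))
        area.append((LABEL_CODES[classify(seg)], centroid))
    return area

depth_image = [[2.0, 2.0, 0.5],
               [2.0, 2.0, 0.5]]
print(synthesize(depth_image))
```

The sketch preserves the abstract's two-channel idea: known classes are rendered as compact semantic label codes, while geometric features (here, only the centroid) carry the spatial information onto the touch surface.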




Updated: 2021-05-03