Manifesting construction activity scenes via image captioning, Automation in Construction

Manifesting construction activity scenes via image captioning
Automation in Construction (IF 9.6), Pub Date: 2020-11-01, DOI: 10.1016/j.autcon.2020.103334
Huan Liu, Guangbin Wang, Ting Huang, Ping He, Martin Skitmore, Xiaochun Luo

This study proposes an automated method for manifesting construction activity scenes through image captioning, an approach rooted in computer vision and natural language generation. A linguistic description schema for the scenes is developed first, and two dedicated image captioning datasets are created to validate the method. A general image captioning model architecture is then built by combining an encoder-decoder framework with deep neural networks, followed by three experimental tests covering the selection of model learning strategies and performance evaluation metrics. The results demonstrate that the method's performance is comparable with that of state-of-the-art computer vision methods in general. The paper concludes with a discussion of the feasibility of applying the proposed approach in practice at the current technical level.
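
The abstract mentions performance evaluation metrics without naming them. As an illustration only, the sketch below computes a BLEU-style clipped n-gram precision, a common ingredient of captioning metrics. It is an assumption that a metric of this kind is relevant here; the function names and the example captions are hypothetical and are not taken from the paper or its datasets.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, references, n=1):
    """BLEU-style modified n-gram precision of one candidate caption.

    Each candidate n-gram count is clipped to the largest count of that
    n-gram in any single reference, then divided by the candidate total.
    """
    cand_counts = ngrams(candidate, n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0  # candidate too short to contain any n-gram

    # Largest count of each n-gram across the individual references.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)

    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / total


# Hypothetical captions, invented for illustration (not from the paper).
candidate = "a worker is carrying bricks".split()
references = [
    "a worker is laying bricks on a wall".split(),
    "one worker lays bricks".split(),
]

unigram_p = clipped_precision(candidate, references, n=1)
bigram_p = clipped_precision(candidate, references, n=2)
```

Here four of the five candidate words appear in a reference and two of the four candidate bigrams do, so the unigram and bigram precisions are 0.8 and 0.5. A full BLEU score would additionally combine several n-gram orders and apply a brevity penalty, which this sketch omits.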

Updated: 2020-11-01
