Text Co-Detection in Multi-View Scene
IEEE Transactions on Image Processing (IF 10.6). Pub Date: 2020-02-21. DOI: 10.1109/tip.2020.2973511
Chuan Wang, Huazhu Fu, Liang Yang, Xiaochun Cao

Multi-view scene analysis has been widely explored in computer vision and has numerous practical applications. Text in multi-view scenes is typically detected by applying existing single-image text detection methods to each view independently, which ignores the correspondence constraints across views. Multi-view correspondences carry structural and location information that is absent from any single image, and can help overcome difficulties caused by factors such as occlusion and perspective distortion. In this paper, we address this corresponding text detection task and propose a novel text co-detection method that identifies co-occurring texts among multi-view scene images by jointly performing detection and correspondence under large environmental variations. In our method, visual and geometric correspondences are designed to work together: the visual correspondence seeks texts with high pairwise representation similarity, while simultaneously guiding the exploitation of texts through geometric correspondence. To guarantee pairwise consistency across multiple images, we additionally incorporate a cycle consistency constraint, which ensures that text correspondences remain aligned over the whole image set. Finally, text correspondence is represented by a permutation matrix and solved under positive semidefinite and low-rank constraints. Moreover, we collect a new text co-detection dataset consisting of multi-view image groups captured in the same scene under different photographing conditions. Experiments show that our text co-detection method achieves satisfactory performance and outperforms related state-of-the-art text detection methods.
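The cycle-consistency and low-rank/positive-semidefinite structure mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's algorithm; it only demonstrates the known algebraic fact the method exploits: when pairwise permutation correspondences P_ij are induced by maps A_i from a shared "universe" of texts, they compose consistently along cycles, and the stacked block matrix of all pairwise correspondences is positive semidefinite with rank equal to the universe size. The 3-view setup and all variable names are illustrative assumptions.

```python
import numpy as np

# Illustrative setup: 3 views, n = 4 texts visible in each view.
n = 4
rng = np.random.default_rng(0)

def random_perm(n, rng):
    """Return a random n x n permutation matrix."""
    return np.eye(n)[rng.permutation(n)]

# A_i maps the shared "universe" of texts to the ordering seen in view i.
A1, A2, A3 = (random_perm(n, rng) for _ in range(3))

# Pairwise correspondences induced by the universe maps: P_ij = A_i @ A_j.T
P12 = A1 @ A2.T
P23 = A2 @ A3.T
P13 = A1 @ A3.T

# Cycle consistency: composing view3 -> view2 -> view1 equals view3 -> view1.
print(np.allclose(P12 @ P23, P13))          # consistent cycle

# Stack all pairwise correspondences into one block matrix W.
# Because W = U @ U.T with U = [A1; A2; A3], it is PSD and has rank n.
I = np.eye(n)
W = np.block([
    [I,     P12,   P13],
    [P12.T, I,     P23],
    [P13.T, P23.T, I],
])
print(np.linalg.matrix_rank(W))             # rank equals universe size n
print(np.min(np.linalg.eigvalsh(W)) >= -1e-8)  # positive semidefinite
```

In the paper's setting the ground-truth maps A_i are unknown, so the PSD and low-rank properties of W are imposed as constraints when solving for the permutation matrices, which is what enforces globally consistent correspondences.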

Updated: 2020-04-22