Bottom-Up Scene Text Detection with Markov Clustering Networks
International Journal of Computer Vision (IF 11.6), Pub Date: 2020-02-10, DOI: 10.1007/s11263-020-01298-y
Zichuan Liu , Guosheng Lin , Wang Ling Goh

A novel detection framework named Markov Clustering Network (MCN) is proposed for fast and robust scene text detection. Unlike traditional top-down scene text detection approaches, which are inherited from classic object detection, MCN detects scene text objects in a bottom-up manner. MCN predicts instance-level bounding boxes by first converting an image into a stochastic flow graph, on which Markov Clustering is performed based on the predicted stochastic flows. The stochastic flows encode the local correlation and semantic information of scene text objects. An object is modeled as a set of nodes strongly connected by flows, which allows flexible, bottom-up detection of scale-varying and rotated text objects without prior knowledge of object size. Flow prediction is supported by advanced Convolutional Neural Network architectures and a position-aware spatial attention mechanism, which enhances flow prediction by adaptively fusing spatial representations. Experimental evaluation on public benchmarks shows that our MCN method achieves state-of-the-art performance, especially in retrieving long and oriented texts.
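To make the clustering step concrete, the following is a minimal sketch of the classic Markov Clustering procedure (alternating expansion and inflation) on a small hand-written graph. It is not the authors' implementation: in MCN the flows are predicted by a neural network from the image, whereas here the flow matrix is built from a fixed adjacency matrix. The function name `markov_cluster`, its parameters, and the argmax-based cluster read-out are assumptions made for illustration.

```python
import numpy as np

def markov_cluster(adjacency, inflation=2.0, max_iters=100, tol=1e-6):
    """Classic Markov Clustering on a small dense graph (illustrative only)."""
    # Add self-loops so every node keeps some flow to itself.
    m = np.asarray(adjacency, dtype=float) + np.eye(len(adjacency))
    # Column-normalise: column j is the outgoing flow distribution of node j.
    m /= m.sum(axis=0, keepdims=True)
    for _ in range(max_iters):
        prev = m
        m = m @ m                          # expansion: flow spreads along paths
        m = m ** inflation                 # inflation: strong flows are boosted
        m /= m.sum(axis=0, keepdims=True)  # renormalise to a stochastic matrix
        if np.abs(m - prev).max() < tol:
            break
    # Simplified read-out (an assumption): each node joins the row that
    # receives most of its flow. Rounding makes near-ties resolve to the
    # lowest index deterministically.
    attractors = np.round(m, 6).argmax(axis=0)
    clusters = {}
    for node, attractor in enumerate(attractors):
        clusters.setdefault(int(attractor), []).append(node)
    return sorted(clusters.values())

# Two triangles with no edge between them: nodes {0, 1, 2} and {3, 4, 5}.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(adj))
```

Nodes that are strongly connected by flow end up sharing an attractor, which mirrors the abstract's description of an object as a group of strongly connected nodes; no cluster count or object size has to be specified in advance.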

Updated: 2020-02-10