Real-Time Lexicon-Free Scene Text Retrieval, Pattern Recognition


Real-Time Lexicon-Free Scene Text Retrieval
Pattern Recognition (IF 7.5), Pub Date: 2021-02-01, DOI: 10.1016/j.patcog.2020.107656
Andrés Mafla, Rubèn Tito, Sounak Dey, Lluís Gómez, Marçal Rusiñol, Ernest Valveny, Dimosthenis Karatzas

Abstract: In this work, we address the task of scene text retrieval: given a text query, the system returns all images containing the queried text. The proposed model uses a single-shot CNN architecture that predicts bounding boxes and builds a compact representation of the spotted words. The problem can thus be modeled as a nearest neighbor search of the textual representation of a query over the CNN outputs collected from the entire image database. Our experiments demonstrate that the proposed model outperforms the previous state of the art, while offering a significant increase in processing speed and unmatched expressiveness on samples never seen at training time. Several experiments assessing the generalization capability of the model are conducted on a multilingual dataset, and an application to real-time text spotting in videos is presented.
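
The nearest neighbor formulation in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `embed` function is a toy character-count vector standing in for the compact word representation (the abstract does not specify it), the `database` dictionary stands in for the words a CNN would spot in each image, and the names `build_index` and `retrieve` are hypothetical.

```python
import math

# Assumed alphabet for the toy embedding (not taken from the paper).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"


def embed(word):
    """Toy stand-in for a compact word representation:
    an L2-normalized vector of character counts."""
    counts = [0.0] * len(ALPHABET)
    for ch in word.lower():
        idx = ALPHABET.find(ch)
        if idx >= 0:
            counts[idx] += 1.0
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm > 0 else counts


def cosine(a, b):
    """Cosine similarity of two already-normalized vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_index(database):
    """Collect one (image_id, vector) pair per spotted word.
    In a real system the vectors would come from the detector itself."""
    index = []
    for image_id, words in database.items():
        for word in words:
            index.append((image_id, embed(word)))
    return index


def retrieve(query, index):
    """Rank images by the best similarity between the query
    representation and any word representation found in the image."""
    q = embed(query)
    best = {}
    for image_id, vec in index:
        score = cosine(q, vec)
        if score > best.get(image_id, -1.0):
            best[image_id] = score
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical database: image id -> words spotted in that image.
database = {
    "img_001.jpg": ["hotel", "open"],
    "img_002.jpg": ["coffee", "shop"],
    "img_003.jpg": ["exit"],
}
index = build_index(database)
ranking = retrieve("coffee", index)
print(ranking[0][0])  # image whose spotted word is closest to the query
```

Because the query is embedded directly from its characters, no fixed lexicon is needed at query time. The character-count vector used here cannot distinguish anagrams, so it only illustrates the search step and is not a usable word representation.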


Updated: 2021-02-01
