IOS-Net: An Inside-to-outside Supervision Network for Scale Robust Text Detection in the wild, Pattern Recognition



Pattern Recognition (IF 8), Pub Date: 2020-07-01, DOI: 10.1016/j.patcog.2020.107304
Yuanqiang Cai, Weiqiang Wang, Yuting Chen, Qixiang Ye

Abstract: Accurately detecting scene text is challenging because of perspective distortion, scale variance, varied orientations, and uneven illumination. Among these, scale variance has always been a core issue and generally takes two forms: varied sizes and diverse aspect ratios of text regions. In contrast to most existing approaches, which address only one form of scale variance, this paper presents a novel inside-to-outside supervision network (IOS-Net) that tackles both. Specifically, we design a hierarchical supervision module (HSM), which consists of a new inception unit with parallel asymmetric convolutions and a skip-layer fusion structure. Inside the HSM, we introduce hierarchical supervision into the new inception unit to effectively capture text with diverse aspect ratios. Outside the HSM, we apply multi-scale supervision to the stacked HSMs to accurately detect text of various sizes. Moreover, position-sensitive segmentation is used to enhance the representation of difficult text objects and the discrimination of adjacent ones. The proposed method achieves state-of-the-art performance on representative public benchmarks, reaching 86% F-score at 11.5 frames per second (FPS) on the ICDAR 2015 incidental text dataset, 47% F-score at 16.1 FPS on the COCO-Text dataset, and 69% F-score at 11.7 FPS on the ICDAR 2013 video text dataset.
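The intuition behind parallel asymmetric convolutions can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no kernel sizes, weights, or fusion details, so the 1 x 5 and 5 x 1 shapes, the fixed averaging weights (standing in for learned filters), the toy image, and the helper names `conv2d_same` and `asymmetric_branches` are all assumptions made for illustration. The sketch only shows why a wide kernel responds more strongly than a tall one to a long horizontal text line, which is one plausible reading of how parallel branches help with diverse aspect ratios.

```python
# Illustrative only: fixed averaging kernels stand in for learned weights.
# Kernel shapes (1 x 5, 5 x 1) and the toy image are assumptions, not from the paper.


def conv2d_same(image, kernel):
    """Zero-padded 2D cross-correlation that keeps the input size."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    yy, xx = y + i - ph, x + j - pw
                    # Positions outside the image count as zero padding.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def asymmetric_branches(image, k=5):
    """Run a wide (1 x k) and a tall (k x 1) branch on the same input."""
    wide = [[1.0 / k] * k]                # 1 x k: long horizontal receptive field
    tall = [[1.0 / k] for _ in range(k)]  # k x 1: long vertical receptive field
    return {
        "wide": conv2d_same(image, wide),
        "tall": conv2d_same(image, tall),
    }


# A 7 x 9 image containing one horizontal "text line" on row 3.
image = [[0.0] * 9 for _ in range(7)]
for x in range(1, 8):
    image[3][x] = 1.0

responses = asymmetric_branches(image, k=5)
center_wide = responses["wide"][3][4]  # wide kernel covers five text pixels
center_tall = responses["tall"][3][4]  # tall kernel covers one text pixel
print(center_wide, center_tall)
```

In a trained network the branch outputs would be fused and supervised rather than compared by hand; the comparison here only makes the difference in receptive-field shape visible.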


Updated: 2020-07-01
