Radical Analysis Network for Learning Hierarchies of Chinese Characters
Pattern Recognition ( IF 7.5 ) Pub Date : 2020-07-01 , DOI: 10.1016/j.patcog.2020.107305
Jianshu Zhang , Jun Du , Lirong Dai

Abstract Chinese characters have a valuable property: many of them are composed of a compact set of fundamental and structural radicals. This paper introduces a radical analysis network (RAN) that makes full use of this property to implement radical-based Chinese character recognition. The proposed RAN employs an attention mechanism to extract radicals from Chinese characters and to detect the spatial structures among the radicals. The decoder in RAN then generates a hierarchical composition of each Chinese character based on the extracted radicals and their internal structures. Treating a Chinese character as a composition of radicals rather than as a single character category is a human-like approach that can reduce the size of the vocabulary, ignore redundant information among similar characters and enable the system to recognize unseen Chinese character categories, i.e., zero-shot learning. Through experiments, we assess the practicality of RAN for recognizing Chinese characters in natural scenes. Furthermore, we extend RAN into a framework for scene text recognition by adding a dense recurrent neural network (denseRNN) encoder, a multihead coverage attention model and HSV representations. The proposed approach achieved the best performance in the ICPR MTWI 2018 competition.
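
The idea of a hierarchical composition can be illustrated with a small sketch. It is not the authors' implementation or notation: it assumes the Unicode ideographic description characters (⿰ for left-right, ⿱ for top-bottom) as structure symbols, and the names `ARITY`, `Node`, `parse`, `depth` and the tiny `TRAIN` table are hypothetical. The sketch parses a prefix sequence of structures and radicals into a tree, then checks that a character absent from the training table can still be spelled with symbols seen in training, which is the property that makes zero-shot recognition possible.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Assumed structure symbols and how many components each one takes.
# Only two Unicode ideographic description characters are covered here.
ARITY = {"⿰": 2, "⿱": 2}  # left-right, top-bottom


@dataclass
class Node:
    """One spatial structure applied to its components."""
    structure: str
    parts: List[Union["Node", str]]


Tree = Union[Node, str]  # a leaf is a single radical


def _parse(tokens: List[str]) -> Tuple[Tree, List[str]]:
    """Consume one subtree from a prefix-ordered token list."""
    if not tokens:
        raise ValueError("sequence ended before the composition was complete")
    head, rest = tokens[0], tokens[1:]
    if head not in ARITY:
        return head, rest  # plain radical
    parts: List[Tree] = []
    for _ in range(ARITY[head]):
        part, rest = _parse(rest)
        parts.append(part)
    return Node(head, parts), rest


def parse(sequence: str) -> Tree:
    """Turn a prefix sequence such as '⿱木⿰木木' into a tree."""
    tree, rest = _parse(list(sequence))
    if rest:
        raise ValueError("unexpected trailing symbols: " + "".join(rest))
    return tree


def depth(tree: Tree) -> int:
    """Number of nested structure levels (0 for a bare radical)."""
    if isinstance(tree, str):
        return 0
    return 1 + max(depth(part) for part in tree.parts)


# Hypothetical training table: character -> prefix composition.
TRAIN = {"好": "⿰女子", "林": "⿰木木", "岩": "⿱山石"}
VOCAB = {symbol for sequence in TRAIN.values() for symbol in sequence}

# 森 is not in TRAIN, yet its composition uses only known symbols.
UNSEEN = "⿱木⿰木木"
unseen_tree = parse(UNSEEN)
covered = set(UNSEEN) <= VOCAB
```

In this toy setting the three training characters yield a vocabulary of seven symbols, and the unseen character 森 needs nothing beyond them. A decoder that emits such sequences symbol by symbol therefore has an output space that grows with the number of radicals and structures rather than with the number of character classes.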

Updated: 2020-07-01