Zero-Shot Handwritten Chinese Character Recognition with Hierarchical Decomposition Embedding
Pattern Recognition (IF 8), Pub Date: 2020-11-01, DOI: 10.1016/j.patcog.2020.107488
Zhong Cao, Jiang Lu, Sen Cui, Changshui Zhang

Abstract: Handwritten Chinese Character Recognition (HCCR) is a challenging topic in pattern recognition because of its large character vocabulary, complex hierarchical structure, varied writing styles, and scarce training samples. In this paper, we explore the hierarchical knowledge of Chinese characters and present a novel zero-shot HCCR method. First, we analyze the relations between characters and their primitives, such as radicals and structures, to obtain a tree layout of primitives. Then, we present a novel zero-shot hierarchical decomposition embedding method that encodes the tree layout into a semantic vector. Next, we devise a Convolutional Neural Network (CNN) based framework that learns both the radicals and the structures of characters via the semantic vector. Because different Chinese characters share common radicals and structures, our method can recognize new categories without any labeled samples from them. Moreover, our method is effective in both traditional HCCR and zero-shot HCCR tasks: it achieves competitive performance in the traditional experimental setting and significantly surpasses state-of-the-art methods in the zero-shot experimental setting.
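
The pipeline described in the abstract (primitive tree, semantic vector, recognition of unseen classes) can be illustrated with a small sketch. This is not the paper's embedding formula, which the abstract does not give. The primitive vocabulary, the decomposition trees, the depth-decay weighting (`decay ** depth`), and the cosine-similarity matching are all assumptions made for illustration, and the "predicted" vector stands in for the output of a trained CNN.

```python
import math

# Hypothetical primitive vocabulary: two structures (left-right, top-bottom)
# and three radicals. A real system would use a far larger vocabulary.
PRIMITIVES = ["⿰", "⿱", "木", "日", "月"]

# Hypothetical decomposition trees: a tuple is (structure, child, child, ...),
# a bare string is a radical leaf.
TREES = {
    "林": ("⿰", "木", "木"),
    "明": ("⿰", "日", "月"),
    "昌": ("⿱", "日", "日"),
    "朋": ("⿰", "月", "月"),
}

def embed(tree, depth=0, decay=0.5, vec=None):
    """Encode a tree as a vector over PRIMITIVES.

    Each primitive adds decay**depth to its own slot, so primitives nearer
    the root weigh more. This weighting is an assumption, not the paper's.
    """
    if vec is None:
        vec = [0.0] * len(PRIMITIVES)
    node = tree[0] if isinstance(tree, tuple) else tree
    vec[PRIMITIVES.index(node)] += decay ** depth
    if isinstance(tree, tuple):
        for child in tree[1:]:
            embed(child, depth + 1, decay, vec)
    return vec

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify(predicted, candidates):
    """Return the candidate character whose embedding best matches `predicted`."""
    return max(candidates, key=lambda ch: cosine(predicted, embed(TREES[ch])))

# Stand-in for a CNN output on an image of 朋, a class assumed to be absent
# from training; the values are made up.
predicted = [0.9, 0.1, 0.0, 0.1, 0.8]
print(classify(predicted, list(TREES)))
```

Because 朋 reuses the left-right structure and the 月 radical that appear in other characters, its embedding can be built without any image of it, which is the property the abstract relies on for zero-shot recognition. A limitation of this toy encoding is that it ignores the position of a radical within a structure, so a practical encoding would need to preserve more of the tree layout.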

Updated: 2020-11-01