Writer-Aware CNN for Parsimonious HMM-Based Offline Handwritten Chinese Text Recognition
Pattern Recognition (IF 7.5), Pub Date: 2020-04-01, DOI: 10.1016/j.patcog.2019.107102
Zi-Rui Wang, Jun Du, Jia-Ming Wang

Recently, the hybrid convolutional neural network hidden Markov model (CNN-HMM) has been introduced for offline handwritten Chinese text recognition (HCTR) and has achieved state-of-the-art performance. However, modeling each character of the large Chinese vocabulary with a uniform, fixed number of hidden states incurs high memory and computational costs and creates confusion among the tens of thousands of resulting HMM state classes. Another key issue of CNN-HMM for HCTR is the diversity of writing styles, which strains the writer-independent model and causes a significant performance decline for specific writers. To address these issues, we propose a writer-aware CNN based on a parsimonious HMM (WCNN-PHMM). First, the PHMM is designed with a data-driven state-tying algorithm that greatly reduces the total number of HMM states; this not only yields a compact CNN by sharing states across the identical or similar radicals of different Chinese characters but also improves recognition accuracy, owing to the more accurate modeling of tied states and the lower confusion among them. Second, the WCNN integrates each convolutional layer with an adaptive layer fed by a writer-dependent vector, the writer code, to absorb writer-specific variability that is irrelevant to recognition. The parameters of the writer-adaptive layers are optimized jointly with the other network parameters during training, while a multi-pass decoding strategy is adopted at test time to learn the writer code and generate recognition results. Validated on the ICDAR 2013 competition set of the CASIA-HWDB database, the more compact WCNN-PHMM with a 7360-class vocabulary achieves a relative character error rate (CER) reduction of 16.6% over the conventional CNN-HMM without language modeling. By adopting a powerful hybrid language model (an N-gram language model combined with a recurrent neural network language model), the CER of WCNN-PHMM is reduced to 3.17%.
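The abstract does not spell out the data-driven state-tying algorithm; as a rough illustration of the idea, the sketch below ties HMM states by clustering per-state statistics, so that states of characters sharing the same or similar radicals can collapse into shared classes and shrink the CNN output layer. Agglomerative clustering is a stand-in, not the paper's method, and tie_states, state_features, and n_tied are hypothetical names.

```python
# A minimal, hypothetical sketch of data-driven HMM state tying for a PHMM.
# Assumption: each original HMM state is summarized by a feature vector
# (e.g., the mean of the frames aligned to it); states with similar vectors
# are tied, shrinking the CNN output from n_states to n_tied classes.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def tie_states(state_features: np.ndarray, n_tied: int) -> np.ndarray:
    """Return mapping[s] = tied-state id for each original state s.

    state_features: (n_states, dim) per-state statistics from a forced
                    alignment of the training data.
    n_tied:         target number of tied states (n_tied << n_states).
    """
    clustering = AgglomerativeClustering(n_clusters=n_tied, linkage="ward")
    return clustering.fit_predict(state_features)

# Toy example: 200 characters x 5 states each -> 1000 raw states,
# tied down to 300 shared classes for the CNN output layer.
rng = np.random.default_rng(0)
features = rng.normal(size=(200 * 5, 64))  # stand-in for real statistics
mapping = tie_states(features, n_tied=300)
print(mapping.shape, mapping.max() + 1)    # (1000,) 300
```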

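The writer-adaptive layers can be pictured as follows: each convolutional layer is paired with an adaptive layer that maps the writer code to a modulation of the conv output. This is a minimal PyTorch sketch, assuming a per-channel scale-and-shift form of adaptation (the paper's exact form may differ); WriterAdaptiveConv and code_dim are illustrative names.

```python
# A minimal sketch of a writer-aware convolutional block, assuming the
# adaptive layer turns a writer code into a per-channel scale and shift
# applied to the conv output; the paper's exact adaptation form may differ.
import torch
import torch.nn as nn

class WriterAdaptiveConv(nn.Module):
    def __init__(self, in_ch: int, out_ch: int, code_dim: int):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)
        # Adaptive layer: writer code -> per-channel scale and shift.
        self.adapt = nn.Linear(code_dim, 2 * out_ch)
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor, writer_code: torch.Tensor) -> torch.Tensor:
        h = self.conv(x)                                   # (B, C, H, W)
        scale, shift = self.adapt(writer_code).chunk(2, dim=-1)
        scale = scale.unsqueeze(-1).unsqueeze(-1)          # (B, C, 1, 1)
        shift = shift.unsqueeze(-1).unsqueeze(-1)
        return self.act(h * (1.0 + scale) + shift)

# Multi-pass decoding idea from the abstract: decode once with a neutral
# code, then refine the code for that writer and decode again.
block = WriterAdaptiveConv(in_ch=1, out_ch=32, code_dim=50)
x = torch.randn(2, 1, 40, 200)                  # batch of text-line images
code = torch.zeros(2, 50, requires_grad=True)   # pass 1: neutral writer code
y = block(x, code)
# In later passes, `code` would be optimized (e.g., by backpropagation
# through the frozen network against first-pass labels) before re-decoding.
print(y.shape)  # torch.Size([2, 32, 40, 200])
```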
Updated: 2020-04-01