Improving offline handwritten Chinese text recognition with glyph-semanteme fusion embedding

  • Original Article
International Journal of Machine Learning and Cybernetics

Abstract

In this paper, we propose the Glyph-Semanteme fusion Embedding (GSE) for Chinese characters and apply it to Offline Handwritten Chinese Text Recognition (offline-HCTR). It is well known that the number of Chinese characters is very large and that their glyphs are complex, but few researchers have noted the underlying reason: Chinese script is ideographic, which means that the glyph and the semanteme of a character are correlated. To exploit this property and build better representations for Chinese characters, we first extract a glyph embedding and a semanteme embedding for each character; we then propose a parameterized gated fusion strategy that automatically computes the GSE of each character by fusing its glyph embedding and semanteme embedding. We apply the proposed GSE to an attention-based encoder-decoder network for the offline-HCTR task. Two kinds of GSE, Character-level GSE (CGSE) and Text-level GSE (TGSE), are used in the decoding phase to yield the predictions. On the standard ICDAR-2013 HCTR competition benchmark, the proposed method achieves 96.65% character-level recognition accuracy, which demonstrates the effectiveness of the proposed glyph-semanteme fusion embedding.
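
The abstract does not give the formula behind the parameterized gated fusion, so the following is only a minimal sketch of one common way such a gate can be built, not the authors' implementation. It assumes that both embeddings share the same dimension, that the gate is a sigmoid over a linear map of their concatenation, and that the fused vector is an element-wise convex combination of the two. The names `gated_fusion`, `init_params`, `weights` and `bias` are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(glyph, semanteme, weights, bias):
    """Fuse two equal-length embeddings with an element-wise gate.

    ASSUMED form (not taken from the paper):
        gate  = sigmoid(W @ concat(glyph, semanteme) + b)
        fused = gate * glyph + (1 - gate) * semanteme
    """
    dim = len(glyph)
    if len(semanteme) != dim:
        raise ValueError("embeddings must have the same dimension")
    concat = list(glyph) + list(semanteme)
    fused = []
    for i in range(dim):
        # One gate value per output dimension.
        pre = sum(w * x for w, x in zip(weights[i], concat)) + bias[i]
        gate = sigmoid(pre)
        fused.append(gate * glyph[i] + (1.0 - gate) * semanteme[i])
    return fused


def init_params(dim, seed=0):
    """Hypothetical random initialisation; in practice W and b are trained."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(2 * dim)]
               for _ in range(dim)]
    bias = [0.0] * dim
    return weights, bias


if __name__ == "__main__":
    glyph_vec = [0.2, -0.5, 0.9, 0.1]      # toy glyph embedding
    semanteme_vec = [0.7, 0.3, -0.4, 0.0]  # toy semanteme embedding
    W, b = init_params(dim=4)
    print(gated_fusion(glyph_vec, semanteme_vec, W, b))
```

Under this assumed form, each component of the fused vector lies between the corresponding glyph and semanteme components, and the gate decides per dimension which source dominates.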

Notes

  1. https://ieeexplore.ieee.org/document/8978158.

  2. http://www.nlpr.ia.ac.cn/databases/handwriting/Home.html.

  3. http://tcci.ccf.org.cn/conference/2017/taskdata.php.

  4. https://dumps.wikimedia.org/zhwiki/latest/zhwiki-latest-pages-articles.xml.bz2.

  5. https://github.com/KaimingHe/deep-residual-networks.

  6. https://www.tensorflow.org/.

Acknowledgements

This work was supported by the Natural Science Foundation of Shanghai (No. 19ZR1415900).

Author information

Corresponding author

Correspondence to Shujing Lyu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhan, H., Lyu, S. & Lu, Y. Improving offline handwritten Chinese text recognition with glyph-semanteme fusion embedding. Int. J. Mach. Learn. & Cyber. 13, 485–496 (2022). https://doi.org/10.1007/s13042-021-01420-7

  • DOI: https://doi.org/10.1007/s13042-021-01420-7
