Text recognition for Vietnamese identity card based on deep features network

  • Original Paper
International Journal on Document Analysis and Recognition (IJDAR)

Abstract

Optical character recognition (OCR) is a technology for automatically converting text in images into data strings for editing, indexing, and searching. These strings support many tasks, such as digitizing old documents, translating text into other languages, and testing and verifying text positions. Recently, Know Your Customer (KYC) has become an industry standard for making sure that people are who they say they are. While the scope of KYC is constantly expanding, ID verification remains a crucial first step in KYC processes. Mobile OCR is one of the technological solutions that makes this part of KYC easier than ever for customers to comply with. KYC processes require financial services companies to verify the identities of their customers, and OCR can be used to extract the required data by reading IDs, bank cards, and documents. In this paper, we develop a method for Vietnamese identity card recognition based on a deep features network. On several major data fields of identity cards, it achieves accuracies of more than 96.7% and 89.7% at the character level and word level, respectively.
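The abstract reports accuracy at both the character level and the word level, but this excerpt does not say how the two are computed. The sketch below shows one common convention, assumed here for illustration only: accuracy is one minus the edit distance between the recognized string and the ground truth, divided by the ground-truth length, measured over characters or over whitespace-separated words. The function names `levenshtein` and `accuracy` and the sample strings are hypothetical and are not taken from the paper.

```python
import unicodedata


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(reference, hypothesis, level="char"):
    """Assumed metric: 1 - edit_distance / reference_length, floored at 0."""
    # NFC normalization makes each accented Vietnamese letter one code point,
    # so a missing diacritic counts as exactly one character error.
    reference = unicodedata.normalize("NFC", reference)
    hypothesis = unicodedata.normalize("NFC", hypothesis)
    if level == "char":
        ref, hyp = list(reference), list(hypothesis)
    else:
        ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - levenshtein(ref, hyp) / len(ref))


if __name__ == "__main__":
    truth = "NGUYỄN VĂN AN"      # hypothetical name field
    predicted = "NGUYEN VĂN AN"  # one diacritic lost by the recognizer
    print(f"character-level: {accuracy(truth, predicted, 'char'):.3f}")
    print(f"word-level:      {accuracy(truth, predicted, 'word'):.3f}")
```

Under this convention a single lost diacritic costs one character but a whole word, which is one reason word-level accuracy is typically lower than character-level accuracy on the same output.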

Notes

  1. https://github.com/eragonruan/text-detection-ctpn.

  2. https://cloud.google.com/vision/docs/ocr.

  3. https://intl.cloud.tencent.com/product/ocr.

Author information

Corresponding author

Correspondence to Vinh Truong Hoang.

Ethics declarations

Conflicts of interest

This work was supported by Ho Chi Minh City Open University, Vietnam. Conflict of interest: We have no conflict of interest to declare.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Van Hoai, D.P., Duong, HT. & Hoang, V.T. Text recognition for Vietnamese identity card based on deep features network. IJDAR 24, 123–131 (2021). https://doi.org/10.1007/s10032-021-00363-7

  • DOI: https://doi.org/10.1007/s10032-021-00363-7
