Abstract
Optical character recognition (OCR) is a technology for automatically converting text in images into data strings for editing, indexing, and searching. These strings can be used for many tasks, such as digitizing old documents, translating text into other languages, or testing and verifying text positions. Recently, Know Your Customer (KYC) has become an industry standard for making sure that people are who they say they are. While the scope of KYC is constantly expanding, ID verification is still a crucial first step in KYC processes. Mobile OCR is one of the technological solutions that makes this part of KYC easier than ever for customers to comply with. KYC processes require financial services companies to verify the identities of their customers, and OCR can extract the required data by reading IDs, bank cards, and documents. In this paper, we develop a method for Vietnamese identity card recognition based on a deep features network. On several major data fields of identity cards, the method achieves an accuracy of more than 96.7% at the character level and 89.7% at the word level.
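To make the distinction between character-level and word-level accuracy concrete, the following is a minimal sketch of one common way to compute both from recognized and ground-truth field strings. The abstract does not specify the exact metric definition, so the formula used here (one minus total edit distance divided by total reference length) is an assumption, and the function names and sample strings are hypothetical.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def accuracy(predictions, references, unit="char"):
    """Assumed metric: 1 - (total edit distance / total reference length).

    unit="char" compares character sequences; unit="word" compares
    whitespace-separated tokens. The result is clipped at 0.
    """
    total_dist = 0
    total_len = 0
    for pred, ref in zip(predictions, references):
        p = pred.split() if unit == "word" else list(pred)
        r = ref.split() if unit == "word" else list(ref)
        total_dist += levenshtein(p, r)
        total_len += len(r)
    if total_len == 0:
        raise ValueError("references must contain at least one unit")
    return max(0.0, 1.0 - total_dist / total_len)


# Hypothetical recognized fields versus ground truth.
predictions = ["NGUYEN VAN A", "HA NOI"]
references = ["NGUYEN VAN AN", "HA NOI"]
char_acc = accuracy(predictions, references, unit="char")
word_acc = accuracy(predictions, references, unit="word")
```

A single wrong character makes the whole word count as an error, which is why word-level accuracy is typically lower than character-level accuracy on the same output.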
Ethics declarations
Conflicts of interest
This work was supported by Ho Chi Minh City Open University, Vietnam. Conflict of interest: We have no conflict of interest to declare.
About this article
Cite this article
Van Hoai, D.P., Duong, HT. & Hoang, V.T. Text recognition for Vietnamese identity card based on deep features network. IJDAR 24, 123–131 (2021). https://doi.org/10.1007/s10032-021-00363-7
DOI: https://doi.org/10.1007/s10032-021-00363-7