
Performance Analysis of State of the Art Convolutional Neural Network Architectures in Bangla Handwritten Character Recognition

  • APPLIED PROBLEMS
Pattern Recognition and Image Analysis

Abstract

Bangla handwritten character recognition is a popular research topic because it is more difficult than character recognition in other languages, owing to the multiple forms of compound characters. State-of-the-art convolutional neural network (CNN) architectures are very useful in computer vision applications. Some work has been carried out on Bangla handwritten character recognition, but most of the proposed methods are either not very efficient or cannot classify a large number of characters. In this work, state-of-the-art pre-trained CNN architectures are used to classify 231 different Bangla handwritten characters from the CMATERdb dataset. The images were first converted to black-and-white form with white as the foreground color and then resized to 28 × 28. These images are used as input to the CNN architectures. The weights of the pre-trained CNN models were kept as they were. The learning rate was set to 0.001, and categorical cross-entropy was used as the error function. After 50 epochs, InceptionResNetV2 achieved the best accuracy (96.99%). DenseNet121 and InceptionNetV3 also provided remarkable recognition accuracy (96.55% and 96.20%, respectively). We also considered a combination of the trained InceptionResNetV2, InceptionNetV3, and DenseNet121 architectures, which provided better recognition accuracy (97.69%) than any single CNN architecture, but it is not feasible in practice because it requires a lot of computation power and memory. The models were also tested on characters that look confusing to humans, and all the architectures showed equal capability in recognizing these images. Considering computational complexity, memory, and the capability of recognizing confusing characters, InceptionResNetV2 can be regarded as the best-performing model.
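The preprocessing, error function, and model combination described in the abstract can be illustrated with a minimal sketch in plain Python. Everything not stated in the abstract is an assumption made for illustration only: the function names, the fixed threshold of 128, the premise that the scans show dark ink on light paper, nearest-neighbour resizing, and the use of simple probability averaging as the combination rule (the abstract does not say how the three networks were combined).

```python
import math


def binarize_white_foreground(gray, threshold=128):
    """Turn a grayscale image (rows of 0-255 ints) into black and white.

    Assumes dark ink on light paper, so pixels darker than the
    (hypothetical) threshold become white foreground (255).
    """
    return [[255 if px < threshold else 0 for px in row] for row in gray]


def resize_nearest(img, size=28):
    """Resize a 2-D image to size x size by nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    return [[img[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def categorical_cross_entropy(probs, true_class, eps=1e-12):
    """Loss for one sample: -log of the probability given to the true class."""
    return -math.log(max(probs[true_class], eps))


def average_ensemble(prob_lists):
    """Average class probabilities from several models (assumed rule).

    Returns (predicted class index, averaged probabilities).
    """
    n = len(prob_lists)
    num_classes = len(prob_lists[0])
    avg = [sum(p[k] for p in prob_lists) / n for k in range(num_classes)]
    return max(range(num_classes), key=avg.__getitem__), avg


# Toy 56 x 56 "scan": a dark square of ink on a light background.
gray = [[255] * 56 for _ in range(56)]
for r in range(20, 36):
    for c in range(20, 36):
        gray[r][c] = 0

x = resize_nearest(binarize_white_foreground(gray))

# Toy outputs of three models over three classes (not real results).
pred, avg = average_ensemble([[0.6, 0.3, 0.1],
                              [0.2, 0.5, 0.3],
                              [0.3, 0.5, 0.2]])
```

In this toy case the first model alone would pick class 0, while the averaged prediction follows the other two models and picks class 1. Running three large networks for every image is also what makes such a combination costly in computation and memory.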



Author information


Corresponding authors

Correspondence to Tapotosh Ghosh, Min-Ha-Zul Abedin, Hasan Al Banna, Nasirul Mumenin or Mohammad Abu Yousuf.

Ethics declarations

The authors declare that they have no conflicts of interest.

Additional information

Tapotosh Ghosh was born in Dhaka, Bangladesh on November 3, 1998. He received his B.Sc. degree in Information and Communication Technology from Bangladesh University of Professionals, Dhaka, in 2020. Currently, he is pursuing a master’s degree at Bangladesh University of Professionals. He has published a conference paper and a journal paper in the field of handwritten character recognition and is currently working on several computer vision-related research projects. His research interests include the application of deep learning, machine learning, and natural language processing.

Md. Min-Ha-Zul Abedin was born in Kushtia, Bangladesh on August 13, 1997. He received his B.Sc. degree in Information and Communication Technology from Bangladesh University of Professionals in 2020. He is now working as a Lecturer at the Department of Information and Communication Engineering, Bangladesh Army University of Engineering and Technology, Bangladesh. He has published a conference paper and a journal paper in the field of handwritten character recognition and is currently working on several computer vision-related research projects. His research interests include computer vision, pattern recognition, and artificial intelligence.

Md. Hasan Al Banna was born in Dhaka, Bangladesh in 1997. He received his B.Sc. degree in Information and Communication Technology from Bangladesh University of Professionals, Dhaka, in 2019. Currently, he is pursuing a master’s degree at Bangladesh University of Professionals and working as a teaching assistant at the same university. He has published a conference paper on camera model identification and is currently working on earthquake prediction. His research interests include the application of artificial intelligence and machine learning. He was awarded a fellowship from the Bangladesh ICT Division for his master’s thesis.

Nasirul Mumenin was born in Dhaka, Bangladesh on April 12, 1997. As of 2020, he is pursuing his B.Sc. degree in Information and Communication Technology at Bangladesh University of Professionals, Dhaka. His research interests include artificial intelligence, machine learning, NLP, and IoT. He is a member of academies, scientific societies, and editorial boards and journals of BUP IEEE student branch.

Dr. Mohammad Abu Yousuf received the B.Sc. (Engineering) degree in Computer Science and Engineering from Shahjalal University of Science and Technology, Sylhet, Bangladesh in 1999, the Master of Engineering degree in Biomedical Engineering from Kyung Hee University, South Korea in 2009, and the PhD degree in Science and Engineering from Saitama University, Japan in 2013. In 2003, he joined the Department of Computer Science and Engineering, Mawlana Bhashani Science and Technology University, Tangail, Bangladesh, as a Lecturer. In 2014, he moved to the Institute of Information Technology, Jahangirnagar University. He is now working as a Professor at the Institute of Information Technology, Jahangirnagar University, Savar, Dhaka, Bangladesh. His research interests include medical image processing, human-robot interaction, and computer vision.

About this article

Cite this article

Ghosh, T., Abedin, M.H.Z., Al Banna, H. et al. Performance Analysis of State of the Art Convolutional Neural Network Architectures in Bangla Handwritten Character Recognition. Pattern Recognit. Image Anal. 31, 60–71 (2021). https://doi.org/10.1134/S1054661821010089

  • DOI: https://doi.org/10.1134/S1054661821010089
