A Combination of DWT CLAHE and Wiener Filter for Effective Scene to Text Conversion and Pronunciation

Mian Qaisar, Saeed; Hammad, Noofa; Khan, Raviha

doi:10.1007/s42835-020-00461-2

Original Article
Published: 02 June 2020

Volume 15, pages 1829–1836, (2020)
Journal of Electrical Engineering & Technology

Abstract

This work presents an effective system for scene-to-text conversion and pronunciation. A combination of the Discrete Wavelet Transform (DWT), Contrast Limited Adaptive Histogram Equalization (CLAHE), the Wiener filter, and an adaptive weighted average is used for image enhancement. Maximally Stable Extremal Regions (MSER) are then used to detect the text regions, and geometrical and contour-based approaches filter out the non-text MSERs. The connected component concept is used to group the text candidates. In the next step, Optical Character Recognition (OCR) recognizes the text, and the Microsoft text-to-speech synthesizer pronounces the extracted text. The system's applicability is tested on the standard robust reading competition dataset. The designed method achieves 93% precision in text segmentation and 89.9% precision in end-to-end recognition.
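The abstract names geometric filtering of non-text regions and connected-component grouping of text candidates but does not give thresholds or rules. The following is a minimal sketch of how such a stage might look, not the authors' implementation. The `Region` type, the aspect-ratio and extent thresholds, the neighbour rule, and the hand-written candidate boxes (standing in for MSER output) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Bounding box of one candidate region (hypothetical type, sizes assumed > 0)."""
    x: int      # left edge in pixels
    y: int      # top edge in pixels
    w: int      # width in pixels
    h: int      # height in pixels
    area: int   # number of pixels that belong to the region


def looks_like_text(r, min_aspect=0.1, max_aspect=3.0,
                    min_extent=0.2, max_extent=0.9):
    """Geometric test; every threshold here is an illustrative assumption."""
    aspect = r.w / r.h
    extent = r.area / (r.w * r.h)  # fraction of the box covered by the region
    return (min_aspect <= aspect <= max_aspect
            and min_extent <= extent <= max_extent)


def are_neighbours(a, b, gap_factor=1.0):
    """Link two boxes that overlap vertically and sit close horizontally."""
    vertical_overlap = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    horizontal_gap = max(a.x, b.x) - min(a.x + a.w, b.x + b.w)
    return vertical_overlap > 0 and horizontal_gap <= gap_factor * max(a.h, b.h)


def group_regions(regions):
    """Group linked regions with union-find; each group is sorted left to right."""
    parent = list(range(len(regions)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(regions)):
        for j in range(i + 1, len(regions)):
            if are_neighbours(regions[i], regions[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, r in enumerate(regions):
        groups.setdefault(find(i), []).append(r)
    ordered = [sorted(g, key=lambda r: r.x) for g in groups.values()]
    return sorted(ordered, key=lambda g: (g[0].y, g[0].x))


# Hand-written candidates standing in for detector output (assumed values).
candidates = [
    Region(10, 10, 12, 20, 150),    # letter-like
    Region(26, 11, 11, 19, 130),    # letter-like, next to the first
    Region(40, 10, 13, 20, 160),    # letter-like, next to the second
    Region(0, 60, 200, 4, 790),     # long thin line: aspect ratio too large
    Region(150, 100, 12, 20, 140),  # isolated letter-like region
]
text_like = [r for r in candidates if looks_like_text(r)]
groups = group_regions(text_like)
```

In this sketch the thin horizontal line is rejected by the aspect-ratio test, the three adjacent letter-like boxes end up in one group, and the isolated box forms its own group. In a full pipeline each group would be cropped and passed to the OCR stage.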



Funding

This project is funded by Effat University, Jeddah, Saudi Arabia, under grant number UC#9/29 April.2020/7.1-22(2)1. The authors thank the anonymous reviewers for their useful feedback.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, College of Engineering, Effat University, 22332, Jeddah, Saudi Arabia
Saeed Mian Qaisar, Noofa Hammad & Raviha Khan

Corresponding author

Correspondence to Saeed Mian Qaisar.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mian Qaisar, S., Hammad, N. & Khan, R. A Combination of DWT CLAHE and Wiener Filter for Effective Scene to Text Conversion and Pronunciation. J. Electr. Eng. Technol. 15, 1829–1836 (2020). https://doi.org/10.1007/s42835-020-00461-2

Received: 21 January 2020
Revised: 28 April 2020
Accepted: 22 May 2020
Published: 02 June 2020
Issue Date: July 2020
DOI: https://doi.org/10.1007/s42835-020-00461-2
