Abstract
The wavelet transform offers a promising approach for analysing non-stationary signals. Noisy speech is commonly denoised with a wavelet thresholding technique, in which the noisy signal is decomposed into different frequency bands. The decomposition level (DL) is usually chosen independently of the non-stationary noise. In this letter, a new DL detection procedure is presented that selects the decomposition level based on signal energy and speech dominance. The proposed DL is applied to a speech denoising model, and the results are compared with the universal thresholding technique and the minimum mean square error algorithm. The enhanced speech signal is evaluated with a speech intelligibility measure, the short-time objective intelligibility (STOI), and a speech quality measure, the perceptual evaluation of speech quality (PESQ). The experimental results show that the proposed scheme outperforms the conventional methods across all tested SNR levels and non-stationary noise environments.
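To make the baseline concrete, the following is a minimal sketch of wavelet thresholding with the universal threshold, written in plain Python. It is not the authors' implementation. The Haar wavelet, the function names (`haar_dwt`, `haar_idwt`, `soft`, `denoise`, `mse`) and the synthetic test signal are assumptions chosen for illustration. The decomposition level is passed in as a fixed parameter; the letter's energy- and speech-dominance-based DL selection is not reproduced here.

```python
import math
import random
from statistics import median


def haar_dwt(x):
    """One orthonormal Haar analysis step: returns (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft(c, t):
    """Soft thresholding: shrink the coefficient magnitude by t, floor at 0."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def denoise(x, level):
    """Decompose to `level`, soft-threshold all detail bands, reconstruct.

    Assumption: `level` is supplied by the caller. Selecting it adaptively
    is the subject of the letter and is not attempted in this sketch.
    """
    n = len(x)
    if level < 1 or n % (2 ** level) != 0:
        raise ValueError("signal length must be a multiple of 2**level")
    approx, details = list(x), []
    for _ in range(level):
        approx, d = haar_dwt(approx)
        details.append(d)  # details[0] is the finest band
    # Noise estimate from the finest band (median absolute deviation),
    # then the universal threshold sigma * sqrt(2 ln N).
    sigma = median(abs(c) for c in details[0]) / 0.6745
    thr = sigma * math.sqrt(2.0 * math.log(n))
    for d in reversed(details):  # rebuild from the coarsest band upward
        approx = haar_idwt(approx, [soft(c, thr) for c in d])
    return approx


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for speech: a slow sinusoid plus white Gaussian noise.
    clean = [math.sin(2 * math.pi * i / 256) for i in range(1024)]
    noisy = [c + rng.gauss(0.0, 0.3) for c in clean]
    for level in (1, 2, 4, 6):
        print(level, round(mse(denoise(noisy, level), clean), 4))
```

Running the script prints the reconstruction error for several fixed levels, which shows how strongly the result depends on the chosen DL even for a simple signal. A practical implementation would more likely use a library such as PyWavelets and a wavelet better suited to speech than Haar.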
Acknowledgements
We would like to thank the Department of Electrical Engineering of the Indian Institute of Technology Roorkee for funding the High-Performance Computing Workspace, where this research was carried out.
Cite this article
Chiluveru, S.R., Tripathy, M. Speech Enhancement using a Variable Level Decomposition DWT. Natl. Acad. Sci. Lett. 44, 239–242 (2021). https://doi.org/10.1007/s40009-020-00983-3
DOI: https://doi.org/10.1007/s40009-020-00983-3