Method for Reduction of Speech Signal Autoregression Model for Speech Transmission Systems on Low-Speed Communication Channels

Savchenko, V. V.

doi:10.3103/S0735272721110030

Published: 28 January 2022

Volume 64, pages 592–603, (2021)

Radioelectronics and Communications Systems

V. V. Savchenko ORCID: orcid.org/0000-0003-3045-3337¹

42 Accesses
10 Citations

Abstract

This paper considers the problem of reducing the order p ≫ 1 of an autoregressive model (AR-model) of a speech signal by the criterion of minimum loss of useful information. The problem is formulated as an optimization problem in terms of discrete spectral modeling. It is noted that the most acute difficulty in solving it is the need to scale the AR-model parameters for the simulated signal at each step of the iterative calculation process. To overcome this difficulty, it is proposed to use as the goal functional a measure of information divergence of signals in the frequency domain that has the property of scale invariance. On this basis, a new method of AR-model reduction is developed in which the scaling operation is moved outside the iterative optimization procedure. The effectiveness of the proposed method is substantiated theoretically and investigated experimentally. It is shown that the main component of the achieved effect is the gain in accuracy of the reduced AR-model in the Kullback–Leibler information metric. The results are addressed to researchers and developers of systems and technologies for digital speech transmission over low-speed communication channels.
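The role of scale invariance in the goal functional can be illustrated numerically. The sketch below is not the measure or the algorithm of the paper, neither of which is specified in this abstract. As a stand-in it uses the well-known Itakura–Saito divergence between two power spectral densities and its gain-optimized form, obtained by minimizing over a positive scale factor applied to the model spectrum. The function names `ar_psd`, `itakura_saito` and `gain_optimized_divergence`, the frequency grid and the AR(2) coefficients are assumptions made for illustration only.

```python
import numpy as np


def ar_psd(a, sigma2, n_freq=256):
    """PSD of the AR model x[n] = sum_k a[k] x[n-k] + e[n], var(e) = sigma2,
    sampled at n_freq frequencies on [0, pi). Illustrative helper."""
    a = np.asarray(a, dtype=float)
    w = np.pi * np.arange(n_freq) / n_freq
    k = np.arange(1, len(a) + 1)
    # A(e^{jw}) = 1 - sum_k a[k] e^{-jwk}
    A = 1.0 - np.exp(-1j * np.outer(w, k)) @ a
    return sigma2 / np.abs(A) ** 2


def itakura_saito(g, g_hat):
    """Discrete Itakura-Saito divergence; depends on the scale of g_hat."""
    r = g / g_hat
    return float(np.mean(r - np.log(r) - 1.0))


def gain_optimized_divergence(g, g_hat):
    """min over c > 0 of itakura_saito(g, c * g_hat).
    The minimum is attained at c = mean(g / g_hat), which gives
    log(mean(r)) - mean(log(r)); this value does not change when
    g_hat is multiplied by any positive constant."""
    r = g / g_hat
    return float(np.log(np.mean(r)) - np.mean(np.log(r)))


# Assumed example: two stable AR(2) models with unit innovation variance.
g = ar_psd([1.3, -0.6], 1.0)
g_hat = ar_psd([1.2, -0.5], 1.0)

for scale in (1.0, 7.5):
    print(scale,
          itakura_saito(g, scale * g_hat),
          gain_optimized_divergence(g, scale * g_hat))
```

With a scale-dependent measure, the model gain has to be re-fitted every time the AR coefficients change. With a scale-invariant one, the coefficients can be optimized first and the gain set once afterwards, which is the general idea behind taking the scaling operation out of the iteration loop.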

Notes

https://www.itu.int/rec/T-REC-G/en.
https://dic.academic.ru/dic.nsf/ruwiki/614146.
GOST R 50840-95 "Speech transmission over communication channels. Methods for assessing quality, legibility and recognition".
GOST R 51061-97 "Systems of low-speed speech transmission over digital channels. Speech quality options and measurement methods".
Given the features of the acoustic mechanism of speech production [1] and AR-model (1), the possibility that either the PSD G(f) or its estimate \( \hat{G}(f) \) equals zero anywhere in its domain of definition is excluded from consideration.
An increase in the convergence rate of iterations (9) obtained by increasing the step γ₀ is inevitably accompanied [24] by a decrease in the accuracy of the vector of final approximations \( b_{p+1}(L) \).
https://sites.google.com/site/frompldcreators/produkty-1/phonemetraining.
http://www.itu.int/rec/T-REC-G.728-201206-I/en.

References

G. Kitagawa, Introduction to Time Series Modeling (Chapman and Hall/CRC, 2020). DOI: https://doi.org/10.1201/9780429197963.
L. Tan, J. Jiang, "Introduction to digital signal processing," in Digital Signal Processing (Elsevier, 2019). DOI: https://doi.org/10.1016/B978-0-12-815071-9.00001-4.
L. R. Rabiner, R. W. Schafer, "Introduction to digital speech processing," Found. Trends® Signal Process., v.1, n.1–2, p.1 (2007). DOI: https://doi.org/10.1561/2000000001.
M. W. Spratling, "A review of predictive coding algorithms," Brain Cogn., v.112, p.92 (2017). DOI: https://doi.org/10.1016/j.bandc.2015.11.003.
G. Sharma, K. Umapathy, S. Krishnan, "Trends in audio signal feature extraction methods," Appl. Acoust., v.158, p.107020 (2020). DOI: https://doi.org/10.1016/j.apacoust.2019.107020.
H. Chaouch, F. Merazka, P. Marthon, "Multiple description coding technique to improve the robustness of ACELP based coders AMR-WB," Speech Commun., v.108, p.33 (2019). DOI: https://doi.org/10.1016/j.specom.2019.02.002.
V. V. Savchenko, A. V. Savchenko, "Method for measuring distortions in speech signals during transmission over a communication channel to a biometric identification system," Meas. Tech., v.63, n.11, p.917 (2021). DOI: https://doi.org/10.1007/s11018-021-01864-x.
Y. Gu, H.-L. Wei, "A robust model structure selection method for small sample size and multiple datasets problems," Inf. Sci., v.451–452, p.195 (2018). DOI: https://doi.org/10.1016/j.ins.2018.04.007.
S. Cui, E. Li, X. Kang, "Autoregressive model based smoothing forensics of very short speech clips," in 2020 IEEE International Conference on Multimedia and Expo (ICME) (IEEE, 2020). DOI: https://doi.org/10.1109/ICME46284.2020.9102765.
S. L. Marple, Digital Spectral Analysis with Applications (Dover Publications, Mineola, New York, 2019). URI: https://www.goodreads.com/book/show/19484239.
J. Benesty, J. Chen, Y. Huang, "Linear prediction," in Springer Handbook of Speech Processing (Springer Berlin Heidelberg, Berlin, Heidelberg, 2008). DOI: https://doi.org/10.1007/978-3-540-49127-9_7.
J. Gibson, "Mutual information, the linear prediction model, and CELP voice codecs," Information, v.10, n.5, p.179 (2019). DOI: https://doi.org/10.3390/info10050179.
Ç. Candan, "Making linear prediction perform like maximum likelihood in Gaussian autoregressive model parameter estimation," Signal Process., v.166, p.107256 (2020). DOI: https://doi.org/10.1016/j.sigpro.2019.107256.
D. Xiao, F. Mo, Y. Zhang, M. Zhao, L. Ma, "An extended Levinson-Durbin algorithm and its application in mixed excitation linear prediction," Heliyon, v.4, n.11, p.e00948 (2018). DOI: https://doi.org/10.1016/j.heliyon.2018.e00948.
M. Morise, "CheapTrick, a spectral envelope estimator for high-quality speech synthesis," Speech Commun., v.67, p.1 (2015). DOI: https://doi.org/10.1016/j.specom.2014.09.003.
V. Y. Semenov, "Methods for calculating and coding the parameters of autoregressive speech model when developing the vocoder based on fixed point signal process," J. Autom. Inf. Sci., v.51, n.2, p.30 (2019). DOI: https://doi.org/10.1615/JAutomatInfScien.v51.i2.40.
V. V. Savchenko, A. V. Savchenko, "Guaranteed significance level criterion in automatic speech signal segmentation," J. Commun. Technol. Electron., v.65, n.11, p.1311 (2020). DOI: https://doi.org/10.1134/S1064226920110157.
A. V. Savchenko, V. V. Savchenko, "A method for measuring the pitch frequency of speech signals for the systems of acoustic speech analysis," Meas. Tech., v.62, n.3, p.282 (2019). DOI: https://doi.org/10.1007/s11018-019-01617-x.
C. Liu, M. Jiang, "Robust adaptive filter with lncosh cost," Signal Process., v.168, p.107348 (2020). DOI: https://doi.org/10.1016/j.sigpro.2019.107348.
S. Kullback, Information Theory and Statistics (Dover Publications, New York, 1997). URI: https://www.amazon.com/Information-Theory-Statistics-Dover-Mathematics/dp/0486696847.
V. V. Savchenko, A. V. Savchenko, "Criterion of significance level for selection of order of spectral estimation of entropy maximum," Radioelectron. Commun. Syst., v.62, n.5, p.223 (2019). DOI: https://doi.org/10.3103/S0735272719050042.
V. V. Savchenko, L. V. Savchenko, "Speech signal autoregression modeling based on the discrete Fourier transform and scale-invariant measure of information discrimination," J. Commun. Technol. Electron., v.66, n.11, p.1266 (2021). DOI: https://doi.org/10.1134/S1064226921110085.
F. Mustiere, M. Bouchard, M. Bolic, "All-pole modeling of discrete spectral powers: A unified approach," IEEE Trans. Audio, Speech, Lang. Process., v.20, n.2, p.705 (2012). DOI: https://doi.org/10.1109/TASL.2011.2163511.
A. R. Sampson, "Stochastic Approximation," in Wiley StatsRef: Statistics Reference Online (Wiley, 2014). DOI: https://doi.org/10.1002/9781118445112.stat01848.
V. V. Savchenko, "Minimum of information divergence criterion for signals with tuning to speaker voice in automatic speech recognition," Radioelectron. Commun. Syst., v.63, n.1, p.42 (2020). DOI: https://doi.org/10.3103/S0735272720010045.
A. V. Savchenko, V. V. Savchenko, "Scale-invariant modification of COSH distance for measuring speech signal distortions in real-time mode," Radioelectron. Commun. Syst., v.64, n.6, p.300 (2021). DOI: https://doi.org/10.3103/S0735272721060030.
V. V. Savchenko, "Itakura–Saito divergence as an element of the information theory of speech perception," J. Commun. Technol. Electron., v.64, n.6, p.590 (2019). DOI: https://doi.org/10.1134/S1064226919060093.
R. Gray, A. Buzo, A. Gray, Y. Matsuyama, "Distortion measures for speech processing," IEEE Trans. Acoust. Speech, Signal Process., v.28, n.4, p.367 (1980). DOI: https://doi.org/10.1109/TASSP.1980.1163421.
E. Estrada, H. Nazeran, F. Ebrahimi, M. Mikaeili, "Symmetric Itakura distance as an EEG signal feature for sleep depth determination," in ASME 2009 Summer Bioengineering Conference, Parts A and B (American Society of Mechanical Engineers, 2009). DOI: https://doi.org/10.1115/SBC2009-206233.
D. Wang, M. Yu, C. B. Low, S. Arogeti, Model-based Health Monitoring of Hybrid Systems (Springer New York, New York, NY, 2013). DOI: https://doi.org/10.1007/978-1-4614-7369-5.
O. Diana, A. Mihaela, "Feature extraction and classification methods for a motor task brain computer interface: A comparative evaluation for two databases," Int. J. Adv. Comput. Sci. Appl., v.8, n.8 (2017). DOI: https://doi.org/10.14569/IJACSA.2017.080834.
H. B. Kashani, A. Sayadiyan, "Sequential use of spectral models to reduce deletion and insertion errors in vowel detection," Comput. Speech Lang., v.50, p.105 (2018). DOI: https://doi.org/10.1016/j.csl.2017.12.008.
J. Gibson, "Speech compression," Information, v.7, n.2, p.32 (2016). DOI: https://doi.org/10.3390/info7020032.
G. Tamulevicius, J. Kaukenas, "High-order autoregressive modeling of individual speaker’s qualities," in 2017 5th IEEE Workshop on Advances in Information, Electronic and Electrical Engineering (AIEEE) (IEEE, 2017). DOI: https://doi.org/10.1109/AIEEE.2017.8270551.

Author information

Authors and Affiliations

Linguistic University of Nizhny Novgorod, Nizhny Novgorod, Russian Federation
V. V. Savchenko

Corresponding author

Correspondence to V. V. Savchenko.

Ethics declarations

ADDITIONAL INFORMATION

V.V. Savchenko

The author declares that he has no conflicts of interest.

This article does not contain any studies with human participants or animals performed by any of the authors.

The initial version of this paper, in Russian, was published in the journal “Izvestiya Vysshikh Uchebnykh Zavedenii. Radioelektronika,” ISSN 2307-6011 (Online), ISSN 0021-3470 (Print), at http://radio.kpi.ua/article/view/S0021347021110030 with DOI: https://doi.org/10.20535/S0021347021110030

Additional information

Translated from Izvestiya Vysshikh Uchebnykh Zavedenii. Radioelektronika, No. 11, pp. 682–695, November 2021. https://doi.org/10.20535/S0021347021110030

About this article

Cite this article

Savchenko, V.V. Method for Reduction of Speech Signal Autoregression Model for Speech Transmission Systems on Low-Speed Communication Channels. Radioelectron. Commun. Syst. 64, 592–603 (2021). https://doi.org/10.3103/S0735272721110030

Received: 02 August 2021
Revised: 21 November 2021
Accepted: 21 November 2021
Published: 28 January 2022
Issue Date: November 2021
DOI: https://doi.org/10.3103/S0735272721110030
