Efficient remote access system based on decoded and decompressed speech signals

Multimedia Tools and Applications

Abstract

This paper investigates the effect of decoding and decompression on Speaker Identification (SI) in a remote access system. Coding and compression are routinely applied to voice traffic carried over the Internet or mobile networks. In the proposed system, the speech signal is coded with the Linear Predictive Coding (LPC) technique. It is also compressed using two techniques. The first technique compresses the signal by decimation; the signal is then recovered with inverse solutions, namely maximum entropy and regularized reconstruction. The second technique is Compressive Sensing (CS), in which the speech signal is reconstructed using linear programming. The coded or compressed speech signal is transmitted to the receiver over a wireless communication channel. At the receiver, the received signal is decoded or decompressed, and SI is then performed on the result. The performance of the coding and compression techniques is evaluated using metrics such as Perceptual Evaluation of Speech Quality (PESQ) and Dynamic Time Warping (DTW). The objective of SI is to provide the security needed for the remote access system, and this security can be increased using the coding and compression processes. In the SI system, feature vectors are extracted from the time domain and from discrete transform domains such as the Discrete Wavelet Transform (DWT), the Discrete Cosine Transform (DCT), and the Discrete Sine Transform (DST). The recognition rate is computed for each domain to evaluate the performance of the SI system.
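To make the DTW metric mentioned in the abstract concrete, the following is a minimal sketch of the classic dynamic-programming DTW distance between two short sequences. The abstract does not specify the local cost or the representation the authors compare, so the absolute-difference cost on scalar samples, the function name `dtw_distance`, and the toy sequences are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Cumulative DTW cost between two 1-D sequences.

    Assumption: the local cost is the absolute difference between samples;
    the paper may use a different cost or compare feature vectors instead.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical toy signals standing in for original and reconstructed speech.
original = [0.0, 1.0, 2.0, 1.0, 0.0]
stretched = [0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]  # same shape, slower
distorted = [0.0, 1.5, 2.5, 1.0, 0.0]            # same timing, altered values

print(dtw_distance(original, stretched))
print(dtw_distance(original, distorted))
```

A time-stretched copy of the signal aligns at zero cost, while an amplitude-distorted copy does not. This is why DTW is a natural fit for comparing an original signal with its decoded or decompressed version: it penalizes changes in shape but tolerates small timing shifts.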

Author information

Corresponding author

Correspondence to Hala Shawky El-Kfafy.

About this article

Cite this article

El-Kfafy, H.S., Abd-Elnaby, M., Rihan, M. et al. Efficient remote access system based on decoded and decompressed speech signals. Multimed Tools Appl 79, 22293–22324 (2020). https://doi.org/10.1007/s11042-019-08150-7

  • DOI: https://doi.org/10.1007/s11042-019-08150-7
