Distributed Speech Presence Probability Estimator in Fully Connected Wireless Acoustic Sensor Networks

Circuits, Systems, and Signal Processing

Abstract

This paper presents a Gaussian-based distributed speech presence probability (DSPP) estimator for fully connected wireless acoustic sensor networks (WASNs). In a WASN, the main goal is to make optimal use of all the information available in the recorded signals. Under a Gaussian statistical assumption on the signals, each node computes the DSPP from its own local signals together with compressed signals received from the other nodes. We evaluate the effect of DSPP estimation on noise reduction using both simulated and real recorded signals. The proposed DSPP estimator is compared with local SPP estimation, where each node uses only its own noisy signals, and with centralized SPP estimation, where each node uses all the noisy signals recorded across the network. The results show that the proposed method performs well while considerably reducing computational complexity.

Notes

  1. The ratio of clean signal variance to noise signal variance.

  2. A priori knowledge indicating whether speech or silence is more probable.

  3. The ratio of noisy signal variance to noise signal variance.

  4. The central limit theorem states that when independent random variables are added, their suitably normalized sum tends toward a Gaussian distribution regardless of the distribution of the original variables (provided their variances are finite).

  5. In Souden et al. [32], it is shown that when the noise is a mixture of coherent point-source interference (e.g., non-Gaussian babble, pink, or factory noise) and non-coherent additive white noise, the SPP estimator is theoretically able to achieve an estimate close to one when speech is present. Interested readers are referred to Souden et al. [32] for the theoretical proof.

  6. In the local case, there is no cooperation between nodes and therefore no signals are transmitted; each node uses only the signals recorded by its own microphones. In this case, \( {\mathbf {y}}, {\varvec{\Phi }}_{{\mathbf {v}}},\) and \({\varvec{\Phi }}_{{\mathbf {x}}} \) in (14) are replaced by the node's own quantities \( {\mathbf {y}}_{k}, {\varvec{\Phi }}_{{\mathbf {v}}_{k}},\) and \( {\varvec{\Phi }}_{{\mathbf {x}}_{k}} \) to compute the SPP. Since the procedure is the same as for the CSPP apart from this substitution, we describe this case only briefly.

  7. Since the first microphone of the first node was taken as the reference microphone, the input full-band SNR is computed for this microphone.
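
As a rough illustration of Notes 1, 3, and 6, the sketch below evaluates a Gaussian multichannel SPP once with the quantities of the whole network (centralized case) and once with only the sub-blocks belonging to one node (local case). It assumes that (14) has the form given by Souden et al. [32], which is not reproduced in this excerpt. The function name `gaussian_spp`, the a priori speech absence probability `q = 0.5`, the steering vector, and the covariance matrices are illustrative assumptions, not values from the paper, and the exchange of compressed signals used by the DSPP itself is not modeled.

```python
import numpy as np


def gaussian_spp(y, phi_v, phi_x, q=0.5):
    """Gaussian multichannel SPP for one time-frequency bin.

    Assumed form (Souden et al. [32]):
        p = 1 / (1 + q/(1-q) * (1 + xi) * exp(-beta / (1 + xi)))
    with xi   = tr(phi_v^{-1} phi_x)                 (multichannel a priori SNR)
         beta = y^H phi_v^{-1} phi_x phi_v^{-1} y

    y     : (M,) complex noisy STFT coefficients
    phi_v : (M, M) noise covariance matrix
    phi_x : (M, M) speech covariance matrix
    q     : a priori speech absence probability (hypothetical default)
    """
    phi_v_inv_phi_x = np.linalg.solve(phi_v, phi_x)   # phi_v^{-1} phi_x
    xi = np.real(np.trace(phi_v_inv_phi_x))
    z = np.linalg.solve(phi_v, y)                     # phi_v^{-1} y
    beta = np.real(np.vdot(y, phi_v_inv_phi_x @ z))   # vdot conjugates y
    beta = max(beta, 0.0)                             # guard against round-off
    return 1.0 / (1.0 + q / (1.0 - q) * (1.0 + xi) * np.exp(-beta / (1.0 + xi)))


# Toy network (all values are assumptions): 2 nodes with 2 microphones each.
a = np.ones(4, dtype=complex)             # hypothetical steering vector
phi_x = np.outer(a, a.conj())             # rank-1 speech covariance, unit variance
phi_v = np.eye(4, dtype=complex)          # spatially white noise, unit variance
y = a.copy()                              # one speech-dominated observation

# Centralized case: every microphone signal in the network is used.
p_central = gaussian_spp(y, phi_v, phi_x)

# Local case (Note 6): node 1 uses only its own sub-vector and sub-blocks.
node = slice(0, 2)
p_local = gaussian_spp(y[node], phi_v[node, node], phi_x[node, node])

print(f"centralized SPP: {p_central:.3f}, local SPP: {p_local:.3f}")
```

In this toy setting the centralized estimate is higher than the local one for the speech-dominated observation, because the extra microphones raise the multichannel a priori SNR. This is only meant to show how the substitution in Note 6 changes the inputs, not to reproduce the paper's results.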

References

  1. J.B. Allen, D.A. Berkley, Image method for efficiently simulating small-room acoustics. J. Acoust. Soc. Am. 65(4), 943–950 (1979). https://doi.org/10.1121/1.382599

  2. A. Bertrand, M. Moonen, Distributed adaptive node-specific signal estimation in fully connected sensor networks—part I: sequential node updating. IEEE Trans. Signal Process. 58(10), 5277–5291 (2010). https://doi.org/10.1109/TSP.2010.2052612

  3. A. Bertrand, M. Moonen, Distributed adaptive node-specific signal estimation in fully connected sensor networks—part II: simultaneous and asynchronous node updating. IEEE Trans. Signal Process. 58(10), 5292–5306 (2010). https://doi.org/10.1109/TSP.2010.2052613

  4. I. Cohen, Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging. IEEE Trans. Speech Audio Process. 11(5), 466–475 (2003). https://doi.org/10.1109/TSA.2003.811544

  5. I. Cohen, B. Berdugo, Speech enhancement for non-stationary noise environments. Signal Process. 81(11), 2403–2418 (2001)

  6. S. Doclo, M. Moonen, T. Van den Bogaert, J. Wouters, Reduced-bandwidth and distributed MWF-based noise reduction algorithms for binaural hearing aids. IEEE Trans. Audio Speech Lang. Process. 17(1), 38–51 (2009). https://doi.org/10.1109/TASL.2008.2004291

  7. Y. Ephraim, D. Malah, Speech enhancement using a minimum mean-square error log-spectral amplitude estimator. IEEE Trans. Acoust. Speech Signal Process. 33(2), 443–445 (1985). https://doi.org/10.1109/TASSP.1985.1164550

  8. D. Fischer, S. Doclo, E.A.P. Habets, T. Gerkmann, Combined single-microphone Wiener and MVDR filtering based on speech interframe correlations and speech presence probability. in Proceedings of Speech Communication; 12th ITG Symposium, pp. 1–5 (2016)

  9. B. Fodor, T. Fingscheidt, MMSE speech enhancement under speech presence uncertainty assuming (generalized) Gamma speech priors throughout. in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4033–4036 (2012). https://doi.org/10.1109/ICASSP.2012.6288803

  10. B. Fodor, T. Gerkmann, A posteriori speech presence probability estimation based on averaged observations and a super-Gaussian speech model. in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC), pp. 11–15 (2014). https://doi.org/10.1109/IWAENC.2014.6953309

  11. B. Fodor, T. Gerkmann, A speech presence probability estimator based on fixed priors and a heavy-tailed speech model. in Proceedings of European Signal Processing Conference (EUSIPCO), pp. 2305–2309 (2014)

  12. J.S. Garofolo, Getting started with the DARPA TIMIT CD-ROM: an acoustic phonetic continuous speech database. Tech. rep., National Institute of Standards and Technology (NIST), Gaithersburg, MD (1988)

  13. T. Gerkmann, C. Breithaupt, R. Martin, Improved a posteriori speech presence probability estimation based on a likelihood ratio with fixed priors. IEEE Trans. Audio Speech Lang. Process. 16(5), 910–919 (2008). https://doi.org/10.1109/TASL.2008.921764

  14. T. Gerkmann, R.C. Hendriks, Unbiased MMSE-based noise power estimation with low complexity and low tracking delay. IEEE Trans. Audio Speech Lang. Process. 20(4), 1383–1393 (2012). https://doi.org/10.1109/TASL.2011.2180896

  15. T. Gerkmann, M. Krawczyk, R. Martin, Speech presence probability estimation based on temporal Cepstrum smoothing. in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4254–4257 (2010). https://doi.org/10.1109/ICASSP.2010.5495677

  16. E.A.P. Habets, J. Benesty, I. Cohen, S. Gannot, J. Dmochowski, New insights into the MVDR beamformer in room acoustics. IEEE Trans. Audio Speech Lang. Process. 18(1), 158–170 (2010). https://doi.org/10.1109/TASL.2009.2024731

  17. A. Hassani, A. Bertrand, M. Moonen, GEVD-based low-rank approximation for distributed adaptive node-specific signal estimation in wireless sensor networks. IEEE Trans. Signal Process. 64(10), 2557–2572 (2016). https://doi.org/10.1109/TSP.2015.2510973

  18. A.I. Koutrouvelis, T.W. Sherson, R. Heusdens, R.C. Hendriks, A low-cost robust distributed linearly constrained beamformer for wireless acoustic sensor networks with arbitrary topology. IEEE/ACM Trans. Audio Speech Lang. Process. 26(8), 1434–1448 (2018). https://doi.org/10.1109/TASLP.2018.2829405

  19. M. Krawczyk-Becker, D. Fischer, T. Gerkmann, Utilizing spectro-temporal correlations for an improved speech presence probability based noise power estimation. in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 365–369 (2015). https://doi.org/10.1109/ICASSP.2015.7177992

  20. T.C. Lawin-Ore, S. Doclo, Analysis of the average performance of the multi-channel Wiener filter for distributed microphone arrays using statistical room acoustics. Signal Process. 107(C), 96–108 (2015)

  21. T.C. Lawin-Ore, S. Stenzel, J. Freudenberger, S. Doclo, Alternative formulation and robustness analysis of the multichannel Wiener filter for spatially distributed microphones. in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC). Juan les Pins, France (2014). https://doi.org/10.1109/IWAENC.2014.6954008

  22. D. Malah, R.V. Cox, A.J. Accardi, Tracking speech-presence uncertainty to improve speech enhancement in non-stationary noise environments. in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 789–792 (1999). https://doi.org/10.1109/ICASSP.1999.759789

  23. S. Markovich-Golan, A. Bertrand, M. Moonen, S. Gannot, Optimal distributed minimum-variance beamforming approaches for speech enhancement in wireless acoustic sensor networks. Signal Process. 107, 4–20 (2015). https://doi.org/10.1016/j.sigpro.2014.07.014

  24. S. Markovich-Golan, S. Gannot, I. Cohen, Distributed multiple constraints generalized sidelobe canceler for fully connected wireless acoustic sensor networks. IEEE Trans. Audio Speech Lang. Process. 21(2), 343–356 (2013). https://doi.org/10.1109/TASL.2012.2224454

  25. R. Martin, Speech enhancement using MMSE short time spectral estimation with Gamma distributed speech priors. in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), vol. 1, pp. I–253–I–256 (2002). https://doi.org/10.1109/ICASSP.2002.5743702

  26. R. Martin, Speech enhancement based on minimum mean-square error estimation and super-Gaussian priors. IEEE Trans. Speech Audio Process. 13(5), 845–856 (2005). https://doi.org/10.1109/TSA.2005.851927

  27. R. Martin, C. Breithaupt, Speech enhancement in the DFT domain using Laplacian speech priors. in Proceedings of International Workshop on Acoustic Signal Enhancement (IWAENC) (2003)

  28. R. McAulay, M. Malpass, Speech enhancement using a soft-decision noise suppression filter. IEEE Trans. Acoust. Speech Signal Process. 28(2), 137–145 (1980). https://doi.org/10.1109/TASSP.1980.1163394

  29. H. Momeni, H.R. Abutalebi, E.A.P. Habets, Conditional MMSE-based single-channel speech enhancement using inter-frame and inter-band correlations. in Proceedings of International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5215–5219 (2016). https://doi.org/10.1109/ICASSP.2016.7472672

  30. K. Ngo, A. Spriet, M. Moonen, J. Wouters, S.H. Jensen, Incorporating the conditional speech presence probability in multi-channel Wiener filter based noise reduction in hearing aids. EURASIP J. Adv. Signal Process. (2009). https://doi.org/10.1155/2009/930625

  31. R. Ranjbaryan, S. Doclo, H.R. Abutalebi, Distributed MAP estimators for noise reduction in fully connected wireless acoustic sensor networks. in Proceedings of Speech Communication; 13th ITG-Symposium, pp. 1–5 (2018)

  32. M. Souden, J. Chen, J. Benesty, S. Affes, Gaussian model-based multichannel speech presence probability. IEEE Trans. Audio Speech Lang. Process. 18(5), 1072–1077 (2010). https://doi.org/10.1109/TASL.2009.2035150

  33. M. Souden, J. Chen, J. Benesty, S. Affes, An integrated solution for online multichannel noise tracking and reduction. IEEE Trans. Audio Speech Lang. Process. 19(7), 2159–2169 (2011). https://doi.org/10.1109/TASL.2011.2118205

  34. M. Taseska, E.A.P. Habets, Informed spatial filtering for sound extraction using distributed microphone arrays. IEEE/ACM Trans. Audio Speech Lang. Process. 22(7), 1195–1207 (2014). https://doi.org/10.1109/TASLP.2014.2327294

  35. V.M. Tavakoli, J.R. Jensen, M.G. Christensen, J. Benesty, A framework for speech enhancement with ad hoc microphone arrays. IEEE/ACM Trans. Audio Speech Lang. Process. 24(6), 1038–1051 (2016). https://doi.org/10.1109/TASLP.2016.2537202

Acknowledgements

We would like to express our appreciation to Iran National Science Foundation (INSF) for supporting this work under Grant number 96000455. We are also grateful to the Department of Medical Physics and Acoustics, University of Oldenburg, for allowing access to their recorded data.

Author information

Corresponding author

Correspondence to Hamid Reza Abutalebi.

About this article

Cite this article

Ranjbaryan, R., Abutalebi, H.R. Distributed Speech Presence Probability Estimator in Fully Connected Wireless Acoustic Sensor Networks. Circuits Syst Signal Process 39, 6121–6141 (2020). https://doi.org/10.1007/s00034-020-01452-4

  • DOI: https://doi.org/10.1007/s00034-020-01452-4
