Blind and Semi-blind Anechoic Mixing System Identification Using Multichannel Matching Pursuit

Circuits, Systems, and Signal Processing

Abstract

Sparse component analysis techniques have been successfully applied to the separation of speech sources. This paper presents an efficient algorithm, based on the matching pursuit approach, for multichannel recordings. The proposed algorithm explicitly exploits spatial constraints among the channels to express the mixed signals as linear combinations of delayed components selected from an overcomplete dictionary. We present a new procedure for estimating the mixing system parameters (attenuations and delays) that can be applied to more than two mixtures and is not restricted to non-negative attenuation coefficients. The proposed mixing system estimation method can accommodate larger delays than traditional approaches. In addition, when excerpts from the sources (exogenous to the mixtures) are available, learned dictionaries can be used to improve the identification step. The simulation results show that semi-blind dictionaries perform better than those used in blind configurations.
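To make the idea of expressing each mixture as a delayed, scaled copy of a dictionary atom concrete, the following is a minimal sketch of one greedy step on a two-channel anechoic mixture. It is not the authors' algorithm: the Gabor-like atom shape, the function names `gabor_atom` and `multichannel_mp_step`, the use of circular shifts (`np.roll`) to model delays, and the choice to select the atom from channel 0 alone are all assumptions made for illustration.

```python
import numpy as np

def gabor_atom(n, center, freq, width=10.0):
    """Unit-norm Gaussian-windowed cosine of length n (illustrative atom shape)."""
    t = np.arange(n) - center
    g = np.exp(-0.5 * (t / width) ** 2) * np.cos(2 * np.pi * freq * t)
    return g / np.linalg.norm(g)

def multichannel_mp_step(residuals, atoms, max_delay):
    """One greedy step of a toy multichannel matching pursuit.

    residuals: (M, N) array, one row per channel.
    atoms:     (K, N) array of unit-norm atoms.
    Returns the atom index, per-channel coefficients, per-channel delays
    (relative to channel 0), and the updated residuals.
    """
    residuals = np.array(residuals, dtype=float)
    corr0 = atoms @ residuals[0]           # inner products with channel 0
    k = int(np.argmax(np.abs(corr0)))      # simplification: pick atom from channel 0
    coeffs = np.zeros(len(residuals))
    delays = np.zeros(len(residuals), dtype=int)
    coeffs[0] = corr0[k]
    for m in range(1, len(residuals)):
        # Search the relative delay that maximizes |<shifted atom, residual>|.
        # Delays are modeled as circular shifts (an assumption of this sketch).
        for d in range(-max_delay, max_delay + 1):
            c = np.dot(np.roll(atoms[k], d), residuals[m])
            if abs(c) > abs(coeffs[m]):
                coeffs[m], delays[m] = c, d
    # Subtract the selected (delayed, scaled) atom from every channel.
    for m in range(len(residuals)):
        residuals[m] -= coeffs[m] * np.roll(atoms[k], delays[m])
    return k, coeffs, delays, residuals

# Toy example: one source, channel 1 is delayed by 7 samples with gain -0.6.
n = 256
params = [(c, f) for c in (60, 100, 140) for f in (0.05, 0.1, 0.2)]
atoms = np.stack([gabor_atom(n, c, f) for c, f in params])
source = gabor_atom(n, 100, 0.1)
mix = np.stack([source, -0.6 * np.roll(source, 7)])

k, coeffs, delays, res = multichannel_mp_step(mix, atoms, max_delay=10)
print("atom:", params[k], "delay:", delays[1], "gain ratio:", coeffs[1] / coeffs[0])
```

In this toy setting the ratio `coeffs[1] / coeffs[0]` plays the role of a relative attenuation and `delays[1]` that of a relative delay. Because the coefficient keeps its sign, a negative attenuation is recovered as such, and the search range `max_delay` is a free parameter rather than being bounded by a phase-wrapping limit.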

Notes

  1. The first row of \(\tilde{\varvec{H}}^{(1)}_{\mathcal {R}}\) consists only of ones.

  2. The subscript “ideal” indicates that this is the matrix expected to be returned by the system identification algorithm.

Acknowledgements

Funding was provided by CNPq (Grant No. 431215/2016-2).

Author information

Corresponding author

Correspondence to Diego B. Haddad.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by Conselho Nacional de Desenvolvimento Científico e Tecnológico and in part by Fundação Carlos Chagas Filho de Amparo a Pesquisa do Estado do Rio de Janeiro.

Cite this article

Haddad, D.B., Lovisolo, L., Petraglia, M.R. et al. Blind and Semi-blind Anechoic Mixing System Identification Using Multichannel Matching Pursuit. Circuits Syst Signal Process 40, 4546–4575 (2021). https://doi.org/10.1007/s00034-021-01681-1

  • DOI: https://doi.org/10.1007/s00034-021-01681-1
