Blind and Semi-blind Anechoic Mixing System Identification Using Multichannel Matching Pursuit

Haddad, Diego B.; Lovisolo, Lisandro; Petraglia, Mariane Rembold; Batalheiro, Paulo Bulkool; Filho, Jorge Costa Pires

doi:10.1007/s00034-021-01681-1

Published: 09 March 2021

Volume 40, pages 4546–4575 (2021)
Circuits, Systems, and Signal Processing

Abstract

Sparse component analysis techniques have been successfully applied to the separation of speech sources. This paper presents an efficient algorithm, based on the matching pursuit approach, for multichannel recordings. The proposed algorithm explicitly employs spatial constraints among the channels to express the mixed signals as linear combinations of delayed components selected from an overcomplete dictionary. We present a new procedure for estimating the mixing system parameters (attenuations and delays), which can be applied to more than two mixtures and is not restricted to non-negative attenuation coefficients. The proposed mixing system estimation method can accommodate delays of greater magnitude than traditional approaches. In addition, when excerpts from the sources (exogenous to the mixtures) are available, learned dictionaries that improve the identification step can be used. The simulation results show that semi-blind dictionaries perform better than those used in blind configurations.
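
As a rough illustration of the greedy decomposition that the abstract builds on, the sketch below implements a generic single-channel matching pursuit over a toy dictionary of delayed pulses. It is not the multichannel algorithm proposed in the paper: the spatial constraints and the estimation of attenuations and delays across channels are not modeled. The names `matching_pursuit` and `delayed_pulse`, the pulse shape, and the toy signal are assumptions made only for this example.

```python
import math


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matching_pursuit(signal, dictionary, n_iter):
    """Greedy single-channel matching pursuit over unit-norm atoms."""
    residual = list(signal)
    decomposition = []  # (atom index, coefficient) pairs
    for _ in range(n_iter):
        # Select the atom most correlated with the current residual.
        corrs = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(corrs[k]))
        coef = corrs[best]
        decomposition.append((best, coef))
        # Subtract that atom's contribution from the residual.
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
    return decomposition, residual


def delayed_pulse(length, delay):
    """Hypothetical atom: a two-sample pulse shifted by `delay` samples."""
    atom = [0.0] * length
    atom[delay] = 1.0
    atom[delay + 1] = 0.5
    return normalize(atom)


length = 8
# Overcomplete in spirit only: one atom per admissible delay (toy assumption).
dictionary = [delayed_pulse(length, d) for d in range(length - 1)]
# Toy signal: two delayed, scaled copies of the pulse, one with negative gain.
signal = [2.0 * a - 1.0 * b for a, b in zip(dictionary[1], dictionary[5])]
decomposition, residual = matching_pursuit(signal, dictionary, n_iter=2)
```

In this toy case the two selected atom indices coincide with the delays used to build the signal, and the coefficients carry the (possibly negative) gains, which loosely mirrors the idea of reading delays and attenuations off the selected components.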

Notes

The first row of \(\tilde{\varvec{H}}^{(1)}_{\mathcal {R}}\) consists entirely of ones.
The subscript “ideal” indicates that this is the matrix expected to be returned by the system identification algorithm.

Acknowledgements

Funding was provided by CNPq (Grant No. 431215/2016-2).

Author information

Authors and Affiliations

Centro Federal de Educação Tecnológica Celso Suckow da Fonseca, Petrópolis, RJ, Brazil
Diego B. Haddad
Universidade do Estado do Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
Lisandro Lovisolo
Programa de Engenharia Elétrica, Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
Mariane Rembold Petraglia
Universidade do Estado do Rio de Janeiro, Rio de Janeiro, Brazil
Paulo Bulkool Batalheiro
Instituto de Pesquisas da Marinha (IPqM), Rio de Janeiro, Brazil
Jorge Costa Pires Filho

Corresponding author

Correspondence to Diego B. Haddad.

Additional information

This work was supported in part by Conselho Nacional de Desenvolvimento Científico e Tecnológico and in part by Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro.

About this article

Cite this article

Haddad, D.B., Lovisolo, L., Petraglia, M.R. et al. Blind and Semi-blind Anechoic Mixing System Identification Using Multichannel Matching Pursuit. Circuits Syst Signal Process 40, 4546–4575 (2021). https://doi.org/10.1007/s00034-021-01681-1

Received: 16 December 2019
Revised: 08 February 2021
Accepted: 10 February 2021
Published: 09 March 2021
Issue Date: September 2021
DOI: https://doi.org/10.1007/s00034-021-01681-1
