Adaptive and hybrid Kronecker product beamforming for far-field speech signals
Introduction
Beamforming is the task of preserving a signal received by an array of sensors from a particular direction and source while attenuating the interference and noise impinging on the array from other directions and sources (Van Veen, Buckley, 1988; Wang, 2009; Benesty, Cohen, Chen, 2017). It involves applying a filter to the data received by the sensor array so that the output is an accurate estimate of the signal-of-interest (SOI) impinging from that direction (Van Veen, Buckley, 1988; Wang, 2009; Benesty, Cohen, Chen, 2017). One approach is to design the filter based solely on knowledge of the direction-of-arrival (DOA) of the SOI, and sometimes also the DOAs of the interferences; this is called fixed beamforming (Benesty et al., 2017). A more robust approach additionally exploits knowledge of the statistics of the data; this is called adaptive beamforming (Wang, 2009; Benesty, Cohen, Chen, 2017; Benesty, Huang, 2013; Chandran, 2013). When the DOA of the SOI is known and the effect of interference is limited, a fixed beamformer is a useful and efficient solution. As such situations seldom arise in practice, adaptive beamformers are the more sensible option. Over the years, many fixed and adaptive beamformers have been developed, of which the delay-and-sum (DS) and minimum-variance-distortionless-response (MVDR) beamformers are well established and are used in this work (Benesty et al., 2017).
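As a rough illustration of the two beamformers named above, the following sketch computes DS and MVDR weights for a single frequency bin of a small ULA. It uses the textbook forms h = d/M for DS and h = Φ⁻¹d / (dᴴΦ⁻¹d) for MVDR. The function names, the speed of sound (340 m/s), the 2 cm spacing, the 2 kHz bin, the angles (measured from the array axis), and the one-interferer covariance model are all assumptions made for the example and are not taken from the paper.

```python
import numpy as np

def steering_vector(M, delta, f, theta, c=340.0):
    """Far-field ULA steering vector (assumed convention: angle from the array axis)."""
    m = np.arange(M)
    return np.exp(-2j * np.pi * f * m * delta * np.cos(theta) / c)

def ds_weights(d):
    """Delay-and-sum (fixed): phase-align the sensors and average."""
    return d / len(d)

def mvdr_weights(Phi, d):
    """MVDR (adaptive): minimise output power subject to h^H d = 1."""
    Phi_inv_d = np.linalg.solve(Phi, d)
    return Phi_inv_d / (d.conj() @ Phi_inv_d)

def response(h, d):
    """Magnitude of the beamformer response h^H d towards steering vector d."""
    return abs(h.conj() @ d)

# Illustrative values only (assumptions, not the paper's setup).
M, delta, f = 8, 0.02, 2000.0
d_soi = steering_vector(M, delta, f, theta=np.deg2rad(0.0))
d_int = steering_vector(M, delta, f, theta=np.deg2rad(60.0))

# Assumed interference-plus-noise covariance: one interferer plus white sensor noise.
Phi_v = 10.0 * np.outer(d_int, d_int.conj()) + 0.1 * np.eye(M)

h_ds = ds_weights(d_soi)
h_mvdr = mvdr_weights(Phi_v, d_soi)

print("DS   response to SOI / interferer:", response(h_ds, d_soi), response(h_ds, d_int))
print("MVDR response to SOI / interferer:", response(h_mvdr, d_soi), response(h_mvdr, d_int))
```

Both filters pass the SOI undistorted, while only the MVDR filter uses the covariance matrix to steer attenuation towards the interferer. This is why its quality depends on how well that matrix is estimated.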
As is apparent, the performance of both fixed and adaptive beamformers depends on the accuracy of the DOA estimate. An adaptive beamformer also depends on the accuracy of the estimated data statistics. Consequently, much effort has gone into improving beamformer performance through better estimation of the steering vector and/or quicker and more accurate tracking of the second-order statistics of the data (Reed, Mallett, Brennan, 1974; Asl, Mahloojifar, 2012; Zaharis, Yioultsis, 2011; Asl, Mahloojifar, 2010; Zhang, Liu, Leng, Wang, Shi, 2016; Feng, Liao, Xu, Zhu, Zeng, 2018; Landau, de Lamare, Haardt, 2014; Jia, Jin, Zhou, Yao, 2013; Gu, Leshem, 2012; Gu, Goodman, Hong, Li, 2014; Khabbazibasmenj, Vorobyov, Hassanien, 2012; Yang, Liao, Li, Lei, Wang, 2017; Ke, Zheng, Peng, Li, 2017; Yuan, Gan, 2017; Huang, Zhang, Xu, Ye, 2015; Liao, Guo, Huang, Li, So, 2017). Concurrently, recent technological advances are driving the design of high-density sensor arrays, consisting of a large number of sensors, to obtain better performance (Weinstein et al., 2007). Such developments bring the challenge of accurately estimating the second-order statistics from limited data, and of processing that information efficiently. These challenges have led to innovative refinements of the conventional adaptive algorithms used for efficient implementation of beamforming filters, popular examples being the multi-stage Wiener (MSW) filter, reduced-rank linearly constrained minimum variance (RRLCMV) beamforming, and their widely-linear variants (Wang, 2009; Honig, Goldstein, 2002; Burykh, Abed-Meraim, 2002; Santos, Zoltowski, 2004; De Lamare, Sampaio-Neto, 2007; de Lamare, Wang, Fa, 2010; Chevalier, Blin, 2007; Chevalier, Delmas, Oukaci, 2009; Song, Alokozai, de Lamare, Haardt, 2014; Zheng, Deleforge, Li, Kellermann, 2018).
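The difficulty of estimating second-order statistics from limited data can be seen in a small numerical sketch. With N snapshots, the sample covariance matrix of an M-sensor array has rank at most N, so for N < M it is singular and cannot be inverted as an MVDR-type filter requires, whereas a smaller array observed for the same duration yields a full-rank estimate. The array sizes, snapshot count, white-noise data, and helper names below are assumptions chosen only to illustrate this dimensional argument.

```python
import numpy as np

rng = np.random.default_rng(0)

def complex_white_noise(M, N):
    """M x N matrix of unit-variance circular complex Gaussian samples (illustrative data)."""
    return (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)

def sample_covariance(Y):
    """Sample covariance of Y with shape (M sensors, N snapshots)."""
    return (Y @ Y.conj().T) / Y.shape[1]

N = 16          # number of available snapshots (assumed)
M = 32          # large array: more sensors than snapshots
M_small = 8     # smaller array: fewer sensors than snapshots

Phi_large = sample_covariance(complex_white_noise(M, N))
Phi_small = sample_covariance(complex_white_noise(M_small, N))

rank_large = np.linalg.matrix_rank(Phi_large)
rank_small = np.linalg.matrix_rank(Phi_small)

print(f"{M}x{M} estimate: rank {rank_large} (singular)")
print(f"{M_small}x{M_small} estimate: rank {rank_small} (invertible)")
```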
It is worth noting that this work is not another refinement of adaptive filtering algorithms. Rather, it provides a theoretical framework for tackling the above-mentioned challenges at a higher level of abstraction, i.e., at the beamformer/filter design level. We propose a methodology to mathematically separate a large array of sensors into two smaller arrays using the Kronecker product (Van Loan, 2000; Schäcke). We then show how to obtain new beamformers for the original array by combining the traditional beamformers of the smaller sub-arrays. The practical implementation of the proposed beamformers may be carried out using suitable existing adaptive algorithms, or new algorithms (based on RLS, LMS, etc.) that are better suited to the proposed framework may be developed. That is beyond the scope of this work and may be dealt with separately. As such, this work must not be confused or compared with adaptive filtering algorithms such as the MSW, RRLCMV, etc.
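To make the separation idea concrete, the sketch below uses a standard identity for ULAs: when M = M1·M2, the M-element steering vector equals the Kronecker product of an M1-element steering vector with spacing M2·δ and an M2-element steering vector with spacing δ. The paper's own decomposition is not reproduced in this excerpt, so the factorisation, the variable names, and the numerical values here are assumptions for illustration. The example also checks that a global filter built as h = h1 ⊗ h2 has a response equal to the product of the two sub-array responses.

```python
import numpy as np

def steering_vector(M, delta, f, theta, c=340.0):
    """Far-field ULA steering vector (assumed convention: angle from the array axis)."""
    m = np.arange(M)
    return np.exp(-2j * np.pi * f * m * delta * np.cos(theta) / c)

# Illustrative values only (assumptions, not the paper's setup).
M1, M2 = 4, 3
M = M1 * M2
delta, f, theta = 0.02, 2000.0, np.deg2rad(30.0)

d_full = steering_vector(M, delta, f, theta)      # original M-sensor ULA
d1 = steering_vector(M1, M2 * delta, f, theta)    # coarse sub-array, spacing M2*delta
d2 = steering_vector(M2, delta, f, theta)         # fine sub-array, spacing delta
d_kron = np.kron(d1, d2)

# Sub-array filters (DS here; either one could instead be an adaptive filter).
h1 = d1 / M1
h2 = d2 / M2
h = np.kron(h1, h2)                               # global filter for the full array

gain_full = h.conj() @ d_full
gain_factored = (h1.conj() @ d1) * (h2.conj() @ d2)

print("steering vectors match:", np.allclose(d_kron, d_full))
print("global response:", gain_full, "product of sub-array responses:", gain_factored)
```

Because the global response factors in this way, the two short filters can be designed separately, for example one fixed and one adaptive, and only vectors of length M1 and M2 need to be estimated rather than one of length M.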
Using the traditional MVDR and DS beamformers as examples, we show how the proposed theoretical framework can make beamforming for large arrays more efficient and effective. As will be illustrated in this work, the proposed variants of traditional frequency-domain beamformers are more robust to unstable or erroneous estimates of the data statistics and to various levels of interference. It is assumed that the DOA of the SOI is known. The interferences are broadband noise signals taken from the NOISEX-92 database (Varga and Steeneken, 1993), and the sensor noise is modeled as Gaussian white noise. Inspired by our recent works (Cohen, Benesty, Chen, 2019; Yang, Huang, Benesty, Cohen, Chen, 2019) that use the Kronecker product in differential (fixed) beamforming, this work examines its utility in adaptive beamforming. The framework also allows fixed and adaptive beamforming to be combined to generate hybrid beamformers. Uniform linear arrays (ULAs) (Wang, 2009; Benesty, Cohen, Chen, 2017; Benesty, Huang, 2013; Chandran, 2013) are used in our study.
At this juncture, we must also note that adaptive beamforming has generally been applied to narrowband modulated signals used in communication technologies (Wang, 2009; Honig, Goldstein, 2002; Burykh, Abed-Meraim, 2002; Santos, Zoltowski, 2004; De Lamare, Sampaio-Neto, 2007; de Lamare, Wang, Fa, 2010; Chevalier, Blin, 2007; Chevalier, Delmas, Oukaci, 2009; Song, Alokozai, de Lamare, Haardt, 2014; Zheng, Deleforge, Li, Kellermann, 2018). Such signals are nearly sinusoidal and quasi-stationary, and hence time-domain beamforming is suitable. However, for non-stationary broadband signals such as speech, frequency-domain beamforming may be more relevant (Benesty, Cohen, Chen, 2018; Benesty, Chen, Huang, 2008; Ming Zhang, Er, 1995). Moreover, while microphone array beamforming has traditionally been confined to near-field applications (Brandstein, Ward, 2013; Ryan, Goubran, 1997; Thomas, Verburgh, Catrysse, Botteldooren, 2017), developments in sensor technology now make it possible to employ large microphone arrays in far-field applications, such as surveillance in crowded environments, trading in large open halls, and other distant speech and speaker recognition tasks (Kumatani, McDonough, Raj, 2012; Taghizadeh, Garner, Bourlard, 2012; AlShehhi, Hammadih, Zitouni, AlKindi, Ali, Weruaga). This work is devoted to beamforming for such evolving and futuristic applications.
The rest of this work is organized as follows. Section 2 formulates the beamforming problem, defines the various statistical and non-statistical performance metrics, and presents the traditional DS and MVDR beamformers. Section 3 describes the Kronecker-product-based beamforming methodology. Section 4 presents the experimental results and compares the proposed beamformers with their traditional counterparts. Section 5 summarizes and concludes this work.
Section snippets
Signal model and conventional filters
We consider an arbitrary ULA of M sensors, as shown in Fig. 1, with the sensors located at arbitrary positions denoted by where δ is the inter-sensor distance. A discrete-time SOI, x(t), where t denotes the discrete-time index, impinges on the ULA as a plane-wave in the far-field, traveling at the velocity of sound, c, through the medium. Similarly, K independent interferences, impinge on the ULA. We consider the DOA of the SOI as θd, and the DOAs of
Kronecker product beamforming
It is obvious from (20) and (21) that the performance of the MVDR beamformer depends significantly on the accuracy of the M-dimensional square covariance matrix, Φv(f, r) or Φy(f, r). As the number of sensors, M, increases, more data is required to reliably estimate such
Experimental performance
In this section, we evaluate the performances of the proposed beamformers in comparison with the conventional MVDR and DS beamformers. It has been observed that the KP-MVDR beamformer saturates rapidly with respect to the number of iterations, N. Hence, will be used in this work for the KP-MVDR beamformer. For our experiments, we utilize a speech signal as the SOI, and four noise signals, taken from the NOISEX-92 database (Varga and Steeneken, 1993), as interferences (of different variances
Conclusions
We have introduced a new approach to frequency-domain adaptive beamforming for large sensor arrays, with the purpose of achieving enhanced robustness to interference and statistical instability. First, the original ULA is represented by two smaller VULAs, which are connected by the Kronecker product. As the VULAs are smaller than the original ULA, adaptive beamformers can be derived from them using less data for statistical computations. Their smaller size also makes them robust to errors
CRediT authorship contribution statement
Rajib Sharma: Methodology, Software, Validation, Formal analysis, Investigation, Writing - original draft. Israel Cohen: Conceptualization, Methodology, Supervision, Writing - review & editing, Funding acquisition. Jacob Benesty: Conceptualization, Methodology, Supervision, Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgment
The authors thank Dr. Gongping Huang for his valuable inputs and suggestions.
References (52)
- Robust adaptive beamforming based on interference covariance matrix sparse reconstruction. Signal Process. (2014)
- Robust adaptive beamforming based on a new steering vector estimation algorithm. Signal Process. (2013)
- Adaptive reduced-rank LCMV beamforming algorithms based on joint iterative optimization of filters: design and analysis. Signal Process. (2010)
- Empirical mode decomposition for adaptive AM-FM analysis of speech: a review. Speech Commun. (2017)
- Assessment for automatic speech recognition: II. NOISEX-92: a database and an experiment to study the effect of additive noise on speech recognition systems. Speech Commun. (1993)
- Robust adaptive beamforming via a novel subspace method for interference covariance matrix reconstruction. Signal Process. (2017)
- AlShehhi, A., Hammadih, M. L., Zitouni, M. S., AlKindi, S., Ali, N., Weruaga, L., 2017. Linear and circular microphone...
- Eigenspace-based minimum variance beamforming applied to medical ultrasound imaging. IEEE Trans. Ultrason. Ferroelectr. Freq. Control (2010)
- A low-complexity adaptive beamformer for ultrasound imaging using structured covariance matrix. IEEE Trans. Ultrason. Ferroelectr. Freq. Control (2012)
- Microphone Array Signal Processing (2008)
- Fundamentals of Signal Enhancement and Array Signal Processing
- Fundamentals of Signal Enhancement and Array Signal Processing
- Adaptive Signal Processing: Applications to Real-World Problems
- Springer Handbook of Speech Processing
- Microphone Arrays: Signal Processing Techniques and Applications
- Reduced-rank adaptive filtering using Krylov subspace. EURASIP J. Appl. Signal Process.
- Adaptive Antenna Arrays: Trends and Applications
- Widely linear MVDR beamformers for the reception of an unknown signal corrupted by noncircular interferences. IEEE Trans. Signal Process.
- Optimal widely linear MVDR beamforming for noncircular signals. Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
- Differential Kronecker product beamforming. IEEE/ACM Trans. Audio Speech Lang. Process.
- Reduced-rank adaptive filtering based on joint iterative optimization of adaptive filters. IEEE Signal Process. Lett.
- Robust adaptive beamforming against large steering vector mismatch using multiple uncertainty sets. Signal Process.
- Robust adaptive beamforming based on interference covariance matrix reconstruction and steering vector estimation. IEEE Trans. Signal Process.
- Adaptive reduced-rank interference suppression based on the multistage Wiener filter. IEEE Trans. Commun.
- Robust adaptive beamforming with a novel interference-plus-noise covariance matrix reconstruction method. IEEE Trans. Signal Process.