Article

A Robust Steered Response Power Localization Method for Wireless Acoustic Sensor Networks in an Outdoor Environment

1
Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
2
University of Chinese Academy of Sciences, Beijing 100049, China
*
Author to whom correspondence should be addressed.
Sensors 2021, 21(5), 1591; https://doi.org/10.3390/s21051591
Submission received: 19 January 2021 / Revised: 19 February 2021 / Accepted: 19 February 2021 / Published: 25 February 2021
(This article belongs to the Special Issue Sensor Fusion and Signal Processing)

Abstract

The localization of outdoor acoustic sources has attracted attention in wireless sensor networks. In this paper, the steered response power (SRP) localization of band-pass signals under steering time delay uncertainty and coarse spatial grids is considered. We propose a modified SRP-based source localization method that enhances the localization robustness in outdoor scenarios. In particular, we derive a sufficient condition, dependent on the generalized cross-correlation (GCC) waveform function, for robust on-grid source localization and show that the SRP function with GCCs satisfying this condition can suppress the disturbances induced by the grid distance and the uncertain steering time delays. A GCC refinement procedure for band-pass GCCs is then designed, which uses complex wavelet functions in multiple sub-bands to filter the GCCs and averages the envelopes of the filtered GCCs as the equivalent GCC to match the sufficient condition. Simulation results and field experiments demonstrate that the proposed method outperforms the existing SRP-based methods.

1. Introduction

With the rapid development of communication technology and mobile computing devices, applications of wireless acoustic sensor networks (WASNs) are becoming popular in acoustic signal processing. Particularly, WASN-based sound source localization has captured researchers’ attention in the last two decades [1,2,3,4,5]. The existing methods available for passive source localization in WASNs include (1) the received energy-based approaches [6,7,8,9]; (2) the direction of arrival (DOA)-based approaches [10,11]; (3) the time of arrival (TOA)-based approaches [12]; (4) the time difference of arrival (TDOA)-based approaches [13,14,15] and (5) the steered response power (SRP)-based approaches [16,17,18,19,20,21,22].
Most methods require a pre-processing stage in which specific modalities are measured from the sensor signals before the location-estimating stage. In contrast, the SRP-based approaches locate the source position or direction by maximizing the output power of a spatially steered filter-and-sum beamformer over a group of sensors, so that only one decision step is involved in processing the sensor signals to estimate the location. Because they avoid the information compression and the disturbances caused by partial mistakes in a front-end stage, the SRP-based solutions usually yield more robust performance in noisy and reverberant acoustic environments. Practical implementations commonly use the generalized cross-correlation (GCC) [23]-based form of the SRP function [16] to reduce computation. Methods similar to the GCC expression of the SRP function are also called the “global coherence field (GCF)” in several references [24,25].
In practice, the primary constraint of the SRP-based approaches is the time-consuming on-grid search for their global maxima. Hence, reducing the computational cost of the SRP-based approaches has been an active research topic. In [17], a stochastic region contraction (SRC) method is proposed to avoid global grid searching. However, this strategy also causes information loss. In [26], a geometrically sampled grid set based on the TDOA gradient is proposed to improve the SRP performance. An alternative strategy for the high-cost searching problem is to adapt the SRP function to the grid resolution so that a coarse or hierarchical search can be applied. In [27], the authors use the low-frequency component of the GCC for coarse grid resolution and the high-frequency component for fine grids in SRP-based DOA estimation. In [28], the authors apply a Gaussian low-pass filter to the GCC for coarse grids. For full-band signals, a similar kind of modification is proposed both for microphone arrays [29] and for WASNs [18,19], in which the spatial spectrum of a given grid is calculated from the sum of the phase-transform (PHAT) weighted GCCs (GCC-PHATs) within a time window containing the TDOA values in the volume surrounding the grid, instead of the original GCC-PHAT in the SRP function.
The SRP-based approaches can provide a robust solution for DOA estimation and source localization in confined spaces. However, they can lose their robustness in an outdoor WASN scenario because of the combined effect of the following factors. (1) Grid size: the monitoring area in outdoor cases may be much larger than in indoor applications, so the practical searching grids are much coarser (e.g., meter-level grids outdoors compared with centimeter-level grids indoors). (2) Steering time delay uncertainty: in the classical SRP-based localization framework, the steering time delay at a given position is generated from an ideal propagation model and is always assumed to be exact. However, the steering time delay to the source position differs from the actual propagation time. This difference is no longer negligible in the outdoor environment and causes a defocus effect, even when the WASN system is well synchronized. (3) Signal passband: when processing acoustic data collected outdoors, high-pass or band-pass filtering is indispensable because the environmental noise is intense in the low-frequency range, and real-world source signals often have a band-pass characteristic. The combined effect of these three factors makes it difficult to achieve stable localization results. The Modified-SRP functional (MSRP) method introduced in [18,19] provides an elegant solution for scalable grids, but it is not suitable for band-pass signals. In [21], the authors elaborate on the SRP in band-pass situations and use the GCC-PHAT envelope or the frequency-shifted GCC-PHAT to enhance the robustness in such situations. Nevertheless, neither of these two approaches addresses all three factors together; in particular, the steering time uncertainty is hardly considered.
In [30], the authors propose a Frequency-Sliding GCC (FSGCC) method, which applies singular value decomposition (SVD) or weighted SVD (WSVD) to the FSGCC matrix and can extract the time delay information of the source signal from multiple sub-band GCCs. The authors apply the WSVD-FSGCC to the MSRP functional for source localization. This solution can provide excellent localization performance in the band-pass situation with scalable grids. However, in outdoor applications, the high computational cost of the SVD of very large matrices is inevitable because of the long GCC range.
Previously, several acoustic source localization approaches have been proposed for outdoor scenarios. They mostly focus on localizing the source from TDOA [31] and DOA [32,33] measurements. Some uncertainties are then introduced by the estimation errors of the TDOA or DOA estimation algorithms. Moreover, some useful information is compressed, which results in unstable performance. For this reason, this paper discusses a robust SRP-based outdoor source localization problem.
In this paper, a modified SRP-based method is proposed, in which the systematic influence of the above inevitable factors in outdoor WASNs scenarios is considered. The localization performance is analyzed using the normalized contribution of the signal components in the SRP function. A sufficient condition dependent on the GCC waveform function for robust on-grid SRP-based source localization is derived by geometrical analysis. The SRP function using GCCs satisfying this condition can suppress the disturbances induced by the grid distance and the uncertain steering time delay. A GCC refinement procedure for band-pass GCCs is then designed, which uses the complex wavelet functions in multiple sub-bands to filter the GCC and averages the envelopes of the filtered GCCs as the equivalent GCC to match the sufficient condition. Simulation results and field experiments demonstrate the excellent performance of the proposed method against the existing SRP-based methods.
The rest of this paper is organized as follows. In Section 2, the outdoor SRP-based source localization problem is formulated. Section 3 gives the sufficient condition in brief and introduces the GCC refinement procedure. The results of the simulation and the field experiment are presented in Section 4. Conclusions are given in Section 5.

2. SRP-Based Localization in Outdoor Acoustic Sensor Network

2.1. System Models

We discuss the acoustic source localization problem in an $N$-dimensional Euclidean space with $M$ distributed microphones ($M > N$). Let $\mathbf{x} \in \mathbb{R}^N$ be a spatial coordinate vector. Specifically, define $\mathbf{x}_s$ as the source location and $\mathbf{z}_m$ as the position of the $m$th sensor ($m = 1, 2, \ldots, M$). Let $s(t)$ be the source signal in the time domain; the received signal of the microphone at $\mathbf{z}_m$ can then be modeled as
$$y_m[n] = \left[ h_m(t) * s(t) + w_m(t) \right] \delta(t - n/F_s), \tag{1}$$
where $h_m(t)$ is the impulse response function representing the propagation of sound from $\mathbf{x}_s$ to $\mathbf{z}_m$, the operator “$*$” represents the convolution operation, $w_m(t)$ stands for the additive noise signal, and $\delta(t - n/F_s)$ denotes the sampling process at rate $F_s$. When the multi-path delay and non-linear distortion are neglected, the propagation function in the frequency domain can be simplified as
$$H_m(\omega) = A_m e^{-j\omega t_m}, \tag{2}$$
where $A_m \in \mathbb{R}$ is the amplitude-attenuation factor and $t_m$ is the time delay factor. In the frequency domain, Equation (1) can be written as
$$Y_m(\Omega) = A_m S(\Omega) e^{-j\Omega F_s t_m} + W_m(\Omega), \tag{3}$$
where $\Omega = \omega / F_s \in [-\pi, \pi]$ is the normalized angular frequency, $Y_m(\Omega)$ is the discrete-time Fourier transform (DTFT) of $y_m[n]$, and $S(\Omega)$ and $W_m(\Omega)$ are the Fourier transforms of $s(t)$ and $w_m(t)$, respectively.
Let $\eta_m(\mathbf{x}) \in \mathbb{R}$ be the steering time delay function describing the time delay associated with sound propagation from a given location $\mathbf{x}$ to $\mathbf{z}_m$. In practice, it is commonly modeled as the sound travel time along the line-of-sight (LOS) path with a constant sound speed $v_s$; i.e.,
$$\eta_m(\mathbf{x}) = \| \mathbf{x} - \mathbf{z}_m \| / v_s, \tag{4}$$
where “$\| \cdot \|$” denotes the Euclidean norm. Note that $\eta_m(\mathbf{x})$ is not exactly the sound propagation time in reality. The SRP function, which is defined as the output power of the filter-and-sum beamformer, is then given by
$$P(\mathbf{x}) = \int_{-\pi}^{\pi} \left| \sum_{m=1}^{M} G_m(\Omega) Y_m(\Omega) e^{j\Omega F_s \eta_m(\mathbf{x})} \right|^2 d\Omega, \tag{5}$$
where $G_m(\Omega) e^{j\Omega F_s \eta_m(\mathbf{x})}$ is the filter associated with the $m$th sensor. It can be equivalently expressed in terms of GCCs [16]:
$$P(\mathbf{x}) = 2\pi \sum_{l=1}^{M} \sum_{m=1}^{M} R_{l,m}\left( \eta_m(\mathbf{x}) - \eta_l(\mathbf{x}) \right), \tag{6}$$
where
$$R_{l,m}(\tau) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Psi_{l,m}(\Omega) Y_l(\Omega) Y_m^{*}(\Omega) e^{-j\Omega F_s \tau} d\Omega \tag{7}$$
denotes the GCC of the sensor pair $\{l, m\}$, $\tau$ is the time lag, the superscript “$(\cdot)^{*}$” represents the conjugate operation, and $\Psi_{l,m}(\Omega) = G_l(\Omega) G_m^{*}(\Omega)$ denotes the weight function of the associated GCC. Ideally, each $R_{l,m}(\tau)$ achieves its peak at $\tau = t_m - t_l$, so that the SRP function is supposed to achieve its maximum value at the source position $\mathbf{x}_s$, as shown in Figure 1a,b. The Phase Transform (PHAT) weight function
$$\Psi_{l,m}^{PHAT}(\Omega) = \frac{1}{\left| Y_l(\Omega) Y_m^{*}(\Omega) \right|} \tag{8}$$
is widely used in the TDOA- and SRP-based localization applications. The PHAT-weighted GCC is generally referred to as the GCC-PHAT, and the SRP using the GCC-PHAT is generally referred to as the SRP-PHAT.
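As an illustration of the PHAT-weighted GCC described above, the following is a minimal NumPy sketch rather than the authors' implementation. The function name `gcc_phat`, the regularization constant `eps`, the zero-padding choice, and the synthetic test signal are all assumptions made for this example.

```python
import numpy as np

def gcc_phat(y_l, y_m, max_lag, eps=1e-12):
    """Return integer lags and the GCC-PHAT of the sensor pair {l, m}.

    The peak is expected at lag t_m - t_l (in samples), as stated in the text.
    """
    n_fft = 2 * max(len(y_l), len(y_m))        # zero-padding avoids circular wrap-around
    Y_l = np.fft.rfft(y_l, n_fft)
    Y_m = np.fft.rfft(y_m, n_fft)
    # conj(Y_l) * Y_m is used so that, with NumPy's inverse FFT,
    # the correlation peak falls at t_m - t_l.
    cross = np.conj(Y_l) * Y_m
    cross /= np.abs(cross) + eps               # PHAT weighting keeps only the phase
    r = np.fft.irfft(cross, n_fft)
    r = np.concatenate((r[-max_lag:], r[:max_lag + 1]))  # lags -max_lag..max_lag
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, r

# Synthetic check: sensor m receives the same signal 7 samples later than sensor l.
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
delay = 7                                      # t_m - t_l in samples (assumed value)
y_l = np.concatenate((s, np.zeros(delay)))
y_m = np.concatenate((np.zeros(delay), s))
lags, r = gcc_phat(y_l, y_m, max_lag=32)
tdoa_samples = lags[np.argmax(r)]
```

For a full-band signal the result is close to a unit impulse at the true lag, which is the sharp GCC shape that the later sections contrast with the band-pass case.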
Removing the irrelevant and repeated terms in Equation (6), the effective component for source localization can be simplified as
$$P_E(\mathbf{x}) = \sum_{l=1}^{M-1} \sum_{m=l+1}^{M} R_{l,m}\left( \eta_m(\mathbf{x}) - \eta_l(\mathbf{x}) \right) = \sum_{p=1}^{C_M^2} R_p\left( \tau_p(\mathbf{x}) \right), \tag{9}$$
where $p$ is the sequence number of the valid sensor pair $c_p = \{l, m\}$ ($l < m$), given by $p = (2M - l)(l - 1)/2 + m - l$ and varying from one to the combinatorial number $C_M^2$; $\tau_p(\mathbf{x}) = \eta_m(\mathbf{x}) - \eta_l(\mathbf{x})$ is referred to as the steering TDOA function.
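The pair numbering and the steering TDOA function can be checked with a short sketch. The helper names `pair_index` and `steering_tdoas` and the three-sensor layout are hypothetical and are used only for illustration; sensors are indexed from 1 as in the text.

```python
from itertools import combinations
from math import comb
import numpy as np

def pair_index(l, m, M):
    """Sequence number p of the pair {l, m} with 1 <= l < m <= M."""
    # (2M - l)(l - 1) is always even, so integer division is exact.
    return (2 * M - l) * (l - 1) // 2 + m - l

def steering_tdoas(x, sensors, v_s=345.0):
    """Steering TDOAs tau_p(x) = eta_m(x) - eta_l(x), ordered by p."""
    eta = np.linalg.norm(sensors - x, axis=1) / v_s    # line-of-sight delays
    M = len(sensors)
    return np.array([eta[m - 1] - eta[l - 1]
                     for l, m in combinations(range(1, M + 1), 2)])

M = 8
n_pairs = comb(M, 2)
indices = [pair_index(l, m, M) for l, m in combinations(range(1, M + 1), 2)]

sensors = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]])  # hypothetical layout (m)
taus = steering_tdoas(np.array([50.0, 0.0]), sensors)         # point midway between sensors 1 and 2
```

Enumerating the pairs in lexicographic order reproduces the indices 1 to $C_M^2$, and the first steering TDOA vanishes because the test point is equidistant from sensors 1 and 2.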

2.2. Problem Formulation

The classical SRP-based localization method often lacks robustness in outdoor scenarios. The steering time delay function $\eta_m(\mathbf{x})$ in the SRP function differs from the sound propagation time in reality, denoted as $\eta_m^0(\mathbf{x})$, and $\Delta\eta_m(\mathbf{x}) = \eta_m(\mathbf{x}) - \eta_m^0(\mathbf{x})$ is called the steering time-uncertainty function. Similarly, the steering TDOA-uncertainty function of a pair of sensors can be expressed as
$$\Delta\tau_p(\mathbf{x}) = \Delta\eta_m(\mathbf{x}) - \Delta\eta_l(\mathbf{x}) = \tau_p(\mathbf{x}) - \tau_p^0(\mathbf{x}), \tag{10}$$
where $\tau_p^0(\mathbf{x}) = \eta_m^0(\mathbf{x}) - \eta_l^0(\mathbf{x})$ represents the real steering TDOA function for a given sensor pair $c_p$. This term is usually negligible within a confined space, so it has rarely been discussed in classical SRP models. However, in outdoor applications, the sound propagation is much more unpredictable, and the uncertainty grows with distance. The steering time uncertainty is easily influenced by the geography, temperature, wind, and self-localization error among sensors, and it yields a noticeable defocus effect on the SRP map, as shown in Figure 1c: the GCCs intersect with each other dispersedly around $\mathbf{x}_s$.
Since the spatial spectrum generated by the SRP function contains many local extrema and ridged areas, the maximal value of $P(\mathbf{x})$ is usually found through a grid-searching process. Consider a uniform sampling grid (USG) in $\mathbb{R}^N$. Define $X_g$ as the set of grid points in the candidate searching region ($V \subset \mathbb{R}^N$), and $d_g \in \mathbb{R}$, $N_g \in \mathbb{R}$ as the grid distance and the total number of grid points in $X_g$, respectively. The estimated on-grid location is then formulated as
$$\hat{\mathbf{x}}_s = \arg\max_{\mathbf{x} \in X_g} P(\mathbf{x}) = \arg\max_{\mathbf{x} \in X_g} P_E(\mathbf{x}). \tag{11}$$
Note that the localization precision depends on the grid resolution. A more accurate estimate usually requires a smaller $d_g$. This leads to a larger $N_g$ and a significantly increased computational burden, because the number of grid points is inversely proportional to the $N$th power of $d_g$ (i.e., $N_g \propto (d_g)^{-N}$). Hence, accuracy and feasibility can hardly be balanced in an outdoor WASN system with a large search region, for which the finest grid resolution allowed by the available computing power is much coarser than in indoor applications. However, most SRP approaches work well only at fine grid resolutions, and a coarser grid resolution causes an undersampling effect, as shown in Figure 1d. The searching process would then probably miss the source peak.
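To make the scaling of the grid count concrete, the following sketch counts the points of a uniform grid over a square region. The helper name `grid_count` is hypothetical; the 200 m side length and the two grid distances (10 m and 0.1 m) are the values used later in the simulation section.

```python
import math

def grid_count(side_m, d_g, n_dim):
    """Number of uniform grid points covering a cube of the given side length."""
    per_axis = math.floor(side_m / d_g + 1e-9) + 1   # small tolerance guards against round-off
    return per_axis ** n_dim

coarse = grid_count(200.0, 10.0, 2)   # 21 x 21 points
fine = grid_count(200.0, 0.1, 2)      # 2001 x 2001 points
```

Refining the grid distance by a factor of 100 in two dimensions multiplies the number of candidate points by roughly $10^4$, which is why a meter-level grid is the practical choice over such an area.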
It is known that the background noise always dominates at low frequencies in the field environment, and real sound sources often show band-pass characteristics. Thus a band-pass GCC is indeed required. However, the SRP-PHAT with a band-pass source would cause a rippling effect [21], as shown in Figure 1e. The rippling effect does not alter the location of the maximal value of the SRP function. However, it may lead to local extrema and even fake peaks such that the SRP spectrum is susceptible to the two other factors and shows a lack of robustness.
Under the combined effect of the above inevitable factors, the real-world SRP output looks as illustrated in Figure 1f. Classical SRP implementations can hardly deal with all these factors outdoors and yield divergent localization results.

3. A Robust Outdoor SRP-Based Source Localization Method

3.1. On-Grid SRP-Based Localization Error Bound Condition

It is known that the SRP-based spatial spectra mainly depend on the phase information of the source components. It is reasonable to assume that the additive noise signals of the sensors are independent of each other and of the source signal, so that they have no spatial preference (i.e., they have zero mean in the phase domain). Their contributions to the SRP spectrum can therefore be neglected and are unrelated to the grid resolution and the steering time uncertainty. Hence, only the contribution of the source signal is considered in analyzing the SRP function. With the additive noise terms $w_m(t)$ neglected, the weight function $\Psi_p(\Omega)$ of the sensor pair $c_p$ can usually be expressed as
$$\Psi_p(\Omega) = B_p \Psi_0(\Omega), \tag{12}$$
where $B_p \in \mathbb{R}$ is an amplitude-scaling factor independent of the frequency, and $\Psi_0(\Omega) = \Psi_0(-\Omega) \in \mathbb{R}$ is a real function independent of the sensors. Substituting Equation (12) into Equation (7), the GCC $R_p(\tau)$ can be rewritten as
$$R_p(\tau) = \frac{B_p A_l A_m}{2\pi} \int_{-\pi}^{\pi} \Psi_0(\Omega) S(\Omega) S^{*}(\Omega) e^{-j\Omega F_s (\tau - \tau_p^0(\mathbf{x}_s))} d\Omega = \frac{B_p A_l A_m C_0}{2\pi} R_0\left( \tau - \tau_p^0(\mathbf{x}_s) \right), \tag{13}$$
where $C_0 = \max_{\tau} \int_{-\pi}^{\pi} \Psi_0(\Omega) S(\Omega) S^{*}(\Omega) e^{-j\Omega F_s \tau} d\Omega$, and
$$R_0(\tau) = \frac{1}{C_0} \int_{-\pi}^{\pi} \Psi_0(\Omega) S(\Omega) S^{*}(\Omega) e^{-j\Omega F_s \tau} d\Omega \tag{14}$$
is the amplitude-normalized version of the weighted autocorrelation function of the source signal $s(t)$. Hence, each GCC contains the same waveform function $R_0(\tau)$ with a different time shift $\tau_p^0(\mathbf{x}_s)$ and a different amplitude factor $B_p A_l A_m C_0 / (2\pi)$. In practice, the range information carried by the amplitude is usually less stable and less accurate than that carried by the time delay. Thus, a normalized mapping function representing the contribution of the source component in the SRP function can be constructed as
$$F_E(\mathbf{x}, \mathbf{x}_s) = \frac{1}{C_M^2} \sum_{p=1}^{C_M^2} R_0\left( \tau_p(\mathbf{x}) - \tau_p^0(\mathbf{x}_s) \right). \tag{15}$$
In the above equation, the amplitude factors of the different sensor pairs are removed, so each pair contributes equally to the SRP function. Note that $F_E(\mathbf{x}, \mathbf{x}_s) \in [-1, 1]$ has a definite value range regardless of the sensor number $M$.
For a given grid distance $d_g \in \mathbb{R}_{>0}$, an arbitrary uniform sampling grid set in $\mathbb{R}^N$ can be expressed as
$$X(d_g, \mathbf{x}_{go}) = \left\{ \mathbf{x} + \mathbf{x}_{go} : \mathbf{x} = [n_1 d_g, \ldots, n_N d_g]^T ;\; n_1, \ldots, n_N \in \mathbb{Z} \right\}, \tag{16}$$
where $\mathbf{x}_{go} \in \mathbb{R}^N$ is the position of the origin of the set. Then the on-grid location estimate is given by
$$\hat{\mathbf{x}}_{sg} = \arg\max_{\mathbf{x} \in X(d_g, \mathbf{x}_{go})} F_E(\mathbf{x}, \mathbf{x}_s) = \arg\max_{\mathbf{x} \in X(d_g, \mathbf{x}_{go})} \frac{1}{C_M^2} \sum_{p=1}^{C_M^2} R_0\left( \tau_p(\mathbf{x}) - \tau_p(\mathbf{x}_s) + \Delta\tau_p(\mathbf{x}) \right). \tag{17}$$
It is worth pointing out that the grid resolution, the steering time uncertainty, and the band-pass issue are all taken into account in the simplified SRP function above.
The grid issue should be unrelated to the origin position $\mathbf{x}_{go}$. In the real world, the uncertainty functions $\Delta\tau_p(\mathbf{x})$ are hard to describe precisely because of the many interference factors, and it is reasonable to assume that they have an upper bound $\Delta\tau_{max}$ (i.e., $|\Delta\tau_p(\mathbf{x})| \le \Delta\tau_{max}$). $\Delta\tau_{max}$ indicates the steering time delay uncertainty level and can be estimated from the environmental and device conditions. Thus, the robustness of the on-grid localization problem can be described as follows: given a $d_g$ and a $\Delta\tau_{max}$, there exists an $\varepsilon \in (0, \infty)$ such that
$$\| \hat{\mathbf{x}}_{sg} - \mathbf{x}_s \| \le \varepsilon. \tag{18}$$
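Define a level-passed area based on $F_E(\mathbf{x}, \mathbf{x}_s)$:
$$\mathcal{M}(\alpha, \mathbf{x}_s) \triangleq \left\{ \mathbf{x} : F_E(\mathbf{x}, \mathbf{x}_s) \ge \alpha \right\} \subset \mathbb{R}^N, \tag{19}$$
where $\alpha \in \mathbb{R}$ is the level-pass threshold. Then a sufficient condition can be obtained in the following proposition:
Proposition 1.
If $\mathcal{M}(\alpha, \mathbf{x}_s) \cap X(d_g, \mathbf{x}_{go}) \ne \emptyset$ and $\mathcal{M}(\alpha, \mathbf{x}_s)$ is a bounded set (i.e., there exists an $\varepsilon_M \in (0, \infty)$ such that $\| \mathbf{x}_1 - \mathbf{x}_2 \| \le \varepsilon_M$ for all $\mathbf{x}_1, \mathbf{x}_2 \in \mathcal{M}(\alpha, \mathbf{x}_s)$), then Inequality (18) is satisfied.
The proof is given in Appendix A.1. Thus, the robustness of the on-grid source localization problem can be analyzed in terms of $\mathcal{M}(\alpha, \mathbf{x}_s)$.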
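A practical example of $\mathcal{M}(\alpha, \mathbf{x}_s)$ is depicted in Figure 2; its area shrinks inwards as $\alpha$ increases. The first sub-condition ($\mathcal{M}(\alpha, \mathbf{x}_s) \cap X(d_g, \mathbf{x}_{go}) \ne \emptyset$) can be satisfied when $\mathcal{M}(\alpha, \mathbf{x}_s)$ covers a large enough area. The shape of $\mathcal{M}(\alpha, \mathbf{x}_s)$ depends on $\alpha$, $R_0(\tau)$, $\Delta\tau_p(\mathbf{x})$, and the sensor distribution, and it is generally irregular. Consider a closed ball $B_N(\mathbf{x}_0, r) \triangleq \{ \mathbf{x} : \| \mathbf{x} - \mathbf{x}_0 \| \le r ;\; \mathbf{x}_0, \mathbf{x} \in \mathbb{R}^N \}$ with center $\mathbf{x}_0$ and radius $r$. If
$$r \ge d_g \sqrt{N} / 2, \tag{20}$$
then $B_N(\mathbf{x}_0, r) \cap X(d_g, \mathbf{x}_{go}) \ne \emptyset$ is satisfied. The proof is given in Appendix A.2. Consequently, if $B_N(\mathbf{x}_s, d_g \sqrt{N} / 2) \subseteq \mathcal{M}(\alpha, \mathbf{x}_s)$, then the first sub-condition is satisfied.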
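The radius condition can be illustrated numerically: rounding each coordinate to the nearest grid line moves a point by at most $d_g/2$ per axis, hence by at most $d_g\sqrt{N}/2$ in Euclidean distance. The sketch below is a sanity check on random ball centers, not a substitute for the proof in the appendix; the helper name `nearest_grid_point` and the random test data are assumptions.

```python
import numpy as np

def nearest_grid_point(x0, d_g, origin):
    """Closest point of the uniform grid X(d_g, origin) to x0."""
    return origin + d_g * np.round((x0 - origin) / d_g)

rng = np.random.default_rng(1)
n_dim, d_g = 3, 10.0
origin = rng.uniform(-5.0, 5.0, n_dim)                 # arbitrary grid origin
centers = rng.uniform(-500.0, 500.0, (1000, n_dim))    # random ball centers
dists = np.array([np.linalg.norm(nearest_grid_point(c, d_g, origin) - c)
                  for c in centers])
radius = d_g * np.sqrt(n_dim) / 2                      # bound on the distance to the grid
```

Every sampled center has a grid point within `radius`, independently of the grid origin.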
Figure 3 illustrates a typical waveform of $R_0(\tau)$, the GCC-PHAT of the passband $[\Omega_C - \Omega_B, \Omega_C + \Omega_B] \subset (0, \pi]$, which can be expressed by
$$R_0^{PHAT\text{-}BP}(\tau) = \mathrm{sinc}\left( \frac{\Omega_B F_s}{\pi} \tau \right) \cos\left( \Omega_C F_s \tau \right). \tag{21}$$
A valid $R_0(\tau)$ is an even and bounded function (i.e., $R_0(\tau) = R_0(-\tau)$ and $R_0(\tau) \in [-1, 1]$) and contains a main lobe around $\tau = 0$, where its maximum $a_m$ lies. The maximum side-lobe height (or the maximum value outside the main-lobe area if $R_0(\tau)$ has no side-lobes) is denoted as $a_s$, where $a_s < a_m$.
Let us define a function based on $R_0(\tau)$ by
$$T_R(a_T) \triangleq \inf \{ |\tau| : R_0(\tau) < a_T \}, \tag{22}$$
where $a_T \in (a_s, a_m)$ is the level-pass threshold of the GCC and “$\inf\{\cdot\}$” represents the infimum. $T_R(a_T)$ represents the half-width of the level-passed section of $R_0(\tau)$ within its main lobe. It follows that $R_0(\tau) \ge a_T$ if and only if $\tau \in [-T_R(a_T), T_R(a_T)]$.
Based on the geometrical analysis in Appendix A.3, if $R_0(\tau)$ possesses the property
$$T_R(\alpha) \ge d_g \sqrt{N} / v_s + \Delta\tau_{max}, \tag{23}$$
then $\mathcal{M}(\alpha, \mathbf{x}_s) \supseteq B_N(\mathbf{x}_s, d_g \sqrt{N} / 2)$. Therefore, the first sub-condition can be satisfied.
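The main-lobe half-width of the band-pass GCC-PHAT waveform can be estimated numerically and compared with the required width $d_g\sqrt{N}/v_s + \Delta\tau_{max}$. The following is a rough sketch on a fine lag grid. The sampling rate of 2000 Hz, the threshold 0.5, and the helper names are assumptions; the passband $[0.15\pi, 0.4\pi]$, the sound speed, and the 10 m grid distance are taken from the simulation setup of the paper.

```python
import numpy as np

def bandpass_phat(n, omega_c, omega_b):
    """Band-pass GCC-PHAT waveform at lags n (in samples); np.sinc is the normalized sinc."""
    return np.sinc(omega_b * n / np.pi) * np.cos(omega_c * n)

def main_lobe_half_width(waveform, a_t, n_max=50.0, step=1e-3):
    """Approximate inf{|n| : waveform(n) < a_t} on a fine grid of lags (in samples)."""
    n = np.arange(0.0, n_max, step)            # the waveform is even, so n >= 0 suffices
    below = np.nonzero(waveform(n) < a_t)[0]
    return n[below[0]] if below.size else np.inf

omega_l, omega_u = 0.15 * np.pi, 0.4 * np.pi
omega_c, omega_b = (omega_u + omega_l) / 2, (omega_u - omega_l) / 2
t_r = main_lobe_half_width(lambda n: bandpass_phat(n, omega_c, omega_b), a_t=0.5)

f_s, v_s = 2000.0, 345.0                       # sampling rate is an assumed value
d_g, n_dim, dtau_max = 10.0, 2, 0.0
required = d_g * np.sqrt(n_dim) / v_s + dtau_max   # required half-width in seconds
satisfied = t_r / f_s >= required
```

Under these assumptions the half-width is only slightly more than one sample, well below the tens of milliseconds required by a 10 m grid, so the band-pass GCC-PHAT waveform does not meet the main-lobe requirement.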
For all $\alpha$ such that $\alpha > \max_{\| \mathbf{x} \| \to +\infty} \{ F_E(\mathbf{x}, \mathbf{x}_s) \}$, the second sub-condition ($\mathcal{M}(\alpha, \mathbf{x}_s)$ is a bounded set) is satisfied. The area of $\mathcal{M}(\alpha, \mathbf{x}_s)$ is mainly the superposition of the projection areas of the main-lobe sections of the GCCs belonging to the individual sensor pairs. Denote
$$\Lambda_p(\tau_c, T) = \{ \mathbf{x} : | \tau_p(\mathbf{x}) - \tau_c | \le T \}$$
to be the projection area of the TDOA section $[\tau_c - T, \tau_c + T]$ of sensor pair $c_p$, where $T \in [0, \infty)$ and $\tau_c \in [-\tau_p^{max}, \tau_p^{max}]$ are the half-width and the central TDOA of the section, respectively, and $\tau_p^{max} = \| \mathbf{z}_l - \mathbf{z}_m \| / v_s$ is the maximal TDOA value that this sensor pair can produce.
For each sensor pair $c_p$, the solution set of the half-hyperbola equation $\tau_p(\mathbf{x}) = \tau_c$ can be denoted as $\Lambda_p(\tau_c, 0)$ and extends to infinity (i.e., there exists an $\mathbf{x}$ such that $\| \mathbf{x} \| = \infty$ and $\mathbf{x} \in \Lambda_p(\tau_c, 0)$). For two different sensor pairs $c_i$ and $c_j$, if there exist a $\tau_i^c \in [-\tau_i^{max}, \tau_i^{max}]$ and a $\tau_j^c \in [-\tau_j^{max}, \tau_j^{max}]$ such that $\Lambda_i(\tau_i^c, 0) \subseteq \Lambda_j(\tau_j^c, 0)$ or $\Lambda_i(\tau_i^c, 0) \supseteq \Lambda_j(\tau_j^c, 0)$, then the half-hyperbola equations $\tau_i(\mathbf{x}) = \tau_i^c$ and $\tau_j(\mathbf{x}) = \tau_j^c$ are not independent. This case might occur when the sensors of the two pairs are collinear or share the same axis of symmetry and, at the same time, both $\tau_i^c$ and $\tau_j^c$ reach their extrema or become zero. In WASNs, this case rarely happens because the sensor distributions are often irregular. Excluding this case for all sensor pairs, the maximal value of $F_E(\mathbf{x}, \mathbf{x}_s)$ at infinity does not exceed a linear combination of $a_m$ and $a_s$, which is given as
$$\alpha_{inf} = \frac{C_N^2 a_m + \left( C_M^2 - C_N^2 \right) a_s}{C_M^2}. \tag{24}$$
The detailed derivation can be found in Appendix A.4. If $\alpha > \alpha_{inf}$, then $\mathcal{M}(\alpha, \mathbf{x}_s)$ is bounded.
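The boundedness threshold is a simple weighted average of the main-lobe and side-lobe heights, and a small calculation shows its typical magnitude. The function name `alpha_inf` and the side-lobe height 0.2 are illustrative assumptions; $M = 8$ sensors in $N = 2$ dimensions matches the simulation setup.

```python
from math import comb

def alpha_inf(n_sensors, n_dim, a_m, a_s):
    """Upper bound of the normalized SRP value at infinity, from the main-lobe
    maximum a_m and the side-lobe height a_s."""
    c_m, c_n = comb(n_sensors, 2), comb(n_dim, 2)
    return (c_n * a_m + (c_m - c_n) * a_s) / c_m

bound = alpha_inf(8, 2, a_m=1.0, a_s=0.2)           # illustrative side-lobe height
bound_no_sidelobe = alpha_inf(8, 2, a_m=1.0, a_s=0.0)
```

With 28 sensor pairs in the plane, only one pair can contribute its main-lobe value at infinity, so the bound is dominated by the side-lobe height. A waveform without side-lobes gives a bound of 1/28, far below a threshold such as 0.5.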
Combining Inequality (23) and Equation (24), a sufficient condition for robust on-grid source localization is given by
$$T_R(\alpha_{inf}) > d_g \sqrt{N} / v_s + \Delta\tau_{max}. \tag{25}$$
This means that, for a given grid distance $d_g$ and steering TDOA uncertainties within $\Delta\tau_{max}$, if the GCC waveform function $R_0(\tau)$ has a main lobe wide enough to satisfy this condition, then a divergent on-grid location estimate can be avoided.
The SRP-PHAT generates a sharp GCC to increase the TDOA resolution in cases with reverberation or multiple sources. However, as shown in Figure 3, the band-pass effect gives the GCC waveform function a narrow main-lobe section and strong side-lobes. Such a waveform can hardly satisfy Inequality (25), which is also reflected in the poor performance of SRP-PHAT in Figure 1f. Next, we introduce a GCC waveform refinement procedure for the band-pass SRP.

3.2. Robust SRP-Based Source Localization with Refined GCC Waveform

The condition in Inequality (25) is too strict for band-pass GCC situations with coarse grid resolution and perceptible steering TDOA uncertainties. Some classical GCC methods use low-pass filtering to obtain a broader main lobe, but they are not applicable to band-pass signals. In this section, the GCC is refined to obtain a suitable waveform for modifying the SRP function.
Consider a complex wavelet function $\psi_e(\tau, \Omega_C) = u_e(\tau) e^{-j\Omega_C F_s \tau}$, where $u_e(\tau) \in L^2(\mathbb{R})$ is an even-symmetric function. Applying $\psi_e(\tau, \Omega_C)$ as the filtering function to the GCC-PHAT, the filtered output of $c_p$ can be denoted as
$$R_p^{CF}(\tau, \Omega_C) = R_p^{PHAT}(\tau) * \psi_e(\tau, \Omega_C), \tag{26}$$
where $R_p^{PHAT}(\tau)$ is the GCC-PHAT of $c_p$.
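When the real function $u_e(\tau)$ has an effective support $[-\Omega_B, \Omega_B] \subset [-\pi, \pi]$ in the frequency domain, i.e.,
$$\int_{-\infty}^{\infty} | U_e(\Omega) |^2 d\Omega - \int_{-\Omega_B}^{\Omega_B} | U_e(\Omega) |^2 d\Omega \ll \int_{-\Omega_B}^{\Omega_B} | U_e(\Omega) |^2 d\Omega, \tag{27}$$
where $U_e(\Omega)$ is the Fourier transform of $u_e(\tau)$, and if the source is dominant in the frequency band $[\Omega_C - \Omega_B, \Omega_C + \Omega_B] \subset (0, \pi]$, then the approximation
$$\begin{aligned}
R_p^{CF}(\tau, \Omega_C) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{Y_{p_l}(\Omega) Y_{p_m}^{*}(\Omega)}{| Y_{p_l}(\Omega) Y_{p_m}^{*}(\Omega) |} U_e(\Omega - \Omega_C) e^{-j\Omega F_s \tau} d\Omega \\
&\approx \frac{1}{2\pi} \int_{\Omega_C - \Omega_B}^{\Omega_C + \Omega_B} \frac{Y_{p_l}(\Omega) Y_{p_m}^{*}(\Omega)}{| Y_{p_l}(\Omega) Y_{p_m}^{*}(\Omega) |} U_e(\Omega - \Omega_C) e^{-j\Omega F_s \tau} d\Omega \\
&\approx \frac{1}{2\pi} \int_{\Omega_C - \Omega_B}^{\Omega_C + \Omega_B} e^{j\Omega F_s \tau_p^0(\mathbf{x}_s)} U_e(\Omega - \Omega_C) e^{-j\Omega F_s \tau} d\Omega \\
&\approx \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{j\Omega F_s \tau_p^0(\mathbf{x}_s)} U_e(\Omega - \Omega_C) e^{-j\Omega F_s \tau} d\Omega \\
&= u_e\left( \tau - \tau_p^0(\mathbf{x}_s) \right) e^{-j\Omega_C F_s (\tau - \tau_p^0(\mathbf{x}_s))}
\end{aligned} \tag{28}$$
holds. It can be observed that the approximate function carries the same envelope as $u_e(\tau)$ and extracts the TDOA information in $[\Omega_C - \Omega_B, \Omega_C + \Omega_B]$.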
Note that $R_p^{CF}(\tau, \Omega_C)$ is equal to the time-domain form of the sub-band GCC defined in [30]. Since the main goal is to obtain an equivalent GCC that matches the sufficient condition in Inequality (25), a lightweight approach is to average the envelopes of the filtered GCCs of multiple sub-bands with high SNR. According to the power spectral density (PSD) of the source signal or other prior knowledge, $N_q$ valid sub-bands can be selected, each with its own central frequency $\Omega_q$. The final refined GCC is given by
$$R_p^{WR}(\tau) = \frac{1}{N_q} \sum_{q} \left| R_p^{CF}(\tau, \Omega_q) \right| \approx \left| u_e\left( \tau - \tau_p^0(\mathbf{x}_s) \right) \right|,$$
which has the specific waveform function $R_0(\tau) \approx | u_e(\tau) |$. Furthermore, the improved spatial function is calculated as
$$P^{WR}(\mathbf{x}) = \frac{1}{C_M^2} \sum_{p=1}^{C_M^2} R_p^{WR}\left( \tau_p(\mathbf{x}) \right) = \frac{1}{C_M^2 N_q} \sum_{p=1}^{C_M^2} \sum_{q=1}^{N_q} \left| R_p^{CF}\left( \tau_p(\mathbf{x}), \Omega_q \right) \right|.$$
The selection of $u_e(\tau)$ has a significant influence on the refinement of the GCC. Its envelope $| u_e(\tau) |$ provides the waveform function of the refined GCCs. The envelope of a suitable $u_e(\tau)$ should have no side-lobes, i.e., $| u_e(\tau_1) | > | u_e(\tau_2) | \ge 0$ for all $| \tau_1 | < | \tau_2 |$. Meanwhile, each $U_e(\Omega - \Omega_q)$ serves as a band-pass filter in the frequency domain, so the spectral distribution of $U_e(\Omega)$ should be concentrated enough to satisfy Inequality (27). The Gaussian function
$$u_e(\tau) = e^{-(\Omega_d F_s \tau)^2} \tag{31}$$
possesses the required properties both in the time domain and in the frequency domain. The corresponding complex filtering function $\psi_e(\tau, \Omega_C)$ can then be regarded as a complex Morlet wavelet. According to (25), for a given grid distance $d_g$ and steering TDOA uncertainty level $\Delta\tau_{max}$, the parameter $\Omega_d$ can be given by
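$$\Omega_d = \frac{v_s \sqrt{-\ln \alpha}}{F_s \left( d_g \sqrt{N} + v_s \Delta\tau_{max} \right)},$$
where $N$ is the space dimension and $\alpha$ is the threshold value, which can usually be set as $\alpha = 0.5$. Substituting Equation (31) into Inequality (27) and dividing (27) by its right-hand side yields
$$\left( \int_{-\infty}^{\infty} e^{-\left( \frac{\Omega}{2\Omega_d} \right)^4} d\Omega - \int_{-\Omega_B}^{\Omega_B} e^{-\left( \frac{\Omega}{2\Omega_d} \right)^4} d\Omega \right) \Big/ \int_{-\Omega_B}^{\Omega_B} e^{-\left( \frac{\Omega}{2\Omega_d} \right)^4} d\Omega \ll 1.$$
Thus, the relation between $\Omega_d$ and $\Omega_B$ can be obtained from the following equivalent equation:
$$\left( \int_{0}^{\infty} e^{-\Omega^4} d\Omega - \int_{0}^{\Omega_B / (2\Omega_d)} e^{-\Omega^4} d\Omega \right) \Big/ \int_{0}^{\Omega_B / (2\Omega_d)} e^{-\Omega^4} d\Omega = c,$$
where $c$ is an extremely small number. It then follows that
$$\Omega_B = 2 c_e \Omega_d,$$
where $c_e$ is the positive solution of the following equation:
$$x E_{\frac{3}{4}}\left( x^4 \right) = \frac{4c}{1 + c} \Gamma\left( \frac{5}{4} \right),$$
where $E_n(x) = \int_{1}^{+\infty} \frac{e^{-xt}}{t^n} dt$ ($x > 0$) and $\Gamma(x) = \int_{0}^{+\infty} t^{x-1} e^{-t} dt$ ($x > 0$). When $c$ is set as 0.001 (−30 dB), $c_e$ in Equation (32) can be obtained as 2.89.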
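The width parameter of the Gaussian window can be checked directly: it is chosen so that the envelope $e^{-(\Omega_d F_s \tau)^2}$ drops to the threshold $\alpha$ exactly at the lag $d_g\sqrt{N}/v_s + \Delta\tau_{max}$. The sketch below evaluates this for the LTLG setting of the simulations ($d_g = 10$ m, $\Delta\tau_{max} = 100$ ms). The sampling rate and the function names are assumptions made for illustration.

```python
import math

def gaussian_width_parameter(d_g, dtau_max, n_dim, f_s, v_s=345.0, alpha=0.5):
    """Omega_d such that the Gaussian envelope equals alpha at the required half-width."""
    return v_s * math.sqrt(-math.log(alpha)) / (
        f_s * (d_g * math.sqrt(n_dim) + v_s * dtau_max))

def gaussian_envelope(tau, omega_d, f_s):
    """Target waveform u_e(tau) = exp(-(omega_d * f_s * tau)^2)."""
    return math.exp(-(omega_d * f_s * tau) ** 2)

f_s = 2000.0                                   # assumed sampling rate (Hz)
d_g, dtau_max, n_dim = 10.0, 0.1, 2            # LTLG condition
omega_d = gaussian_width_parameter(d_g, dtau_max, n_dim, f_s)
edge = d_g * math.sqrt(n_dim) / 345.0 + dtau_max   # required main-lobe half-width (s)
level_at_edge = gaussian_envelope(edge, omega_d, f_s)
```

A larger grid distance or a larger uncertainty bound lowers $\Omega_d$, which widens the envelope in the lag domain and narrows each sub-band filter in the frequency domain.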
A simulation is performed to illustrate the effect of the GCC waveform refinement procedure on on-grid SRP-based source localization. As shown in Figure 4, the dot-dashed box shows the range of the TDOA within the volume of the nearest grid point $\mathbf{x}_g$; the dashed line with “$\Delta$” shows the real TDOA, which should coincide with the peak of the GCC; and the dotted line with “∇” marks $R_p(\tau_p(\mathbf{x}_g))$, corresponding to the nearest grid point $\mathbf{x}_g$. The value $R_p(\tau_p(\mathbf{x}_g))$ of the traditional GCC-PHAT is small, which leads to poor performance in grid searching. In contrast, the proposed refining method generates a smooth waveform with high values throughout the TDOA region indicated by the box in the figure.
The modified algorithm with the GCC refinement procedure is shown in Algorithm 1, in which $u_e(\tau) = e^{-(\Omega_d F_s \tau)^2}$ is taken as the target waveform function.
Algorithm 1: SRP with the waveform refinement procedure
Parameter Setting
(1) Set the maximum steering TDOA error $\Delta\tau_{max} = \Delta\tau_{max}^{C} + \Delta\tau_{max}^{S}$, where the sub-items $\Delta\tau_{max}^{C}$ and $\Delta\tau_{max}^{S}$ are determined by the wind and the synchronization error of the sensors, respectively.
(2) Set the grid distance $d_g$ and the searching region $V$ that meet the system requirement. Then the searching grid set $X_g$ is generated.
(3) Set the waveform function $u_e(\tau) = e^{-(\Omega_d F_s \tau)^2}$ and $\alpha = 0.5$.
(4) Set $c = 0.001$ and compute the bandwidth $\Omega_B$ using Equation (32).
Band Selecting
(1) Set up the passband $[\Omega_L, \Omega_U]$.
(2) Pick the $N_q$ bands with the highest PSD of the source, or divide the passband uniformly.
Source Localization
(1) Calculate the refinement waveform (WR)-SRP function $P^{WR}(\mathbf{x})$ by Equation (29) at all $\mathbf{x} \in X_g$.
(2) Estimate the source location $\hat{\mathbf{x}}_s$ by Equation (11).
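The steps of Algorithm 1 can be sketched end to end on synthetic data. This is a simplified illustration under stated assumptions, not the authors' implementation: the GCC-PHATs are generated analytically from the band-pass waveform (as in the numerical simulations), the sensor layout, source position, and sampling rate are hypothetical, the sub-band centers are spread uniformly inside the passband, the steering uncertainty is set to zero, and the bandwidth rule based on $c$ is not used.

```python
import itertools
import numpy as np

V_S, FS = 345.0, 1000.0       # sound speed (m/s); sampling rate (Hz, assumed)
D_G, DTAU_MAX = 10.0, 0.0     # grid distance (m); TDOA uncertainty bound (s, assumed zero)
ALPHA, N_DIM = 0.5, 2

def bandpass_phat(n, omega_c, omega_b):
    """Ideal band-pass GCC-PHAT waveform at (fractional) lags n, in samples."""
    return np.sinc(omega_b * n / np.pi) * np.cos(omega_c * n)

def refined_gcc(gcc, centers, omega_d):
    """Average the envelopes of the GCC filtered by complex Morlet wavelets."""
    k = int(np.ceil(4.0 / omega_d))                 # truncate the Gaussian window
    n = np.arange(-k, k + 1)
    u_e = np.exp(-(omega_d * n) ** 2)
    env = np.zeros(len(gcc))
    for omega_q in centers:
        # The sign of the carrier does not affect the envelope of a real GCC.
        psi = u_e * np.exp(1j * omega_q * n)
        env += np.abs(np.convolve(gcc, psi, mode="same"))
    return env / len(centers)

def wr_srp_localize(sensors, gccs, lag_axis, grid, centers, omega_d):
    """Return the grid point that maximizes the refined SRP map."""
    pairs = itertools.combinations(range(len(sensors)), 2)
    eta = np.linalg.norm(grid[:, None, :] - sensors[None, :, :], axis=2) / V_S
    power = np.zeros(len(grid))
    for gcc, (l, m) in zip(gccs, pairs):
        env = refined_gcc(gcc, centers, omega_d)
        tau = (eta[:, m] - eta[:, l]) * FS          # steering TDOAs in samples
        power += np.interp(tau, lag_axis, env)
    return grid[np.argmax(power)]

# Hypothetical scene: 8 sensors around a 200 m x 200 m area, one source inside.
sensors = np.array([[10, 20], [190, 15], [180, 185], [15, 170],
                    [100, 5], [195, 100], [95, 190], [5, 95]], dtype=float)
source = np.array([63.0, 127.0])

omega_d = V_S * np.sqrt(-np.log(ALPHA)) / (FS * (D_G * np.sqrt(N_DIM) + V_S * DTAU_MAX))
omega_l, omega_u = 0.15 * np.pi, 0.4 * np.pi
omega_c, omega_b = (omega_u + omega_l) / 2, (omega_u - omega_l) / 2
centers = np.linspace(omega_l, omega_u, 6)[1:-1]    # four interior sub-band centers

lag_axis = np.arange(-900.0, 901.0)                 # lags in samples
t0 = np.linalg.norm(sensors - source, axis=1) / V_S
gccs = [bandpass_phat(lag_axis - (t0[m] - t0[l]) * FS, omega_c, omega_b)
        for l, m in itertools.combinations(range(len(sensors)), 2)]

axis = np.arange(0.0, 200.0 + D_G, D_G)
grid = np.array([(gx, gy) for gx in axis for gy in axis])
x_hat = wr_srp_localize(sensors, gccs, lag_axis, grid, centers, omega_d)
error_m = np.linalg.norm(x_hat - source)
```

Direct convolution keeps the sketch short; for long GCC ranges an FFT-based convolution would be a natural substitute. The refined envelope is evaluated at the steering TDOAs by linear interpolation, which is adequate because the envelope is smooth by construction.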

4. Experiment Results and Discussion

4.1. Numerical Simulations

In this section, we use Monte Carlo simulations to analyze the performance of the proposed SRP-based localization method (the SRP functional with the refined waveform, referred to as WR). It is compared with the traditional SRP functional with GCC-PHAT (PS); the SRP functional with the envelope of GCC-PHAT (PES), which is designed for acoustic band-pass signals [21]; the modified-SRP (M-SRP) functional with GCC-PHAT (PM) [18], in which the grid resolution is considered; and the M-SRP functional with the envelope of GCC-PHAT (PEM), in which both the band-pass characteristic and the grid resolution are considered.
In this setup, $M = 8$ sensors and one source are randomly deployed in a monitored area of 200 m by 200 m. The propagation model is the line-of-sight path with a constant sound speed of 345 m/s. The input GCCs are generated by the waveform function in Equation (21) with a passband of $[0.15\pi, 0.4\pi]$. The steering TDOA uncertainty $\Delta\tau_p(\mathbf{x})$ is uniformly distributed over $[-\Delta\tau_{max}, \Delta\tau_{max}]$, where $\Delta\tau_{max}$ is the maximal time uncertainty, which depends on the sound-propagation model error and the synchronization error.
We consider four different conditions in WASNs to test the algorithms: (a) a small steering TDOA uncertainty and small grid distance (STSG) condition with $\Delta\tau_{max} = 0.1$ ms and $d_g = 0.1$ m; (b) a large steering TDOA uncertainty and small grid distance (LTSG) condition with $\Delta\tau_{max} = 100$ ms and $d_g = 0.1$ m; (c) a small steering TDOA uncertainty and large grid distance (STLG) condition with $\Delta\tau_{max} = 0.1$ ms and $d_g = 10$ m; and (d) a large steering TDOA uncertainty and large grid distance (LTLG) condition with $\Delta\tau_{max} = 100$ ms and $d_g = 10$ m.
The mean absolute error (MAE) of distance, $E\|\hat{\mathbf{x}}_s - \mathbf{x}_s\|$, and the cumulative distribution function (CDF) of the relative distance error are calculated to evaluate the accuracy and robustness of these algorithms. The relative distance in the CDF is normalized by the grid distance, i.e.,
$$F(e_u) = P\left(\|\hat{\mathbf{x}}_s - \mathbf{x}_s\| / d_g \le e_u\right),$$
where $e_u$ is the relative positioning error threshold, which is determined by the system requirement. Specifically, the 95th percentile of the localization error in meters is computed as $F^{-1}(0.95) \cdot d_g$.
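These metrics can be computed from a list of trial results as in the sketch below. It uses an empirical CDF and takes the 95th percentile as the smallest observed error whose empirical CDF reaches 0.95; the exact percentile estimator is an assumption, and the toy data are invented for illustration only.

```python
import math

def mae(estimates, truths):
    """Mean absolute (Euclidean) localization error in meters."""
    errors = [math.dist(e, t) for e, t in zip(estimates, truths)]
    return sum(errors) / len(errors)

def relative_error_cdf(estimates, truths, d_g, e_u):
    """Empirical F(e_u): fraction of trials with error / d_g <= e_u."""
    rel = [math.dist(e, t) / d_g for e, t in zip(estimates, truths)]
    return sum(r <= e_u for r in rel) / len(rel)

def percentile_95_m(estimates, truths, d_g):
    """Smallest observed relative error with empirical CDF >= 0.95, in meters."""
    rel = sorted(math.dist(e, t) / d_g for e, t in zip(estimates, truths))
    index = math.ceil(0.95 * len(rel)) - 1
    return rel[index] * d_g

# Toy data (illustrative only): four trials with errors of 5, 1, 10, and 0 m.
truths = [(0.0, 0.0)] * 4
estimates = [(3.0, 4.0), (0.0, 1.0), (6.0, 8.0), (0.0, 0.0)]
toy_mae = mae(estimates, truths)
toy_cdf = relative_error_cdf(estimates, truths, d_g=10.0, e_u=0.5)
toy_p95 = percentile_95_m(estimates, truths, d_g=10.0)
```

Normalizing by $d_g$ makes the CDF curves of different grid distances comparable, since an error of half a grid cell maps to $e_u = 0.5$ regardless of the cell size.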
The MAE and 95th percentile results are listed in Table 1. All the localization algorithms achieve their best estimation accuracy in the STSG condition, in which the defocus effect and the undersampled effect are slight. When the steering TDOA uncertainty or the grid distance increases, the MAE increases. However, compared with the PS, PES, PM, and PEM methods, the WR yields the smallest MAE in every condition except STLG, where it is close to the best (4.46 m versus 4.07 m for the PEM), because all these factors have been considered. The 95th percentile shows a similar trend to the MAE, which indicates that the proposed WR method has a stable localization performance in outdoor conditions.
Figure 5a–d depict the CDF of each algorithm in the range $e_u \in [0.5, 100\,\mathrm{m}/d_g]$ under the four conditions. The CDF curves rise most rapidly with the location error in the STSG condition, where the estimation errors are the smallest for all the algorithms. The CDF curves move down as the grid distance $d_g$ and the steering TDOA uncertainty $\Delta\tau_{max}$ increase, as in the LTSG, STLG, and LTLG conditions. Since the steering TDOA uncertainty is not considered in the PES and PEM, their CDF curves drop less in the STLG condition than in the LTSG condition. Among these localization algorithms, the CDF of the WR is the highest or very close to the highest (STLG), and the PEM method is better than the PS, PES, and PM. The proposed WR method remains robust even as the conditions become severe.
Furthermore, Figure 6 presents the MAE in four situations: (a) a fixed small steering TDOA uncertainty (ST) with $\Delta\tau_{max} = 0.1$ ms and $d_g$ ranging from 0.1 m to 50 m; (b) a fixed large steering TDOA uncertainty (LT) with $\Delta\tau_{max} = 100$ ms and $d_g$ ranging from 0.1 m to 50 m; (c) a fixed small grid distance (SG) with $d_g = 0.1$ m and $\Delta\tau_{max}$ ranging from 0.1 ms to 100 ms; (d) a fixed large grid distance (LG) with $d_g = 10$ m and $\Delta\tau_{max}$ ranging from 0.1 ms to 100 ms. The MAE increases significantly with $d_g$ or $\Delta\tau_{max}$, which indicates that the steering TDOA uncertainty and the grid distance severely affect the performance of source localization. In each situation, the PS and PM produce a larger MAE than the other algorithms when $d_g$ and $\Delta\tau_{max}$ are small because they are not designed for band-pass signals. Since the scalable grid sampling and the steering TDOA uncertainty are not considered in the PES, it shows reliable performance only when $d_g \le 1$ m and $\Delta\tau_{max} \le 1$ ms. The PEM considers both the grid size and the band-pass effect; thus, it achieves the best performance in the small-$\Delta\tau_{max}$ case. However, its MAE degrades when the influence of the steering TDOA uncertainties outweighs that of the grid size. The WR obtains an MAE close to that of the PEM when $\Delta\tau_{max}$ is small and the smallest MAE in all the other situations. These results demonstrate its robust performance.

4.2. Field Experiment

In this experiment, seven nodes are distributed in a park, as shown in Figure 7a,b. Each node consists of a microphone sensor, a Wi-Fi module, and a GPS module for self-localization and time calibration. The monitored area is again 200 m × 200 m and additionally contains a hillock. A portable speaker plays three types of sound signals at 12 positions inside the area: a Gaussian signal (S-G), the whistle of vehicles (S-V) representing an urban source, and birdsong (S-B) representing a field source. The temperature was approximately 30 °C, and the wind speed was below 3 m/s. Therefore, in the proposed method, $\Delta\tau_{max}$ is set to 10 ms to account for the self-localization error of the sensors and the effect of wind.
The sampling frequency is 10,000 Hz. Figure 7c shows the PSDs of both the background noise and the received source signals, which are estimated with the Burg method using a model order of 50 and an FFT length of 2048. The PSDs of the source signals are measured about 30 m away from the speaker. Because the environmental noise is mainly distributed in the frequency bands below 1500 Hz, the passband is set to (1500 Hz, 3500 Hz) for all sources. The estimated SNRs are shown in Figure 7d, where the SNRs of the full band (0, 5000 Hz) and of the passband (1500 Hz, 3500 Hz) are plotted as solid and dashed lines, respectively. For the three source types, restricting the signal to the passband improves the SNR by 20 dB to 30 dB.
The recorded data are divided into 1242 two-second audio frames. SRP algorithms with full-band and band-pass cross-correlation (referred to as CSF and CSB) are added to assess the necessity of band-pass processing. The PS and PM are not included since they proved unreliable in the simulation. The candidate SRP-based locators compared in this sub-section are therefore: (1) SRP with full-band GCC (CSF), (2) SRP with band-pass GCC (CSB), (3) SRP with the envelope of band-pass GCC-PHAT (PES), (4) M-SRP with the envelope of band-pass GCC-PHAT (PEM), and (5) WR-SRP with band-pass GCC (WR). A well-known TDOA-based localization method [13] (referred to as TC), in which the TDOAs are obtained from band-pass GCC-PHATs, is also included as a reference.
The MAE and the 95th percentile of the localization errors of the TC method and the SRP-based methods with different grid distances ($d_g \in \{0.1, 1, 10\}$ m) are listed in Table 2. Moreover, the MAEs with the grid distance $d_g$ ranging from 0.1 m to 50 m are presented in Figure 8a. Figure 8b–d give the CDF curves at the three grid distances ($d_g \in \{0.1, 1, 10\}$ m).
As in the simulation, the MAEs increase and the CDF curves move down as the grid distance increases. The MAE of the TC method is the highest because some sensor pairs might produce severely erroneous TDOA measurements in noisy acoustic environments. Its CDF curve also shows that the solution is not stable. Comparing the results of the CSF and CSB shows that the band-pass GCC significantly enhances the SNR and the localization performance. The PES and PEM yield larger localization errors than the WR and lack robustness, which indicates that the influence of the steering TDOA uncertainty is considerable. The proposed WR method achieves the best estimation for all the grid distances, which verifies its effectiveness.

5. Conclusions

In this work, a novel and robust steered response power (SRP)-based source localization approach is proposed to localize band-pass sources in outdoor WASNs with steering time delay uncertainty and coarse spatial grids. The robustness of on-grid source localization is analyzed through a sufficient condition that relates the GCC signal waveform to the on-grid localization error. A band-pass GCC refinement procedure is designed to meet the sufficient condition and thereby enhance the on-grid source localization performance. The Monte Carlo simulations and the field experiment show that the proposed method performs robustly in outdoor WASN scenarios compared with several state-of-the-art SRP-based methods.

Author Contributions

Conceptualization, methodology, programming, writing—original draft preparation, Y.H.; conceptualization, writing—review and editing, data curation, J.T.; writing—review and editing, X.H.; supervision, M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China under Grants 11774379 and 61501448, and by the Youth Innovation Promotion Association.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. This data can be found here: https://1drv.ms/u/s!AskSoQGpB3VUgfIqsxtYhosVrGyzOg?e=pnfutC.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
SRP: steered response power
TDOA: time difference of arrival
DOA: direction of arrival
GCC: generalized cross-correlation
PHAT: phase transform
CDF: cumulative distribution function
GPS: Global Positioning System
FFT: Fast Fourier Transform

Appendix A

Appendix A.1

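Proposition A1.
If $M(\alpha, \mathbf{x}_s) \cap X(d_g, \mathbf{x}_g^o) \neq \emptyset$ and $M(\alpha, \mathbf{x}_s)$ is a bounded set (i.e., there exists an $\varepsilon_M \in (0, \infty)$ such that $\|\mathbf{x}_1 - \mathbf{x}_2\| \le \varepsilon_M$ for all $\mathbf{x}_1, \mathbf{x}_2 \in M(\alpha, \mathbf{x}_s)$), then Inequality (18) will be satisfied.
Proof of Proposition A1.
For an arbitrary $\mathbf{x}_g^o$, if $M(\alpha, \mathbf{x}_s) \cap X(d_g, \mathbf{x}_g^o) \neq \emptyset$, there exists an $\mathbf{x}_a$ such that $\mathbf{x}_a \in M(\alpha, \mathbf{x}_s) \cap X(d_g, \mathbf{x}_g^o)$. Let $\hat{\mathbf{x}}_s^g$ be the estimated result from Equation (17). Then $F_E(\hat{\mathbf{x}}_s^g, \mathbf{x}_s) \ge F_E(\mathbf{x}_a, \mathbf{x}_s) \ge \alpha$. According to the definition of $M(\alpha, \mathbf{x}_s)$, $\hat{\mathbf{x}}_s^g \in M(\alpha, \mathbf{x}_s)$ holds. Since $M(\alpha, \mathbf{x}_s)$ is a bounded set, $\|\mathbf{x}_a\| < \infty$. Then $\|\mathbf{x}_s - \mathbf{x}_a\|$ is finite. Let $\varepsilon_M \in (0, \infty)$ be a bound of $M(\alpha, \mathbf{x}_s)$ and let $\varepsilon = \varepsilon_M + \|\mathbf{x}_s - \mathbf{x}_a\| \in (0, \infty)$. Then $\|\hat{\mathbf{x}}_s^g - \mathbf{x}_s\| \le \|\hat{\mathbf{x}}_s^g - \mathbf{x}_a\| + \|\mathbf{x}_a - \mathbf{x}_s\| \le \varepsilon$. □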

Appendix A.2

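Proposition A2.
If a closed ball $B_N(\mathbf{x}^o, r)$ satisfies $r \ge d_g \sqrt{N} / 2$, then for all $\mathbf{x}_g^o \in \mathbb{R}^N$, $B_N(\mathbf{x}^o, r) \cap X(d_g, \mathbf{x}_g^o) \neq \emptyset$ holds.
Proof of Proposition A2.
Let $B_N(\mathbf{x}^o, r)$ be a closed ball with center $\mathbf{x}^o$ and radius $r$. For an arbitrary $\mathbf{x}_g^o \in \mathbb{R}^N$, the vector from $\mathbf{x}_g^o$ to $\mathbf{x}^o$ is denoted as $\Delta\mathbf{x}^o = \mathbf{x}^o - \mathbf{x}_g^o = [\Delta x_1^o, \ldots, \Delta x_N^o]^T$. Given $d_g \in \mathbb{R}^+$, define $n_k^o = \lfloor \Delta x_k^o / d_g \rceil$ ($k = 1, \ldots, N$), where "$\lfloor \cdot \rceil$" means the nearest integer. Therefore, we can find the grid point $\mathbf{x}_g^n = \mathbf{x}_g^o + [n_1^o d_g, \ldots, n_N^o d_g]^T \in X(d_g, \mathbf{x}_g^o)$, so that $\mathbf{x}^o - \mathbf{x}_g^n = [\Delta x_1^o - n_1^o d_g, \ldots, \Delta x_N^o - n_N^o d_g]^T$. Since each component has magnitude at most $d_g / 2$, the distance satisfies
$$\|\mathbf{x}_g^n - \mathbf{x}^o\| \le \sqrt{\sum_{i=1}^{N} \left(\frac{d_g}{2}\right)^2} = \frac{\sqrt{N}\, d_g}{2}.$$
Thus, if $r \ge \sqrt{N} d_g / 2$, then $\mathbf{x}_g^n \in B_N(\mathbf{x}^o, r)$. Hence, $X(d_g, \mathbf{x}_g^o) \cap B_N(\mathbf{x}^o, r) \neq \emptyset$ holds. □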
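The rounding construction in this proof is easy to check numerically. The sketch below draws random ball centers and grid origins in two dimensions, applies the nearest-integer rounding from the proof, and records the largest distance to the resulting grid point; the sampling ranges, seed, and function names are arbitrary choices for illustration.

```python
import math
import random

def nearest_grid_point(x_o, x_go, d_g):
    """Round each coordinate offset to the nearest multiple of d_g (proof of A2)."""
    return tuple(g + round((x - g) / d_g) * d_g for x, g in zip(x_o, x_go))

def worst_case_distance(n_dim, d_g, trials, rng):
    """Largest distance from random centers to their nearest grid point."""
    worst = 0.0
    for _ in range(trials):
        x_o = [rng.uniform(-100, 100) for _ in range(n_dim)]
        x_go = [rng.uniform(-100, 100) for _ in range(n_dim)]
        worst = max(worst, math.dist(x_o, nearest_grid_point(x_o, x_go, d_g)))
    return worst

rng = random.Random(1)
d_g = 10.0
bound = math.sqrt(2) * d_g / 2  # sqrt(N) * d_g / 2 with N = 2
worst = worst_case_distance(2, d_g, 10_000, rng)
```

For a 10 m grid in two dimensions the bound is about 7.07 m, so a ball of that radius around any point always contains at least one grid point.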

Appendix A.3

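Proposition A3.
If the waveform function $R_0(\tau)$ satisfies $T_R(\alpha) \ge 2r / v_s + \Delta\tau_{max}$, then $B_N(\mathbf{x}_s, r) \subseteq M(\alpha, \mathbf{x}_s)$.
Proof of Proposition A3.
Based on Equation (4), it follows that
$$\begin{aligned} |\tau_p(\mathbf{x}) - \tau_p(\mathbf{x}_s)| &= |\eta_m(\mathbf{x}) - \eta_l(\mathbf{x}) - \eta_m(\mathbf{x}_s) + \eta_l(\mathbf{x}_s)| \\ &\le |\eta_m(\mathbf{x}) - \eta_m(\mathbf{x}_s)| + |\eta_l(\mathbf{x}_s) - \eta_l(\mathbf{x})| \\ &= \frac{\big|\|\mathbf{x} - \mathbf{z}_m\| - \|\mathbf{x}_s - \mathbf{z}_m\|\big| + \big|\|\mathbf{x} - \mathbf{z}_l\| - \|\mathbf{x}_s - \mathbf{z}_l\|\big|}{v_s} \\ &\le 2\|\mathbf{x} - \mathbf{x}_s\| / v_s. \end{aligned}$$
Given the steering TDOA uncertainty level $\Delta\tau_{max}$, for each $\mathbf{x} \in B_N(\mathbf{x}_s, r)$, the steering TDOA function $\tau_p(\mathbf{x})$ satisfies
$$\begin{aligned} |\tau_p(\mathbf{x}) - \tau_p^0(\mathbf{x}_s)| &= |\tau_p(\mathbf{x}) - \tau_p(\mathbf{x}_s) + \Delta\tau_p(\mathbf{x}_s)| \\ &\le |\tau_p(\mathbf{x}) - \tau_p(\mathbf{x}_s)| + |\Delta\tau_p(\mathbf{x}_s)| \\ &\le 2\|\mathbf{x} - \mathbf{x}_s\| / v_s + \Delta\tau_{max} \\ &\le 2r / v_s + \Delta\tau_{max}. \end{aligned}$$
Since $T_R(\alpha) \ge 2r / v_s + \Delta\tau_{max}$, according to Equation (22),
$$R_p(\tau_p(\mathbf{x})) = R_0(\tau_p(\mathbf{x}) - \tau_p^0(\mathbf{x}_s)) \ge \alpha$$
holds for all $c_p$. According to Equation (15), for every $\mathbf{x} \in B_N(\mathbf{x}_s, r)$, the inequality
$$F_E(\mathbf{x}, \mathbf{x}_s) \ge \alpha$$
holds. According to Equation (19), $B_N(\mathbf{x}_s, r) \subseteq M(\alpha, \mathbf{x}_s)$ holds. □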
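The first bound in this proof, $|\tau_p(\mathbf{x}) - \tau_p(\mathbf{x}_s)| \le 2\|\mathbf{x} - \mathbf{x}_s\| / v_s$, can be spot-checked numerically. The sketch below assumes the line-of-sight model with $v_s = 345$ m/s from the simulations and draws random sensor pairs and points in a 200 m square; the function names and sampling choices are illustrative.

```python
import math
import random

V_S = 345.0  # sound speed, m/s

def tdoa(x, z_l, z_m):
    """Line-of-sight TDOA of point x for the sensor pair (z_l, z_m)."""
    return (math.dist(x, z_m) - math.dist(x, z_l)) / V_S

def max_bound_violation(trials, rng):
    """Largest value of |tau_p(x) - tau_p(x_s)| - 2 * ||x - x_s|| / v_s."""
    worst = -math.inf
    for _ in range(trials):
        z_l, z_m, x, x_s = [(rng.uniform(0, 200), rng.uniform(0, 200)) for _ in range(4)]
        lhs = abs(tdoa(x, z_l, z_m) - tdoa(x_s, z_l, z_m))
        worst = max(worst, lhs - 2 * math.dist(x, x_s) / V_S)
    return worst

violation = max_bound_violation(10_000, random.Random(2))
```

As a rough worked number under these assumptions, a radius $r \approx 7.07$ m (the Proposition A2 bound for a 10 m grid in two dimensions) gives $2r / v_s \approx 41$ ms, so with $\Delta\tau_{max} = 100$ ms the condition asks for $T_R(\alpha)$ of at least about 141 ms.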

Appendix A.4

Proposition A4.
If for all two different pairs of sensors c i = { i l , i m } , c j = { j l , j m } in the WASNs satisfy that τ i c [ z i l z i m , z i l z i m ] / v s and τ j c [ z j l z j m , z j l z j m ] / v s , Λ i τ i c , 0     Λ j τ j c , 0 and Λ i τ i c , 0     Λ j τ j c , 0 , then
$$\max_{\{\|\mathbf{x}\| = +\infty,\ \|\mathbf{x}_s\| < +\infty\}} F_E(\mathbf{x}, \mathbf{x}_s) \le \frac{C_N^2 a_m + (C_M^2 - C_N^2) a_s}{C_M^2}$$
holds.
Proof of Proposition A4.
For a spatial point $\mathbf{x}$ such that $\|\mathbf{x}\| = \infty$, let $K \in \mathbb{N}$ be the total number of sensor pairs $c_p$ such that $\mathbf{x} \in \Lambda_p(\tau_p^0(\mathbf{x}_s), T_R(a_s))$. According to Equation (15) and Inequality (22), it follows that
$$F_E(\mathbf{x}, \mathbf{x}_s) \le \frac{K a_m + (C_M^2 - K) a_s}{C_M^2}. \tag{A1}$$
If $K \ge C_N^2 + 1$, there exists a collection of $N$ linearly independent sensor pairs among those $C_N^2 + 1$ sensor pairs. Without loss of generality, denote this collection as $\{c_1, \ldots, c_N\}$. Then for each $\mathbf{x}_d \in \bigcap_{p=1}^{N} \Lambda_p(\tau_p^0(\mathbf{x}_s), T_R(a_s))$, there exists an equation set such that:
$$\begin{cases} \tau_1(\mathbf{x}_d) = \tau_1^c, \\ \tau_2(\mathbf{x}_d) = \tau_2^c, \\ \quad\vdots \\ \tau_N(\mathbf{x}_d) = \tau_N^c, \end{cases}$$
where each $\tau_p^c \in [\tau_p^0(\mathbf{x}_s) - T_R(a_s), \tau_p^0(\mathbf{x}_s) + T_R(a_s)]$. According to the condition of Proposition A4 and since the sensor pairs are all linearly independent, these $N$ equations are linearly independent. Then $\|\mathbf{x}_d\| < \infty$ holds, which contradicts $\|\mathbf{x}\| = \infty$. Thus $K \le C_N^2$. According to Inequality (A1), it is easily obtained that $F_E(\mathbf{x}, \mathbf{x}_s) \le (C_N^2 a_m + (C_M^2 - C_N^2) a_s) / C_M^2$. □
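To get a feel for the size of this bound, the sketch below evaluates its right-hand side for the simulated network of M = 8 sensors in N = 2 dimensions. It assumes that $C_M^2$ and $C_N^2$ denote binomial coefficients ("choose 2"), and the values of $a_m$ and $a_s$ are purely illustrative since their definitions lie outside this appendix.

```python
import math

def far_field_bound(m_sensors, n_dim, a_m, a_s):
    """Evaluate (C_N^2 * a_m + (C_M^2 - C_N^2) * a_s) / C_M^2."""
    c_m = math.comb(m_sensors, 2)  # number of sensor pairs, assumed "M choose 2"
    c_n = math.comb(n_dim, 2)      # assumed "N choose 2"
    return (c_n * a_m + (c_m - c_n) * a_s) / c_m

# Hypothetical levels: a_m = 1.0 and a_s = 0.2 are not taken from the paper.
bound = far_field_bound(m_sensors=8, n_dim=2, a_m=1.0, a_s=0.2)
```

With 28 sensor pairs and at most one pair contributing $a_m$, the bound equals $(1.0 + 27 \times 0.2) / 28 \approx 0.229$ for these illustrative values, so the far-field SRP value stays close to $a_s$ when the network has many pairs.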

References

1. Ajdler, T.; Kozintsev, I.; Lienhart, R.; Vetterli, M. Acoustic Source Localization in Distributed Sensor Networks. In Proceedings of the Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004, Pacific Grove, CA, USA, 7–10 November 2004; Volume 2, pp. 1328–1332.
2. Liu, Y.; Hu, Y.H.; Pan, Q. Robust Maximum Likelihood Acoustic Source Localization in Wireless Sensor Networks. In Proceedings of the GLOBECOM 2009-2009 IEEE Global Telecommunications Conference, Honolulu, HI, USA, 30 November–4 December 2009; pp. 1–6.
3. Saric, Z.; Kukolj, D.; Teslic, N. Acoustic Source Localization in Wireless Sensor Network. Circuits Syst. Signal Process. 2010, 29, 837–856.
4. Kim, Y.; Ahn, J.; Cha, H. Locating acoustic events based on large-scale sensor networks. Sensors 2009, 9, 9925–9944.
5. Cobos, M.; Antonacci, F.; Alexandridis, A.; Mouchtaris, A.; Lee, B. A survey of sound source localization methods in wireless acoustic sensor networks. Wirel. Commun. Mob. Comput. 2017, 2017.
6. Sheng, X.; Hu, Y.H. Sequential acoustic energy based source localization using particle filter in a distributed sensor network. In Proceedings of the 2004 IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, QC, Canada, 17–21 May 2004; Volume 3, pp. 972–975.
7. Sheng, X.; Hu, Y.H. Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks. IEEE Trans. Signal Process. 2005, 53, 44–53.
8. Meng, W.; Xiao, W. Energy-based acoustic source localization methods: A survey. Sensors 2017, 17, 376.
9. Chang, S.; Li, Y.; He, Y.; Wu, Y. RSS-based target localization in underwater acoustic sensor networks via convex relaxation. Sensors 2019, 19, 2323.
10. Berman, Z. A reliable maximum likelihood algorithm for bearing-only target motion analysis. In Proceedings of the 36th IEEE Conference on Decision and Control, San Diego, CA, USA, 12 December 1997; Volume 5, pp. 5012–5017.
11. Doğançay, K. Bearings-only target localization using total least squares. Signal Process. 2005, 85, 1695–1710.
12. Navidi, W.; Murphy, W.; Hereman, W. Statistical Methods in Surveying by Trilateration. Comput. Stat. Data Anal. 1998, 27, 209–227.
13. Chan, Y.; Ho, K. A Simple and Efficient Estimator for Hyperbolic Location. IEEE Trans. Signal Process. 1994, 42, 1905–1915.
14. Gillette, M.; Silverman, H. A Linear Closed-Form Algorithm for Source Localization From Time-Differences of Arrival. IEEE Signal Process. Lett. 2008, 15, 1–4.
15. Bordoy, J.; Schott, D.J.; Xie, J.; Bannoura, A.; Klein, P.; Striet, L.; Hoeflinger, F.; Haering, I.; Reindl, L.; Schindelhauer, C. Acoustic Indoor Localization Augmentation by Self-Calibration and Machine Learning. Sensors 2020, 20, 1177.
16. DiBiase, J.H.; Silverman, H.F.; Brandstein, M.S. Robust localization in reverberant rooms. In Microphone Arrays; Springer: Berlin/Heidelberg, Germany, 2001; pp. 157–180.
17. Do, H.; Silverman, H.; Yu, Y. A Real-Time SRP-PHAT Source Location Implementation using Stochastic Region Contraction (SRC) on a Large-Aperture Microphone Array. In Proceedings of the 2007 IEEE International Conference on Acoustics, Speech and Signal Processing—ICASSP ’07, Honolulu, HI, USA, 15–20 April 2007; Volume 1, pp. 121–124.
18. Cobos, M.; Marti, A.; Lopez, J.J. A Modified SRP-PHAT Functional for Robust Real-Time Sound Source Localization With Scalable Spatial Sampling. IEEE Signal Process. Lett. 2011, 18, 71–74.
19. Marti, A.; Cobos, M.; Lopez, J.; Escolano, J. A steered response power iterative method for high-accuracy acoustic source localization. J. Acoust. Soc. Am. 2013, 134, 2627–2630.
20. Traa, J.; Wingate, D.; Stein, N.; Smaragdis, P. Robust Source Localization and Enhancement With a Probabilistic Steered Response Power Model. IEEE/ACM Trans. Audio Speech Lang. Process. 2015, 24, 1.
21. Cobos, M.; Garcia-Pineda, M.; Arevalillo-Herráez, M. Steered Response Power Localization of Acoustic Pass-Band Signals. IEEE Signal Process. Lett. 2017, 24, 717–721.
22. Ritu; Dhull, S. Iterative Volumetric Reduction (IVR) Steered Response Power Method for Acoustic Source Localization. Int. J. Sens. Wirel. Commun. Control 2020, 10.
23. Knapp, C.; Carter, G. The Generalized Correlation Method for Estimation of Time Delay. IEEE Trans. Acoust. Speech Signal Process. 1976, 24, 320–327.
24. Brutti, A.; Omologo, M.; Svaizer, P. Speaker localization based on oriented global coherence field. In Proceedings of the Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, 17–21 September 2006.
25. Brutti, A.; Omologo, M.; Svaizer, P. Localization of multiple speakers based on a two step acoustic map analysis. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 30 March–4 April 2008; pp. 4349–4352.
26. Salvati, D.; Drioli, C.; Foresti, G. Exploiting a geometrically sampled grid in the steered response power algorithm for localization improvement. J. Acoust. Soc. Am. 2017, 141, 586–601.
27. Zotkin, D.N.; Duraiswami, R. Accelerated speech source localization via a hierarchical search of steered response power. IEEE Trans. Speech Audio Process. 2004, 12, 499–508.
28. Khanal, S.; Silverman, H.F. Multi-stage rejection sampling (MSRS): A robust SRP-PHAT peak detection algorithm for localization of cocktail-party talkers. In Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 18–21 October 2015; pp. 1–5.
29. Nunes, L.O.; Martins, W.A.; Lima, M.V.; Biscainho, L.W.; Gonçalves, F.M.; Said, A.; Lee, B. A steered-response power algorithm employing hierarchical search for acoustic source localization using microphone arrays. IEEE Trans. Signal Process. 2014, 62, 5171–5183.
30. Cobos, M.; Antonacci, F.; Comanducci, L.; Sarti, A. Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 1270–1281.
31. Tian, Z.; Liu, W.; Ru, X. Multi-Target Localization and Tracking Using TDOA and AOA Measurements Based on Gibbs-GLMB Filtering. Sensors 2019, 19, 5437.
32. Kaplan, L.; Le, Q.; Molnár, N. Maximum likelihood methods for bearings-only target localization. In Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, Salt Lake City, UT, USA, 7–11 May 2001; Volume 5, pp. 3001–3004.
33. Griffin, A.; Alexandridis, A.; Pavlidi, D.; Mastorakis, Y.; Mouchtaris, A. Localizing multiple audio sources in a wireless acoustic sensor network. Signal Process. 2015, 107, 54–67.
Figure 1. Comparison of the ideal steered response power (SRP)-based source localization in an ideal case and with the unexpected effects (the symbols “o” and “+” represent the source position and the estimated position, respectively): (a) SRP map (3D view); (b) Ideal SRP map (2D view); (c) defocus effect from steering time uncertainties; (d) undersampled effect from coarse grid; (e) rippling effect from band-pass generalized cross-correlations (GCCs); (f) combined effect.
Figure 2. Illustration of the level-pass area $M(\alpha, \mathbf{x}_s)$. (Orange: $M(0.3, \mathbf{x}_s)$; yellow green: $M(0.2, \mathbf{x}_s)$; celeste: $M(0.1, \mathbf{x}_s)$).
Figure 3. An example of $R_0(\tau)$.
Figure 4. An example of refined GCC from field data: (a) GCC-Phase Transform (PHAT); (b) refined GCC.
Figure 5. Simulation comparison in the cumulative distribution function (CDF) of relative distance error. (a) small steering time difference of arrival (TDOA) uncertainty and small grid distance (STSG); (b) large steering TDOA uncertainty and small grid distance (LTSG); (c) small steering TDOA uncertainty and large grid distance (STLG); (d) large steering TDOA uncertainty and large grid distance (LTLG).
Figure 6. The mean absolute errors (MAEs) under different conditions. (a) small steering TDOA uncertainty (ST) ($\Delta\tau_{max} = 0.1$ ms, $d_g \in [0.1$ m, $50$ m$]$); (b) large steering TDOA uncertainty level (LT) ($\Delta\tau_{max} = 100$ ms, $d_g \in [0.1$ m, $50$ m$]$); (c) small grid distance (SG) ($d_g = 0.1$ m, $\Delta\tau_{max} \in [0.1$ ms, $100$ ms$]$); (d) large grid distance (LG) ($d_g = 10$ m, $\Delta\tau_{max} \in [0.1$ ms, $100$ ms$]$).
Figure 7. Setup of the field experiment (a) Device. (b) Distribution. (c) Estimated power spectrum density of sensor signal 30 m away from source. (d) Estimated signal to noise ratio.
Figure 8. Experiment results: (a) MAE comparison; (b) CDF of relative error at $d_g = 0.1$ m; (c) CDF of relative error at $d_g = 1$ m; (d) CDF of relative error at $d_g = 10$ m.
Table 1. Mean absolute error (MAE) and 95th percentile under different conditions in the simulation.

MAE (m)

| Condition | PS | PES | PM | PEM | WR |
|---|---|---|---|---|---|
| STSG | 0.81 | 0.07 | 1.01 | 0.07 | 0.06 |
| LTSG | 44.53 | 29.27 | 52.04 | 36.37 | 13.16 |
| STLG | 51.90 | 15.39 | 42.97 | 4.07 | 4.46 |
| LTLG | 77.64 | 50.74 | 70.37 | 22.88 | 13.65 |

95th percentile (m)

| Condition | PS | PES | PM | PEM | WR |
|---|---|---|---|---|---|
| STSG | 2.83 | 0.17 | 2.99 | 0.18 | 0.17 |
| LTSG | 123.13 | 82.61 | 128.10 | 118.61 | 33.43 |
| STLG | 147.04 | 58.81 | 124.39 | 7.11 | 9.24 |
| LTLG | 172.37 | 139.73 | 163.95 | 74.07 | 34.68 |
Table 2. Mean absolute error (MAE) and 95th percentile under different conditions in the field experiment.

MAE (m)

| Condition | TC | CSF | CSB | PES | PEM | WR |
|---|---|---|---|---|---|---|
| no grid | 102.2 | - | - | - | - | - |
| $d_g = 0.1$ m | - | 79.2 | 23.5 | 7.1 | 18.7 | 1.4 |
| $d_g = 1$ m | - | 83.0 | 33.0 | 12.6 | 27.4 | 2.0 |
| $d_g = 10$ m | - | 93.3 | 66.0 | 42.6 | 46.1 | 7.2 |

95th percentile (m)

| Condition | TC | CSF | CSB | PES | PEM | WR |
|---|---|---|---|---|---|---|
| no grid | 322.8 | - | - | - | - | - |
| $d_g = 0.1$ m | - | 146.5 | 100.8 | 53.7 | 105.0 | 5.4 |
| $d_g = 1$ m | - | 150.4 | 113.1 | 91.6 | 105.1 | 6.0 |
| $d_g = 10$ m | - | 171.8 | 149.0 | 138.5 | 104.6 | 21.0 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Huang, Y.; Tong, J.; Hu, X.; Bao, M. A Robust Steered Response Power Localization Method for Wireless Acoustic Sensor Networks in an Outdoor Environment. Sensors 2021, 21, 1591. https://doi.org/10.3390/s21051591
