A Robust Steered Response Power Localization Method for Wireless Acoustic Sensor Networks in an Outdoor Environment

Huang, Yiwei; Tong, Jianfei; Hu, Xiaoqing; Bao, Ming

doi:10.3390/s21051591


¹

Key Laboratory of Noise and Vibration Research, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China

²

University of Chinese Academy of Sciences, Beijing 100049, China

^*

Author to whom correspondence should be addressed.

Sensors 2021, 21(5), 1591; https://doi.org/10.3390/s21051591

Submission received: 19 January 2021 / Revised: 19 February 2021 / Accepted: 19 February 2021 / Published: 25 February 2021

(This article belongs to the Special Issue Sensor Fusion and Signal Processing)


Abstract

The localization of outdoor acoustic sources has attracted attention in wireless sensor networks. In this paper, the steered response power (SRP) localization of band-pass signal associated with steering time delay uncertainty and coarser spatial grids is considered. We propose a modified SRP-based source localization method for enhancing the localization robustness in outdoor scenarios. In particular, we derive a sufficient condition dependent on the generalized cross-correlation (GCC) waveform function for robust on-grid source localization and show that the SRP function with GCCs satisfying this condition can suppress the disturbances induced by the grid distance and the uncertain steering time delays. Then a GCC refinement procedure for band-pass GCCs is designed, which uses complex wavelet functions in multiple sub-bands to filter the GCCs and averages the envelopes of the filtered GCCs as the equivalent GCC to match the sufficient condition. Simulation results and field experiments demonstrate the excellent performance of the proposed method against the existing SRP-based methods.

Keywords:

source localization; wireless acoustic sensor networks; steered response power; generalized cross-correlation

1. Introduction

With the rapid development of communication technology and mobile computing devices, applications of wireless acoustic sensor networks (WASNs) are becoming popular in acoustic signal processing. Particularly, WASN-based sound source localization has captured researchers’ attention in the last two decades [1,2,3,4,5]. The existing methods available for passive source localization in WASNs include (1) the received energy-based approaches [6,7,8,9]; (2) the direction of arrival (DOA)-based approaches [10,11]; (3) the time of arrival (TOA)-based approaches [12]; (4) the time difference of arrival (TDOA)-based approaches [13,14,15] and (5) the steered response power (SRP)-based approaches [16,17,18,19,20,21,22].

Most methods require a pre-processing stage in which specific modalities are measured from sensor signals before the location-estimating stage. In contrast, the SRP-based approaches locate the source position or direction by maximizing the power of spatially steered filter and sum beamformer of a group of sensors and contain only one decision step in processing sensor signals to estimate location. Without information compression and disturbances resulting from partial mistakes in the front-end stage, the SRP-based solutions can usually yield more robust performance in noisy and reverberant acoustic environments. Practical implementations commonly use the generalized cross-correlation [23]-based form of the SRP function [16] to reduce computation. The methods similar to the GCC-expression of SRP function are also called a “global coherence field (GCF)” in several references [24,25].

In practice, the primary constraint of the SRP-based approaches is the time-consuming on-grid searching procedure for finding their global maxima. Hence, reducing the computational cost of the SRP-based approaches has been an active research topic. In [17], a stochastic region contraction (SRC) method is proposed to avoid global grid searching. However, this strategy also causes information loss. In [26], a geometrically sampled grid set based on the TDOA gradient is proposed to improve the SRP performance. An alternative strategy for the high-cost searching problem is to adapt the SRP function to the grid resolution so that a coarse or hierarchical search can be applied. In [27], the authors use the low-frequency component of the GCC for coarse grid resolution and the high-frequency component for fine grids in SRP-based DOA estimation. In [28], the authors apply a Gaussian low-pass filter to the GCC for coarse grids. For full-band signals, a similar modification is proposed both for microphone arrays [29] and for WASNs [18,19], in which the spatial spectrum of a given grid is calculated from the sum of the phase transform (PHAT)-weighted GCC (GCC-PHAT) values within a time window containing the TDOA values in the volume surrounding the grid, instead of the original GCC-PHAT in the SRP function.

The SRP-based approaches can provide a robust solution in DOA estimation and source localization tasks in confined spaces. However, they could lose their robustness in an outdoor WASN scenario due to the combined effect of the following factors. (1) Grid size: the monitoring area in outdoor cases may be much more extensive than in indoor applications, so the practical searching grids are much coarser (e.g., meter-level grids outdoors compared with centimeter-level grids indoors). (2) Steering time delay uncertainty: in the classical SRP-based localization framework, the steering time delay at a given position is generated from an ideal propagation model and is assumed to be exact. However, the steering time delay to the source position differs from the actual propagation time. Such a difference is no longer negligible in the outdoor environment and causes a defocus effect, even when the WASN system is well synchronized. (3) Signal passband: when processing the acoustic data collected in outdoor environments, high-pass or band-pass filtering is indispensable because the environmental noise is intense in the low-frequency range, and the source signals in the real world often possess the band-pass characteristic. The combined effect of these three factors makes it difficult to achieve stable localization results. The Modified-SRP functional (MSRP) method introduced in [18,19] provides an elegant solution for scalable grids, but it is not suitable for band-pass signals. In [21], the authors elaborate on the SRP in band-pass situations and use the GCC-PHAT envelope or frequency-shifted GCC-PHAT to enhance the robustness in such situations. Nevertheless, the above two methods hardly consider the other factors (the grid and the steering time uncertainty). In [30], the authors propose a Frequency-Sliding GCC (FSGCC) method, which uses singular value decomposition (SVD) or weighted SVD (WSVD) on the FSGCC matrix and can extract the time delay information of the source signal from multiple sub-band GCCs. The authors apply the WSVD-FSGCC to the MSRP functional for source localization. This solution can provide excellent localization performance in the band-pass situation with scalable grids. However, in outdoor applications, the high computational cost of the SVD of very large matrices is inevitable due to the long GCC range.

Previously, several acoustic source localization approaches have been proposed for outdoor scenarios. They mostly focus on localizing the source from TDOA [31] and DOA [32,33] measurements. Uncertainties are then introduced by the estimation error of the TDOA or DOA estimation algorithms. Moreover, some useful information is also compressed, which results in unstable performance. Motivated by this, this paper discusses a robust SRP-based outdoor source localization problem.

In this paper, a modified SRP-based method is proposed, in which the systematic influence of the above inevitable factors in outdoor WASNs scenarios is considered. The localization performance is analyzed using the normalized contribution of the signal components in the SRP function. A sufficient condition dependent on the GCC waveform function for robust on-grid SRP-based source localization is derived by geometrical analysis. The SRP function using GCCs satisfying this condition can suppress the disturbances induced by the grid distance and the uncertain steering time delay. A GCC refinement procedure for band-pass GCCs is then designed, which uses the complex wavelet functions in multiple sub-bands to filter the GCC and averages the envelopes of the filtered GCCs as the equivalent GCC to match the sufficient condition. Simulation results and field experiments demonstrate the excellent performance of the proposed method against the existing SRP-based methods.

The rest of this paper is organized as follows. In Section 2, the outdoor SRP-based source localization problem is formulated. Section 3 gives the sufficient condition in brief and introduces the GCC refinement procedure. The results of the simulation and the field experiment are presented in Section 4. Conclusions are given in Section 5.

2. SRP-Based Localization in Outdoor Acoustic Sensor Network

2.1. System Models

We discuss the acoustic source localization problem in an N-dimensional Euclidean space with M distributed microphones (

M > N

). Let

x \in R^{N}

be a spatial coordinate vector. Specifically, define

x_{s}

as the source location and

z_{m}

as the position of the

m\text{-th}

sensor (

m = 1, 2, \dots, M

). Let

s (t)

be the source signal in the time domain, and the received signal of the microphone at

z_{m}

can be modeled as

y_{m} [n] = [h_{m} (t) * s (t) + w_{m} (t)] δ (t - n / F_{s}),

(1)

where

h_{m} (t)

is the impulse response function representing the propagation of sound from

x_{s}

to

z_{m}

, the operator “∗” represents the convolution operation,

w_{m} (t)

stands for the additive noise signal, and

δ (t - n / F_{s})

denotes the sampling process at rate

F_{s}

. When the multi-path delay and non-linear distortion are neglected, the propagation function in the frequency domain can be simplified as

H_{m} (ω) = A_{m} e^{- j ω t_{m}},

(2)

where

A_{m} \in R

is the amplitude-attenuation factor and

t_{m}

is the time delay factor. In the frequency domain Equation (1) can be denoted as

Y_{m} (Ω) = A_{m} S (Ω) e^{- j Ω F_{s} t_{m}} + W_{m} (Ω),

(3)

where

Ω = ω / F_{s} \in [- π, π]

is the normalized angular frequency,

Y_{m} (Ω)

is the discrete-time Fourier transform (DTFT) of

y_{m} [n]

,

S (Ω)

and

W_{m} (Ω)

are the Fourier transforms of

s (t)

and

w_{m} (t)

, respectively.

Let

η_{m} (x) \in R

be the steering time delay function describing the time delay associated with sound propagation from a given location

x

to

z_{m}

. In practice, it is commonly modeled as the sound traveling time going through the line-of-sight (LOS) path with a constant sound speed

v_{s}

; i.e.,

η_{m} (x) = ∥ x - z_{m} ∥ / v_{s},

(4)

where “

∥.∥

” denotes the Euclidean distance. Note that

η_{m} (x)

does not exactly equal the actual sound propagation time. The SRP function, defined as the output power of the filter-and-sum beamformer, is then given by:

P (x) = \int_{- π}^{π} {\left|\sum_{m = 1}^{M} G_{m} (Ω) Y_{m} (Ω) e^{j Ω F_{s} η_{m} (x)}\right|}^{2} d Ω,

(5)

where

G_{m} (Ω) e^{j Ω F_{s} η_{m} (x)}

is the filter associated with the

m\text{-th}

sensor. It can be equivalently expressed in terms of GCCs [16]:

P (x) = 2 π \sum_{l = 1}^{M} \sum_{m = 1}^{M} R_{l, m} (η_{m} (x) - η_{l} (x)),

(6)

where

R_{l, m} (τ) = \frac{1}{2 π} \int_{- π}^{π} Ψ_{l, m} (Ω) Y_{l} (Ω) Y_{m}^{*} (Ω) e^{j Ω F_{s} τ} d Ω

(7)

denotes the GCC of the sensor pair

{l, m}

,

τ

is the time lag, superscript “

{(.)}^{*}

” represents the conjugate operation, and

Ψ_{l, m} (Ω) = G_{l} (Ω) G_{m}^{*} (Ω)

denotes the weight function of the associated GCC. Ideally, each

R_{l, m} (τ)

achieves its peak at

τ = t_{m} - t_{l}

so that the SRP function is supposed to achieve its maximum value at the source position

x_{s}

, as shown in Figure 1a,b. The Phase Transform (PHAT) weight function

Ψ_{l, m}^{P H A T} (Ω) = 1 / |Y_{l} (Ω) Y_{m}^{*} (Ω)|

(8)

is widely used in the TDOA- and SRP-based localization applications. The PHAT-weighted GCC is generally referred to as the GCC-PHAT, and the SRP using the GCC-PHAT is generally referred to as the SRP-PHAT.
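As a minimal sketch of Equations (7) and (8), the GCC-PHAT of one sensor pair can be computed with FFTs. The function name `gcc_phat`, the regularization constant `eps`, and the zero-padding length are illustrative assumptions and not part of the original method description. The conjugation order of the cross-spectrum is chosen here so that the peak appears at the lag t_m − t_l (in samples) stated in the text.

```python
import numpy as np

def gcc_phat(y_l, y_m, eps=1e-12):
    """Return (lags in samples, GCC-PHAT values) for two signals at the same rate."""
    n = len(y_l) + len(y_m)                 # zero-pad to avoid circular wrap-around
    Y_l = np.fft.rfft(y_l, n)
    Y_m = np.fft.rfft(y_m, n)
    cross = np.conj(Y_l) * Y_m              # ordering assumed so the peak is at t_m - t_l
    cross /= np.abs(cross) + eps            # PHAT weighting, Eq. (8)
    r = np.fft.fftshift(np.fft.irfft(cross, n))
    lags = np.arange(-(n // 2), n - n // 2) # lag axis with zero lag in the centre
    return lags, r

# Hypothetical check: sensor m receives the same noise burst 25 samples later.
rng = np.random.default_rng(0)
s = rng.standard_normal(4096 + 25)
y_l, y_m = s[25:], s[:4096]
lags, r = gcc_phat(y_l, y_m)
delay = lags[np.argmax(r)]
```

Dividing `delay` by the sampling rate gives the TDOA estimate in seconds for that pair.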

Removing those irrelevant and repetitive terms in Equation (6), the effective component for source localization can be simplified as

P_{E} (x) = \sum_{l = 1}^{M - 1} \sum_{m = l + 1}^{M} R_{l, m} (η_{m} (x) - η_{l} (x)) = \sum_{p = 1}^{C_{M}^{2}} R_{p} (τ_{p} (x)),

(9)

where p is the sequence number of the valid sensor pair

c_{p} = {l, m} (l < m)

and is deduced to be

p = (2 M - l) (l - 1) / 2 + m - l

, varying from one to a combinatorial number

C_{M}^{2}

;

τ_{p} (x) = η_{m} (x) - η_{l} (x)

and can be referred to as the steering TDOA function.
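The pair numbering and the steering TDOA function of Equation (9) can be illustrated with a short sketch. The helper names `pair_index` and `steering_tdoas` are hypothetical, and the sound speed of 345 m/s is the value used later in the simulations.

```python
import numpy as np
from itertools import combinations

def pair_index(l, m, M):
    """Sequence number p of the pair {l, m} with 1 <= l < m <= M (1-based)."""
    return (2 * M - l) * (l - 1) // 2 + m - l

def steering_tdoas(x, sensors, v_s=345.0):
    """tau_p(x) = eta_m(x) - eta_l(x) for all pairs, ordered by p."""
    eta = np.linalg.norm(sensors - x, axis=1) / v_s        # Eq. (4), line-of-sight model
    pairs = combinations(range(len(sensors)), 2)           # 0-based (l, m) with l < m
    return np.array([eta[m] - eta[l] for l, m in pairs])

# The formula enumerates the pairs in lexicographic order.
M = 4
order = [pair_index(l, m, M) for l, m in combinations(range(1, M + 1), 2)]
```

For M sensors this yields C(M, 2) steering TDOAs per candidate position, one per term of the sum in Equation (9).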

2.2. Problem Formulation

The classical SRP-based localization method often lacks robustness in outdoor scenarios. The steering time delay function

η_{m} (x)

in the SRP function is different from the sound propagation in reality denoted as

η_{m}^{0} (x)

, and

Δ η_{m} (x) = η_{m} (x) - η_{m}^{0} (x)

is denoted as the steering time-uncertainty function. Similarly, the steering TDOA-uncertainty functions in a pair of sensors can be expressed as

Δ τ_{p} (x) = Δ η_{m} (x) - Δ η_{l} (x) = τ_{p} (x) - τ_{p}^{0} (x),

(10)

where

τ_{p}^{0} (x) = η_{m}^{0} (x) - η_{l}^{0} (x)

, representing the real steering TDOA function for a given sensor pair

c_{p}

. This term is usually negligible within a confined space, so it has been rarely discussed in classical SRP models. However, in outdoor applications, the sound propagation is much more unpredictable, resulting in enlarged uncertainty with the increase in distances. The steering time uncertainty can easily be influenced by the geography, temperature, wind, and self-localization error among sensors, and then yields a noticeable defocus effect on the SRP map, as shown in Figure 1c. The GCCs would intersect with each other dispersedly around

x_{s}

.

Since the spatial spectrum generated by the SRP function contains many local extrema and ridged areas, the maximal value of

P (x)

is usually found through a grid-searching process. Consider a uniform sampling grid (USG) case in

R^{N}

. Define

X_{g}

as the set of grid points in the candidate searching region (

V \in R^{N}

), and

d_{g} \in R

,

N_{g} \in R

as the grid distance and the total number of the grids in

X_{g}

, respectively, then the estimated on-grid location is formulated as

{\hat{x}}_{s} = arg max_{x \in X_{g}} P (x) = arg max_{x \in X_{g}} P_{E} (x) .

(11)

Note that the localization precision depends on the grid resolution. A more accurate estimation usually requires a smaller

d_{g}

. This leads to a larger

N_{g}

and significantly increased calculation burden because the number of grids is inversely proportional to the

N\text{-th}

power of

d_{g}

(i.e.,

N_{g} \propto {(d_{g})}^{- N}

). Hence, accuracy and feasibility can hardly be balanced in an outdoor WASN system with a large search region, where the finest grid resolution allowed by the available computing power is much coarser than in indoor applications. However, most SRP approaches work well only at fine grid resolutions, and a coarser grid resolution causes an undersampling effect, as shown in Figure 1d: the searching process may miss the source peak.
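The scaling of the grid count with the grid distance can be made concrete with a small sketch. The helper `uniform_grid` and the 200 m by 200 m search box are assumptions chosen for illustration (the box matches the monitored area used in the simulations).

```python
import numpy as np

def uniform_grid(lower, upper, d_g):
    """All grid points with spacing d_g inside the box [lower, upper] (inclusive)."""
    axes = [np.arange(lo, hi + 0.5 * d_g, d_g) for lo, hi in zip(lower, upper)]
    mesh = np.meshgrid(*axes, indexing="ij")
    return np.stack([m.ravel() for m in mesh], axis=1)

# Halving d_g in N = 2 dimensions roughly quadruples the number of grid points.
for d_g in (4.0, 2.0, 1.0):
    X_g = uniform_grid((0.0, 0.0), (200.0, 200.0), d_g)
    print(f"d_g = {d_g:3.1f} m -> N_g = {len(X_g)}")
```

Since the SRP function is evaluated once per grid point and sensor pair, the search cost grows in the same proportion.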

It is known that the background noise always dominates at low frequencies in the field environment, and real sound sources often show band-pass characteristics. Thus a band-pass GCC is indeed required. However, the SRP-PHAT with a band-pass source would cause a rippling effect [21], as shown in Figure 1e. The rippling effect does not alter the location of the maximal value of the SRP function. However, it may lead to local extrema and even fake peaks such that the SRP spectrum is susceptible to the two other factors and shows a lack of robustness.

Under the combined influence of the above factors, the real-world SRP output is illustrated in Figure 1f. It shows that classical SRP implementations can hardly deal with all these factors outdoors and yield a divergent localization result.

3. A Robust Outdoor SRP-Based Source Localization Method

3.1. On-Grid SRP-Based Localization Error Bound Condition

The SRP-based spatial spectra mainly depend on the phase information of the source components. It is reasonable to assume that the additive noise signals of the sensors are independent of each other and of the source signal, so that they have no spatial preference (i.e., they have zero mean in the phase domain). Their contributions to the SRP spectrum can therefore be neglected and are not related to the grid resolution or the steering time uncertainty. Hence, only the contribution of the source signal is considered in analyzing the SRP function. With the terms of additive noise

w_{m} (t)

neglected, the weight functions

Ψ_{p} (Ω)

of the sensor pair

c_{p}

usually can be expressed as

Ψ_{p} (Ω) = B_{p} Ψ_{0} (Ω),

(12)

where

B_{p} \in R

is an amplitude-scaling factor irrelevant to the frequency, and

Ψ_{0} (Ω) = Ψ_{0} (- Ω) \in R

is a real function irrelevant to sensors. Substituting Equation (12) into Equation (7), the GCC

R_{p} (τ)

can be rewritten as

\begin{matrix} R_{p} (τ) & = \frac{B_{p} A_{l} A_{m}}{2 π} \int_{- π}^{π} Ψ_{0} (Ω) S (Ω) S^{*} (Ω) e^{j Ω F_{s} (τ - τ_{p}^{0} (x_{s}))} d Ω \\ & = \frac{B_{p} A_{l} A_{m} C_{0}}{2 π} R_{0} (τ - τ_{p}^{0} (x_{s})), \end{matrix}

(13)

where

C_{0} = \max_{τ} \left|\int_{- π}^{π} Ψ_{0} (Ω) S (Ω) S^{*} (Ω) e^{j Ω F_{s} τ} d Ω\right|

, and

R_{0} (τ) = \frac{1}{C_{0}} \int_{- π}^{π} Ψ_{0} (Ω) S (Ω) S^{*} (Ω) e^{j Ω F_{s} τ} d Ω,

(14)

is the amplitude-normalized version of the weighted self-correlation function of the source signal

s (t)

. Hence, each GCC contains the same waveform function

R_{0} (τ)

with different time-shifting factors

τ_{p}^{0} (x_{s})

and amplitude factors

B_{p} A_{l} A_{m} C_{0} / (2 π)

. In practice, the range information in amplitude is usually less stable or accurate than in time delay. Thus, a normalized mapping function representing the contribution of the source component in the SRP function can be constructed as

F_{E} (x, x_{s}) = \frac{1}{C_{M}^{2}} \sum_{p = 1}^{C_{M}^{2}} R_{0} (τ_{p} (x) - τ_{p}^{0} (x_{s})) .

(15)

In the above equation, the amplitude factors

B_{p} A_{l} A_{m} C_{0} / (2 π)

between different sensor pairs are removed. Thus, each pair yields an equal contribution to the SRP function. Note that

F_{E} (x, x_{s}) \in [- 1, 1]

has a definite value range regardless of the sensor number M.

For a given grid distance

d_{g} \in R_{> 0}

, an arbitrary uniform sampling grid set in

R^{N}

can be expressed as

X (d_{g}, x_{g}^{o}) = \{x + x_{g}^{o} : x = {[n_{1} d_{g}, \dots, n_{N} d_{g}]}^{T}; n_{1}, \dots, n_{N} \in Z\},

(16)

where

x_{g}^{o} \in R^{N}

is the position of the origin of the set. Then the on-grid location estimation is given by

\begin{matrix} {\hat{x}}_{s}^{g} & = arg max_{x \in X (d_{g}, x_{g}^{o})} F_{E} (x, x_{s}) \\ & = arg max_{x \in X (d_{g}, x_{g}^{o})} \frac{1}{C_{M}^{2}} \sum_{p = 1}^{C_{M}^{2}} R_{0} (τ_{p} (x) - τ_{p} (x_{s}) + Δ τ_{p} (x_{s})) . \end{matrix}

(17)

It is worth pointing out that the grid resolution, the steering time uncertainty, and band-pass issues are comprehensively considered in the above-simplified SRP function.

The grid issue should be unrelated to the origin position

x_{g}^{o}

. In the real world, the uncertainty functions

Δ τ_{p} (x)

are hard to closely describe due to many interference factors, and it is reasonable to assume that they have an upper bound

Δ τ_{m a x}

(i.e.,

|Δ τ_{p} (x)| \leq Δ τ_{m a x}

).

Δ τ_{m a x}

indicates the steering time delay uncertainty level and can be estimated from the environmental and devices’ conditions. Thus, the robustness of the on-grid localization problem can be described as: given a

d_{g}

and a

Δ τ_{m a x}

, there exists a

ε \in (0, \infty)

such that

∥ {\hat{x}}_{s}^{g} - x_{s} ∥ \leq ε .

(18)

Define a level-passed area based on

F_{E} (x, x_{s})

:

M (α, x_{s}) ≜ \{x : F_{E} (x, x_{s}) \geq α\} \subseteq R^{N},

(19)

where

α \in R

is the level-pass threshold. Then a sufficient condition can be obtained in the following Proposition:

Proposition 1.

If

M (α, x_{s}) \cap X (d_{g}, x_{g}^{o}) \neq \emptyset

and

M (α, x_{s})

is a bounded set (i.e., there exists a

ε_{M} \in (0, \infty)

such that

∥ x_{1} - x_{2} ∥ \leq ε_{M}

for all

x_{1}, x_{2} \in M (α, x_{s})

), then Inequality (18) is satisfied.

The proof is given in Appendix A.1. Thus, the robustness of the on-grid source localization problem can be analyzed in terms of

M (α, x_{s})

.

A practical example of

M (α, x_{s})

is depicted in Figure 2, and its area shrinks inwards when

α

increases. The first sub-condition (

M (α, x_{s}) \cap X (d_{g}, x_{g}^{o}) \neq \emptyset

) can be satisfied when

M (α, x_{s})

covers enough areas. The shape of

M (α, x_{s})

relates to

α

,

R_{0} (τ)

,

Δ τ_{p} (x)

, and sensor distribution, and it is generally irregular. Consider a closed ball

B^{N} (x_{0}, r) ≜ \{x : ∥ x - x_{0} ∥ \leq r; x_{0}, x \in R^{N}\}

with center

x_{0}

and radius r. If

r \geq d_{g} \sqrt{N} / 2,

(20)

then

B^{N} (x_{0}, r) \cap X (d_{g}, x_{g}^{o}) \neq \emptyset

is satisfied. The proof can be seen in Appendix A.2. Consequently, if

B^{N} (x_{s}, d_{g} \sqrt{N} / 2) \subseteq M (α, x_{s})

, then the first sub-condition is satisfied.
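The bound in Inequality (20) can be checked numerically with a brief sketch: the nearest point of a uniform grid is never farther than d_g √N / 2 from an arbitrary center, because each coordinate is off by at most d_g / 2. The helper `nearest_grid_point` and the random test points are illustrative assumptions.

```python
import numpy as np

def nearest_grid_point(x0, d_g, origin):
    """Nearest point of the grid X(d_g, origin) to x0, found by per-axis rounding."""
    return origin + d_g * np.round((x0 - origin) / d_g)

rng = np.random.default_rng(1)
N, d_g = 3, 2.0
origin = rng.uniform(-1.0, 1.0, N)                 # arbitrary grid origin x_g^o
centers = rng.uniform(-100.0, 100.0, (1000, N))    # arbitrary ball centers x_0
dists = np.array([np.linalg.norm(c - nearest_grid_point(c, d_g, origin))
                  for c in centers])
worst = dists.max()                                # never exceeds d_g * sqrt(N) / 2
```

Any closed ball with radius at least d_g √N / 2 therefore contains a grid point, regardless of the grid origin.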

Figure 3 illustrates a typical waveform of

R_{0} (τ)

, the GCC-PHAT of the passband

[Ω_{C} - Ω_{B}, Ω_{C} + Ω_{B}] \subset (0, π]

, which can be expressed by

R_{0}^{P H A T - B P} (τ) = sinc (\frac{Ω_{B} F_{s}}{π} τ) cos (Ω_{C} F_{s} τ) .

(21)
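A short sketch of Equation (21) shows why the band-pass GCC-PHAT has a narrow main-lobe. The sampling rate and band parameters below are hypothetical values chosen for illustration; `np.sinc` is the normalized sinc, sin(πx)/(πx), which matches the argument used in Equation (21).

```python
import numpy as np

def r0_phat_bp(tau, omega_c, omega_b, fs):
    """Band-pass GCC-PHAT waveform of Eq. (21); tau in seconds, Omega in rad/sample."""
    return np.sinc(omega_b * fs * tau / np.pi) * np.cos(omega_c * fs * tau)

fs = 8000.0                                   # hypothetical sampling rate in Hz
omega_c, omega_b = 0.5, 0.2                   # hypothetical band centre and half-width
tau = np.linspace(-0.01, 0.01, 2001)
r0 = r0_phat_bp(tau, omega_c, omega_b, fs)

# The cosine carrier first crosses zero at pi / (2 * Omega_C * F_s).
first_zero = np.pi / (2 * omega_c * fs)
```

With these assumed values the carrier crosses zero after roughly 0.4 ms, which corresponds to a path difference of about 0.14 m at 345 m/s and is far smaller than a meter-level grid distance.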

A valid

R_{0} (τ)

is an even and bounded function (i.e.,

R_{0} (τ) = R_{0} (- τ)

and

R_{0} (τ) \in [- 1, 1]

) and contains a main-lobe around

τ = 0

, where its maximum

a_{m}

lies. The maximum side-lobe height (or the maximum value outside the main-lobe area if

R_{0} (τ)

has no side-lobes) can be denoted as

a_{s}

, where

a_{s} < a_{m}

.

Let us define a function based on

R_{0} (τ)

by

T_{R} (a_{T}) ≜ \inf \{| τ | : R_{0} (τ) < a_{T}\},

(22)

where

a_{T} \in [a_{s}, a_{m}]

is the level-pass threshold of GCC, “

inf {.}

” represents the infimum.

T_{R} (a_{T})

represents the half-width of the level-passed section of

R_{0} (τ)

within its main-lobe. It follows that

R_{0} (τ) \geq a_{T}

if and only if

τ \in (- T_{R} (a_{T}), T_{R} (a_{T}))

.
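The half-width function of Equation (22) can be approximated on a sampled lag axis, as in the following sketch. The function name, the scan range `tau_max`, and the step `d_tau` are assumptions; the Gaussian test waveform is used only because its half-width is known in closed form.

```python
import numpy as np

def level_pass_half_width(r0, a_t, tau_max, d_tau):
    """Approximate T_R(a_T) = inf{|tau| : R_0(tau) < a_T}.

    R_0 is even, so only tau >= 0 is scanned. Returns inf if R_0 never
    drops below a_T within [0, tau_max).
    """
    taus = np.arange(0.0, tau_max, d_tau)
    below = r0(taus) < a_t
    return taus[np.argmax(below)] if below.any() else np.inf

# Gaussian waveform exp(-(k tau)^2): the exact half-width is sqrt(-ln a_T) / k.
k = 100.0
gauss = lambda tau: np.exp(-(k * tau) ** 2)
t_r = level_pass_half_width(gauss, 0.5, tau_max=0.1, d_tau=1e-6)
```

The result is accurate to within one lag step `d_tau`.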

Based on a geometrical analysis in Appendix A.3, if

R_{0} (τ)

possesses the following property:

T_{R} (α) \geq d_{g} \sqrt{N} / v_{s} + Δ τ_{m a x},

(23)

then

M (α, x_{s}) \supset B^{N} (x_{s}, d_{g} \sqrt{N} / 2)

. Therefore, the first sub-condition can be satisfied.

For all

α

such that

α > {max}_{∥ x ∥ \to + \infty} {F_{E} (x, x_{s})}

, the second sub-condition (

M (α, x_{s})

is a bounded set) is satisfied. The area of

M (α, x_{s})

is mainly the superposition of the projection area of the main-lobe sections of GCCs belonging to individual sensor pairs. Denote

Λ_{p} (τ^{c}, T) = \{x : | τ_{p} (x) - τ^{c} | \leq T\}

to be the projection area of the TDOA section

[τ^{c} - T, τ^{c} + T]

of sensor pair

c_{p}

, where

T \in [0, \infty)

and

τ^{c} \in [- τ_{p}^{m a x}, τ_{p}^{m a x}]

are the half-width and the central TDOA of the section, respectively, and

τ_{p}^{m a x} = ∥ z_{l} - z_{m} ∥ / v_{s}

is the maximal TDOA value that this sensor pair can produce.

For each sensor pair

c_{p}

, the solution set of the half hyperbolic equation

τ_{p} (x) = τ^{c}

can be denoted as

Λ_{p} (τ^{c}, 0)

and extends to infinity (i.e., there exists an

x

such that

∥ x ∥ = \infty

and

x \in Λ_{p} (τ^{c}, 0)

). For two different sensor pairs

c_{i}

and

c_{j}

, if there exist a

τ_{i}^{c} \in [- τ_{i}^{m a x}, τ_{i}^{m a x}]

and a

τ_{j}^{c} \in [- τ_{j}^{m a x}, τ_{j}^{m a x}]

such that

Λ_{i} (τ_{i}^{c}, 0) \subseteq Λ_{j} (τ_{j}^{c}, 0)

or

Λ_{i} (τ_{i}^{c}, 0) \supseteq Λ_{j} (τ_{j}^{c}, 0)

, then the half hyperbolic equations

τ_{i} (x) = τ_{i}^{c}

and

τ_{j} (x) = τ_{j}^{c}

are not independent. This case might occur when the sensors of these two pairs are co-linear or have the same axis of symmetry and, at the same time, both

τ_{i}^{c}

and

τ_{j}^{c}

reach their extremum or become zero. In WASNs, this case rarely happens because the sensor distributions are often irregular. Excluding this case for all sensor pairs, the maximal value of

F_{E} (x, x_{s})

at infinity does not exceed a linear combination of

a_{m}

and

a_{s}

, which is given as

α_{i n f} = \frac{C_{N}^{2} a_{m} + (C_{M}^{2} - C_{N}^{2}) a_{s}}{C_{M}^{2}} .

(24)

The detailed derivation can be found in Appendix A.4. If

α > α_{i n f}

, then

M (α, x_{s})

is bounded.

Combining Inequality (23) and Equation (24) together, a sufficient condition for robust on-grid source localization is given by

T_{R} (α_{i n f}) > d_{g} \sqrt{N} / v_{s} + Δ τ_{m a x} .

(25)

It means that for a given grid distance

d_{g}

and steering TDOA uncertainties within

Δ τ_{m a x}

, if the GCC waveform function

R_{0} (τ)

has a wide main-lobe satisfying this condition, then the divergent on-grid location estimation can be avoided.
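Equation (24) and Inequality (25) can be evaluated with a few lines. The function names and the example numbers (eight sensors in a plane, a 2 m grid, 2 ms of TDOA uncertainty) are assumptions for illustration only.

```python
from math import comb, sqrt

def alpha_inf(M, N, a_m, a_s):
    """Upper bound of F_E at infinity, Eq. (24), for M sensors in N dimensions."""
    c_m, c_n = comb(M, 2), comb(N, 2)
    return (c_n * a_m + (c_m - c_n) * a_s) / c_m

def satisfies_condition(t_r_at_alpha_inf, d_g, N, v_s, dtau_max):
    """Inequality (25): is the main-lobe half-width at level alpha_inf wide enough?"""
    return t_r_at_alpha_inf > d_g * sqrt(N) / v_s + dtau_max

# Waveform with unit peak and negligible side-lobes, M = 8 sensors, N = 2.
level = alpha_inf(M=8, N=2, a_m=1.0, a_s=0.0)     # equals 1/28
required = 2.0 * sqrt(2) / 345.0 + 2e-3           # right-hand side of (25) in seconds
```

With these assumed values the main-lobe must stay above the level 1/28 for slightly more than 10 ms on each side of its peak.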

The SRP-PHAT generates a sharp GCC to increase the TDOA resolution for cases with reverberation or multiple sources. However, as shown in Figure 3, the band-pass effect brings a narrow main-lobe section and strong side-lobes to the GCC waveform function, which can hardly satisfy Inequality (25); this is also reflected in the poor performance of SRP-PHAT in Figure 1f. Next, we introduce a GCC waveform refinement procedure for the band-pass SRP.

3.2. Robust SRP-Based Source Localization with Refined GCC Waveform

The condition in Inequality (25) is too strict for band-pass GCC situations with coarse grid resolution and perceptible steering TDOA uncertainties. Some classical GCC methods utilized low-pass filtering to meet a broader main-lobe requirement, but they are not applicable for band-pass signals. In this section, the GCC is refined to obtain a suitable waveform to modify the SRP function.

Consider a complex wavelet function

ψ_{e} (τ, Ω_{C}) = u_{e} (τ) e^{j Ω_{C} F_{s} τ}

, where

u_{e} (τ) \in L^{2} (R)

is an even symmetrical function. Applying

ψ_{e} (τ, Ω_{C})

as the filtering function on the GCC-PHAT, the filtered output of

c_{p}

can be denoted as

R_{p}^{C F} (τ, Ω_{C}) = R_{p}^{P H A T} (τ) * ψ_{e} (τ, Ω_{C}),

(26)

where

R_{p}^{P H A T} (τ)

is the GCC-PHAT of

c_{p}

.

When the real function

u_{e} (τ)

has an effective support

[- Ω_{B}, Ω_{B}] \subset [- π, π]

in the frequency domain, i.e.,

\int_{- \infty}^{\infty} {| U_{e} (Ω) |}^{2} d Ω - \int_{- Ω_{B}}^{Ω_{B}} {| U_{e} (Ω) |}^{2} d Ω ≪ \int_{- Ω_{B}}^{Ω_{B}} {| U_{e} (Ω) |}^{2} d Ω,

(27)

where

U_{e} (Ω)

is the Fourier Transform of

u_{e} (τ)

, and if the source is dominant in the frequency band

[Ω_{C} - Ω_{B}, Ω_{C} + Ω_{B}] \subseteq (0, π]

, then the approximation

\begin{matrix} R_{p}^{C F} (τ, Ω_{C}) & = \frac{1}{2 π} \int_{- \infty}^{\infty} \frac{Y_{p_{l}} (Ω) Y_{p_{m}}^{*} (Ω)}{| Y_{p_{l}} (Ω) Y_{p_{m}}^{*} (Ω) |} U_{e} (Ω - Ω_{C}) e^{j Ω F_{s} τ} d Ω \\ & \approx \frac{1}{2 π} \int_{Ω_{C} - Ω_{B}}^{Ω_{C} + Ω_{B}} \frac{Y_{p_{l}} (Ω) Y_{p_{m}}^{*} (Ω)}{| Y_{p_{l}} (Ω) Y_{p_{m}}^{*} (Ω) |} U_{e} (Ω - Ω_{C}) e^{j Ω F_{s} τ} d Ω \\ & \approx \frac{1}{2 π} \int_{Ω_{C} - Ω_{B}}^{Ω_{C} + Ω_{B}} e^{- j Ω F_{s} τ_{p}^{0} (x_{s})} U_{e} (Ω - Ω_{C}) e^{j Ω F_{s} τ} d Ω \\ & \approx \frac{1}{2 π} \int_{- \infty}^{\infty} e^{- j Ω F_{s} τ_{p}^{0} (x_{s})} U_{e} (Ω - Ω_{C}) e^{j Ω F_{s} τ} d Ω \\ & = u_{e} (τ - τ_{p}^{0} (x_{s})) e^{j Ω_{C} F_{s} (τ - τ_{p}^{0} (x_{s}))} \end{matrix}

(28)

holds. It can be observed that the approximate function carries the same envelope as

u_{e} (τ)

and extracts the TDOA information in

[Ω_{C} - Ω_{B}, Ω_{C} + Ω_{B}]

.

Note that

R_{p}^{C F} (τ, Ω_{C})

is equivalent to a time-domain implementation of the sub-band GCC defined in [30]. Since the main goal is to obtain an equivalent GCC that matches the sufficient condition in Inequality (25), a lightweight approach is to average the envelopes of the filtered GCCs of multiple sub-bands in high SNR conditions. According to the power spectral density (PSD) of the source signal or other prior knowledge,

N_{q}

valid sub-bands can be selected with individual central frequency

Ω_{q}

. The final refined GCC is given by

R_{p}^{W R} (τ) = \frac{1}{N_{q}} \sum_{q} | R_{p}^{C F} (τ, Ω_{q}) | \approx | u_{e} (τ - τ_{p}^{0} (x_{s})) |,

which has a specific waveform function

R_{0} (τ) \approx | u_{e} (τ) |

. Furthermore, the improved spatial function is calculated as

P^{W R} (x) = \frac{1}{C_{M}^{2}} \sum_{p = 1}^{C_{M}^{2}} R_{p}^{W R} (τ_{p} (x)) = \frac{1}{C_{M}^{2} N_{q}} \sum_{p = 1}^{C_{M}^{2}} \sum_{q = 1}^{N_{q}} | R_{p}^{C F} (τ_{p} (x), Ω_{q}) | .

(29)
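The refinement step can be sketched as a time-domain convolution of the GCC-PHAT with Gaussian-windowed complex exponentials, followed by averaging the envelopes over the sub-bands. This is a minimal sketch under stated assumptions: the function name `refine_gcc`, the kernel truncation at four Gaussian widths, the sub-band centers, and the synthetic input are all illustrative, and lags are measured in samples so that frequencies are in rad/sample.

```python
import numpy as np

def refine_gcc(r_phat, omega_q_list, omega_d):
    """Average envelope of the sub-band filtered GCC-PHAT (frequencies in rad/sample)."""
    half = int(np.ceil(4.0 / omega_d))           # truncate where u_e has decayed to e^-16
    n = np.arange(-half, half + 1)
    u_e = np.exp(-(omega_d * n) ** 2)            # Gaussian window on the lag axis
    out = np.zeros(len(r_phat))
    for omega_q in omega_q_list:
        psi = u_e * np.exp(1j * omega_q * n)     # complex Morlet-type kernel
        out += np.abs(np.convolve(r_phat, psi, mode="same"))
    return out / len(omega_q_list)

# Synthetic band-pass GCC-PHAT, Eq. (21), whose true TDOA is 40 samples.
lags = np.arange(-1000, 1001)
omega_c, omega_b, true_lag = 0.5, 0.2, 40
shift = lags - true_lag
r_phat = np.sinc(omega_b * shift / np.pi) * np.cos(omega_c * shift)

r_wr = refine_gcc(r_phat, [0.4, 0.5, 0.6], omega_d=0.02)
peak_lag = lags[np.argmax(r_wr)]
```

In this synthetic case the refined waveform keeps its peak at the true lag while staying close to the peak value ten samples away, where the raw band-pass GCC-PHAT has already dropped to a small value.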

The selection of

u_{e} (τ)

has a significant influence on the refinement of the GCC. Its envelope

| u_{e} (τ) |

provides the waveform function of the refined GCCs. The envelope of a suitable

u_{e} (τ)

should have no side-lobes, i.e.,

|u_{e} (τ_{1})| > |u_{e} (τ_{2})| \geq 0

for all

|τ_{1}| < |τ_{2}|

. Meanwhile, each

U_{e} (Ω - Ω_{q})

in the frequency domain serves as a band-pass filter, thus the spectral distribution of

U_{e} (Ω)

should be concentrated to satisfy Inequality (27). The Gaussian function

u_{e} (τ) = e^{- {(Ω_{d} F_{s} τ)}^{2}}

(30)

possesses the required properties both in the time domain and in the frequency domain. Then the corresponding complex filtering function

ψ_{e} (τ, Ω_{C})

can be regarded as a complex Morlet wavelet. According to (25), for a given grid distance

d_{g}

and steering TDOA uncertainty level

Δ τ_{m a x}

, the parameter

Ω_{d}

can be given by

Ω_{d} = v_{s} \sqrt{- ln α} / [F_{s} (d_{g} \sqrt{N} + v_{s} Δ τ_{m a x})],

(31)

where N is the space dimension,

α

is the threshold value, which usually can be set as

α = 0.5

. Substituting Equation (30) into Inequality (27) and dividing (27) by its right-hand side yields

[\int_{- \infty}^{\infty} e^{- {(\frac{Ω}{2 Ω_{d}})}^{4}} d Ω - \int_{- Ω_{B}}^{Ω_{B}} e^{- {(\frac{Ω}{2 Ω_{d}})}^{4}} d Ω] / (\int_{- Ω_{B}}^{Ω_{B}} e^{- {(\frac{Ω}{2 Ω_{d}})}^{4}} d Ω) ≪ 1 .

Thus, the relation of

Ω_{d}

and

Ω_{B}

can be obtained by the following equivalent equation:

2 Ω_{d} [\int_{0}^{\infty} e^{- Ω^{4}} d Ω - \int_{0}^{Ω_{B}} e^{- Ω^{4}} d Ω] / (\int_{0}^{Ω_{B}} e^{- Ω^{4}} d Ω) = c,

where c is an extremely small number. Then, it can be obtained that

Ω_{B} = 2 c_{e} Ω_{d},

(32)

where

c_{e}

is the positive solution of the following equation:

x E_{\frac{3}{4}} (x^{4}) = \frac{4 c}{1 + c} Γ (\frac{5}{4}),

where

E_{n} (x) = \int_{1}^{+ \infty} \frac{e^{- x t}}{t^{n}} d t, (x > 0)

and

Γ (x) = \int_{0}^{+ \infty} t^{x - 1} e^{- t} d t, (x > 0)

. When c is set as 0.001 (−30 dB),

c_{e}

in Equation (32) can be obtained as 2.89.
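The parameter choice can be sketched as follows. The width Ω_d is obtained here by solving u_e(T) = α at the half-width T = d_g √N / v_s + Δτ_max required by Inequality (25), which is the reading of Equation (31) assumed in this sketch. The sampling rate, grid distance, and uncertainty level are hypothetical values, and c_e = 2.89 is taken from the text.

```python
import numpy as np

def gaussian_width(d_g, dtau_max, fs, v_s=345.0, N=2, alpha=0.5):
    """Omega_d (rad/sample) such that u_e(T) = alpha at T = d_g*sqrt(N)/v_s + dtau_max."""
    t_half = d_g * np.sqrt(N) / v_s + dtau_max
    return np.sqrt(-np.log(alpha)) / (fs * t_half)

fs, d_g, dtau_max = 8000.0, 2.0, 2e-3            # hypothetical settings
omega_d = gaussian_width(d_g, dtau_max, fs)
omega_b = 2 * 2.89 * omega_d                     # Eq. (32) with c_e = 2.89 from the text

# Sanity check: the Gaussian of Eq. (30) equals alpha at the required half-width.
t_half = d_g * np.sqrt(2) / 345.0 + dtau_max
u_at_t_half = np.exp(-(omega_d * fs * t_half) ** 2)
```

A coarser grid or a larger uncertainty level lowers Ω_d, which widens the Gaussian envelope in lag and narrows each sub-band in frequency.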

A simulation is performed to illustrate the effect of the GCC waveform refinement procedure on on-grid SRP-based source localization. As shown in Figure 4, the dot-dashed box shows the range of TDOA within the volume of the nearest grid

x_{g}

, the dashed line with “

Δ

” shows the real TDOA, which should coincide with the peak of the GCC; the dotted line with “∇” marks

R_{p} (τ_{p} (x_{g}))

, corresponding to the nearest grid

x_{g}

. The

R_{p} (τ_{p} (x_{g}))

of the traditional GCC-PHAT is small, thus leading to poor performance in grid searching. In contrast, the proposed refining method generates a smooth waveform and high values throughout the TDOA region indicated by the box in the figure.

The modified algorithm with the GCC refinement procedure is shown in Algorithm 1, in which

u_{e} (τ) = e^{- {(Ω_{d} F_{s} τ)}^{2}}

is taken as the target waveform function.
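The role of Ω_{d} in the target waveform can be illustrated with a short sketch. It assumes the half-width requirement from Propositions A2 and A3, T_{R}(α) ≥ 2r/v_{s} + Δτ_{max} with r = d_{g}√N/2, and picks the Ω_{d} for which u_{e} equals α exactly at that half-width. This groups F_{s} over both terms of the half-width; treat it as an assumed, units-consistent reading of Equation (31) rather than a restatement of it. The function names and the example values are hypothetical.

```python
import math

def omega_d_from_threshold(alpha, d_g, n_dim, v_s, f_s, dtau_max):
    """Omega_d for which u_e equals alpha at the required half-width.

    Required half-width in seconds (assumption stated in the lead-in):
    T = d_g * sqrt(n_dim) / v_s + dtau_max.
    """
    half_width = d_g * math.sqrt(n_dim) / v_s + dtau_max
    return math.sqrt(-math.log(alpha)) / (f_s * half_width)

def u_e(tau, omega_d, f_s):
    """Gaussian target waveform u_e(tau) = exp(-(Omega_d * F_s * tau)**2)."""
    return math.exp(-(omega_d * f_s * tau) ** 2)

# Example values, chosen only for illustration.
alpha, d_g, n_dim = 0.5, 10.0, 2
v_s, f_s, dtau_max = 345.0, 10_000.0, 0.010

omega_d = omega_d_from_threshold(alpha, d_g, n_dim, v_s, f_s, dtau_max)
half_width = d_g * math.sqrt(n_dim) / v_s + dtau_max
print(omega_d, u_e(half_width, omega_d, f_s))
```

A smaller Ω_{d} widens the waveform, so coarser grids or larger steering uncertainties call for a broader target waveform.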

Algorithm 1: SRP with the waveform refinement procedure

Parameter Setting
(1) Set the maximum steering TDOA error

Δ τ_{m a x} = Δ τ_{m a x}^{C} + Δ τ_{m a x}^{S}

, where the sub-items

Δ τ_{m a x}^{C}

and

Δ τ_{m a x}^{S}

are determined by the wind and the synchronization error of sensors, respectively.
(2) Set the grid distance

d_{g}

and searching region

V

that meet the system requirement. Then the searching grid set

X_{g}

is generated.
(3) Set the waveform function

u_{e} (τ) = e^{- {(Ω_{d} F_{s} τ)}^{2}}

and

α =

0.5.
(4) Set

c =

0.001 and compute the bandwidth

Ω_{B}

using Equation (32).
Band Selection
(1) Set up the passband

[Ω_{L}, Ω_{U}]

.
(2) Select the

N_{q}

bands with the highest source PSD, or divide the passband uniformly.
Source Localization
(1) Calculate the refinement waveform (WR)-SRP function

P^{W R} (x)

by Equation (29) at all

x \in X_{g}

.
(2) Estimate the source location

{\hat{x}}_{s}

by Equation (11).
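The final grid-search step of Algorithm 1 can be sketched as follows. This is not the authors' implementation: the refined GCC of each sensor pair is replaced by an idealized Gaussian of the TDOA mismatch, no steering uncertainty is injected, and the sensor layout, source position, `width`, and all function names are assumptions made for illustration.

```python
import itertools
import math

V_S = 345.0  # sound speed in m/s (the value used in the simulations)

def tdoa(x, z_m, z_l):
    """Steering TDOA (seconds) of point x for the sensor pair (z_m, z_l)."""
    return (math.dist(x, z_m) - math.dist(x, z_l)) / V_S

def srp_grid_search(sensors, source, d_g, side, width):
    """Return the grid point that maximizes an idealized SRP function.

    Each pair's GCC is modeled as exp(-((tau - tau_true) / width)**2),
    a stand-in for the refined GCC; `width` is in seconds.
    """
    pairs = list(itertools.combinations(sensors, 2))
    true_tdoas = [tdoa(source, z_m, z_l) for z_m, z_l in pairs]
    n_points = int(round(side / d_g)) + 1
    best_point, best_power = None, -1.0
    for i in range(n_points):
        for j in range(n_points):
            x = (i * d_g, j * d_g)
            power = sum(
                math.exp(-((tdoa(x, z_m, z_l) - t0) / width) ** 2)
                for (z_m, z_l), t0 in zip(pairs, true_tdoas)
            ) / len(pairs)
            if power > best_power:
                best_point, best_power = x, power
    return best_point, best_power

# Hypothetical layout: eight sensors on the border of a 200 m x 200 m area.
sensors = [(0, 0), (100, 0), (200, 0), (200, 100),
           (200, 200), (100, 200), (0, 200), (0, 100)]
source = (63.0, 118.0)
estimate, power = srp_grid_search(sensors, source, d_g=10.0, side=200.0, width=0.05)
print(estimate, round(power, 3))
```

Because the modeled GCC is broad compared with the TDOA change across one grid cell, the SRP value at grid points near the off-grid source stays high, which is the behavior the refinement procedure aims for.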

4. Experiment Results and Discussion

4.1. Numerical Simulations

In this section, we use Monte Carlo simulations to analyze the efficiency of the proposed SRP-based localization method (the SRP functional with the refinement waveform, referred to as WR). It is compared with the traditional SRP functional with GCC-PHAT (PS), the SRP functional with the envelope of GCC-PHAT (PES), which is designed for acoustic band-pass signals [21], the modified-SRP (M-SRP) functional with GCC-PHAT (PM) [18], in which the grid resolution is considered, and the M-SRP functional with the envelope of GCC-PHAT (PEM), in which both the band-pass effect and the grid resolution are considered.

In this setup, M = 8 sensors and one source are randomly deployed in a monitored area of 200 m by 200 m. The propagation model is a line-of-sight path with a constant sound speed of 345 m/s. The input GCCs are generated by the waveform function in Equation (21) with a passband of

[0.15 π, 0.4 π]

. The steering TDOA uncertainty

Δ τ_{p} (x)

is uniformly distributed over

[- Δ τ_{m a x}, Δ τ_{m a x}]

, where

Δ τ_{m a x}

is the maximal time uncertainty dependent on the sound-propagation model error and the synchronization error.

We consider four different conditions in WASNs to test the algorithms: (a) a small steering TDOA uncertainty and small grid distance (STSG) condition with

Δ τ_{m a x} = 0.1

ms,

d_{g} = 0.1

m, (b) a large steering TDOA uncertainty and small grid distance (LTSG) condition with

Δ τ_{m a x} = 100

ms,

d_{g} = 0.1

m, (c) a small steering TDOA uncertainty and large grid distance (STLG) condition with

Δ τ_{m a x} = 0.1

ms,

d_{g} = 10

m, (d) a large steering TDOA uncertainty and large grid distance (LTLG) condition with

Δ τ_{m a x} = 100

ms,

d_{g} = 10

m.

The mean absolute error (MAE)

E \{∥ {\hat{x}}_{s} - x_{s} ∥\}

of distance and the cumulative distribution function (CDF) of the relative distance error are calculated to evaluate the accuracy and robustness of these algorithms, where the relative distance is the error normalized by the grid distance, i.e.,

F (e_{u}) = P \{∥ {\hat{x}}_{s} - x_{s} ∥ / d_{g} \leq e_{u}\},

(33)

where

e_{u}

is the relative positioning error threshold, which is determined by the system requirement. Specifically, the 95th percentile of the localization error in meters is computed as

F^{- 1} (0.95) \cdot d_{g}

.
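The three metrics can be computed from a list of per-trial errors as in the sketch below. The helper names are hypothetical, and the percentile uses the simplest empirical definition (the smallest observed error whose empirical CDF reaches q); the source does not state which estimator was used.

```python
import math

def mae(estimates, truths):
    """Mean absolute (Euclidean) localization error in meters."""
    errors = [math.dist(e, t) for e, t in zip(estimates, truths)]
    return sum(errors) / len(errors)

def empirical_cdf(errors, d_g, e_u):
    """Fraction of trials with error / d_g <= e_u, as in Equation (33)."""
    return sum(1 for err in errors if err / d_g <= e_u) / len(errors)

def percentile_error(errors, q=0.95):
    """Smallest observed error (meters) whose empirical CDF reaches q."""
    ordered = sorted(errors)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]

# Toy data: two trials with errors of 0 m and 5 m.
estimates = [(0.0, 0.0), (3.0, 4.0)]
truths = [(0.0, 0.0), (0.0, 0.0)]
errors = [math.dist(e, t) for e, t in zip(estimates, truths)]
print(mae(estimates, truths), empirical_cdf(errors, d_g=10.0, e_u=0.25))
```

Normalizing by d_{g} in the CDF lets curves obtained with different grid distances share one axis, while the percentile is reported in meters.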

The MAE and 95th percentile results are listed in Table 1. All the localization algorithms achieve their best estimation accuracy in the STSG condition, in which the defocus effect and the undersampled effect are slight. When the steering TDOA uncertainty or the grid distance increases, the MAE increases. Compared with the PS, PES, PM, and PEM methods, the WR attains the smallest or nearly the smallest MAE in every condition because all these factors are taken into account. The 95th percentile follows the same trend as the MAE, which indicates that the proposed WR method has a stable localization performance in outdoor conditions.

Figure 5a–d depict the CDF of each algorithm in the range

e_{u} \in [0.5, 100 m / d_{g}]

under the four conditions. In the most favorable condition (STSG), the CDF curves rise rapidly with the location error, and the estimation errors are the smallest for all the algorithms. The CDF curves move down as the grid distance

d_{g}

and steering TDOA uncertainty

Δ τ_{m a x}

increase, as in the LTSG, STLG, and LTLG conditions. Since the steering TDOA uncertainty is not considered in PES and PEM, their CDF curves drop less in the STLG condition than in the LTSG condition. Among these localization algorithms, the CDF of the WR is the highest or very close to the highest (STLG), and the PEM method is better than the PS, PES, and PM. The proposed WR method remains robust even as the conditions become severe.

Furthermore, Figure 6 presents the MAE in four situations: (a) fixed small steering TDOA uncertainty (ST) with

Δ τ_{m a x}

= 0.1 ms,

d_{g}

ranges from 0.1 m to 50 m; (b) fixed large steering TDOA uncertainty level (LT) with

Δ τ_{m a x}

= 100 ms,

d_{g}

ranges from 0.1 m to 50 m; (c) fixed small grid distance (SG) with

d_{g}

= 0.1 m,

Δ τ_{m a x}

ranges from 0.1 ms to 100 ms; (d) fixed large grid distance (LG) with

d_{g}

= 10 m,

Δ τ_{m a x}

ranges from 0.1 ms to 100 ms. The MAE increases with

d_{g}

or

Δ τ_{m a x}

significantly, which indicates that the steering TDOA uncertainty and the grid distance strongly affect the performance of source localization. In each situation, the PS and PM produce a larger MAE than the other algorithms when

d_{g}

and

Δ τ_{m a x}

are small because they are not designed for band-pass signals. Since the scalable grid sampling and steering TDOA uncertainty are not considered in the PES, it shows reliable performance only when

d_{g} \leq 1

m and

Δ τ_{m a x} \leq 1

ms. The PEM considers both the grid size and the band-pass effect; thus, it achieves the best performance in the small

Δ τ_{m a x}

case. However, its MAE degrades when the influence of the steering TDOA uncertainty outweighs that of the grid size. The WR obtains an MAE close to that of the PEM when

Δ τ_{m a x}

is small. Moreover, its MAE is the smallest in all the other situations. These results demonstrate its robustness.

4.2. Field Experiment

In this experiment, seven nodes are distributed in a park, as shown in Figure 7a,b. Each node consists of a microphone sensor, a Wi-Fi module, and a GPS module for self-localization and time calibration. The monitoring area is again 200 m × 200 m and includes a hillock. A portable speaker plays three types of sound signals at 12 positions inside the area: a Gaussian signal (S-G), vehicle whistles (S-V) representing an urban source, and birdsong (S-B) representing a field source. The temperature was approximately 30 °C, and the wind speed was below 3 m/s. Therefore, in the proposed method

Δ τ_{m a x}

is set to 10 ms, which accounts for both the self-localization error of the sensors and the effect of wind.
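A rough order-of-magnitude check on this choice can be sketched as below. It is not taken from the paper: it assumes the common linear approximation 331.3 + 0.606·T m/s for the speed of sound in air, a steady head wind along the whole path, and the area diagonal as the longest propagation path. All names are hypothetical.

```python
import math

def sound_speed(temp_c):
    """Linear approximation of the speed of sound in air (m/s) at temp_c degrees Celsius."""
    return 331.3 + 0.606 * temp_c

def wind_delay_error(path_m, wind_mps, temp_c):
    """Extra travel time (s) over path_m when sound travels against a head wind."""
    v = sound_speed(temp_c)
    return path_m / (v - wind_mps) - path_m / v

diagonal = 200.0 * math.sqrt(2.0)  # longest path inside a 200 m x 200 m area
error_s = wind_delay_error(diagonal, wind_mps=3.0, temp_c=30.0)
print(f"{error_s * 1e3:.1f} ms")
```

Under these assumptions the wind alone contributes about 7 ms on the longest path, which leaves some margin within 10 ms for other error sources.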

The sampling frequency is 10,000 Hz. Figure 7c shows the PSDs of both the background noise and the received source signals, which are obtained with the Burg method using a model order of 50 and an FFT length of 2048. The PSDs of the source signals are collected at about 30 m from the speaker. Because the environmental noise is mainly distributed in the frequency bands below 1500 Hz, the passband is set to (1500 Hz, 3500 Hz) for all sources. The estimated SNRs are shown in Figure 7d, where the SNRs of the full band (0, 5000 Hz) and of the passband (1500 Hz, 3500 Hz) are plotted as solid and dashed lines, respectively. For the three source types, the SNR is improved by 20 dB to 30 dB.

The recorded data are divided into 1242 two-second audio frames. SRP algorithms with full-band and band-pass cross-correlation (referred to as CSF and CSB) are added to assess the necessity of band-pass processing. The PS and PM are not included since they proved unreliable in the simulation. The candidate SRP-based locators compared in this subsection are therefore: (1) SRP with full-band GCC (CSF), (2) SRP with band-pass GCC (CSB), (3) SRP with the envelope of band-pass GCC-PHAT (PES), (4) M-SRP with the envelope of band-pass GCC-PHAT (PEM), and (5) WR-SRP with band-pass GCC (WR). A well-known TDOA-based localization method [13] (referred to as TC), in which the TDOAs are obtained from band-pass GCC-PHATs, is also included as a reference.

The MAE and the 95th percentile of the localization errors of the TC method and the SRP-based methods with different grid distances (

d_{g} \in \{0.1, 1, 10\}

m) are listed in Table 2. Moreover, the MAEs with grid distance

d_{g}

ranging from 0.1 m to 50 m are presented in Figure 8a. Figure 8b–d give the CDF curves at the three grid distances (

d_{g} \in \{0.1, 1, 10\}

m).

As in the simulation, the MAEs increase and the CDF curves move down as the grid distance increases. The MAE of the TC method is the highest because some sensor pairs might produce severely erroneous TDOA measurements in noisy acoustic environments. Its CDF curve also shows that the solution is not stable. Comparing the results of CSF and CSB shows that the band-pass GCC significantly improves the SNR and the localization performance. The PES and PEM yield larger localization errors than the WR and lack robustness, which indicates that the influence of the steering TDOA uncertainty is substantial. The proposed WR method achieves the best estimation at all the grid distances, which verifies its effectiveness.

5. Conclusions

In this work, a novel and robust steered response power (SRP)-based source localization approach is proposed to localize band-pass sources in outdoor WASNs with steering time delay uncertainty and coarse spatial grids. The robustness of on-grid source localization is analyzed through a sufficient condition, which relates the GCC waveform to the on-grid localization error. A band-pass GCC refinement procedure is designed to meet the sufficient condition and thereby enhance the on-grid source localization performance. The Monte Carlo simulations and the field experiment show that the proposed method is robust in outdoor WASN scenarios compared with several state-of-the-art SRP-based methods.

Author Contributions

Conceptualization, methodology, programming, writing—original draft preparation, Y.H.; conceptualization, writing—review and editing, data curation, J.T.; writing—review and editing, X.H.; supervision, M.B. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China under Grants 11774379 and 61501448, and by the Youth Innovation Promotion Association.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Publicly available datasets were analyzed in this study. These data can be found here: https://1drv.ms/u/s!AskSoQGpB3VUgfIqsxtYhosVrGyzOg?e=pnfutC.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SRP	steered response power
TDOA	time difference of arrival
DOA	direction of arrival
GCC	generalized cross-correlation
PHAT	phase transform
CDF	cumulative distribution function
GPS	Global Positioning System
FFT	Fast Fourier Transform

Appendix A

Appendix A.1

Proposition A1.

If

M (α, x_{s}) \cap X (d_{g}, x_{g}^{o}) \neq \emptyset

and

M (α, x_{s})

is a bounded set (i.e., there exists a

ε_{M} \in (0, \infty)

such that

∥ x_{1} - x_{2} ∥ \leq ε_{M}

for all

x_{1}, x_{2} \in M (α, x_{s})

), then Inequality (18) will be satisfied.

Proof of Proposition A1.

For an arbitrary

x_{g}^{o}

, if

M (α, x_{s}) \cap X (d_{g}, x_{g}^{o}) \neq \emptyset

, there exists an

x_{a}

such that

x_{a} \in M (α, x_{s}) \cap X (d_{g}, x_{g}^{o})

. Let

{\hat{x}}_{s}^{g}

be the estimated result from Equation (17). Then

F_{E} ({\hat{x}}_{s}^{g}, x_{s}) \geq F_{E} (x_{a}, x_{s}) \geq α .

According to the definition of

M (α, x_{s})

,

{\hat{x}}_{s}^{g} \in M (α, x_{s})

holds. Since

M (α, x_{s})

is a bounded set,

∥ x_{a} ∥ < \infty

. Then

∥ x_{s} - x_{a} ∥

is finite. Let

ε_{M} \in (0, \infty)

be a bound of

M (α, x_{s})

and let

ε = ε_{M} + ∥ x_{s} - x_{a} ∥ \in (0, \infty)

. Then

∥ {\hat{x}}_{s}^{g} - x_{s} ∥ \leq ∥ {\hat{x}}_{s}^{g} - x_{a} ∥ + ∥ x_{a} - x_{s} ∥ \leq ε .

□

Appendix A.2

Proposition A2.

If a closed ball

B^{N} (x_{o}, r)

satisfies

r \geq d_{g} \sqrt{N} / 2

, then for all

x_{g}^{o} \in R^{N}

,

B^{N} (x_{o}, r) ⋂ X (d_{g}, x_{g}^{o}) \neq \emptyset

holds.

Proof of Proposition A2.

Let

B^{N} (x_{o}, r)

be a closed ball with center

x_{o}

and radius r. For an arbitrary

x_{g}^{o} \in R^{N}

, the vector from

x_{o}

to

x_{g}^{o}

is denoted as

Δ x^{o} = x_{o} - x_{g}^{o} = {[Δ x_{1}^{o}, \dots, Δ x_{N}^{o}]}^{T} .

Given

d_{g} \in R^{+}

, define

n_{k}^{o} = 〈 \frac{Δ x_{k}^{o}}{d_{g}} 〉

(k = 1,...,N), where “

〈 . 〉

” denotes rounding to the nearest integer. Therefore, we can find the grid point

x_{g}^{n} = x_{g}^{o} + {[n_{1}^{o} d_{g}, \dots, n_{N}^{o} d_{g}]}^{T} \in X (d_{g}, x_{g}^{o})

, so that

x_{o} - x_{g}^{n} = {[Δ x_{1}^{o} - n_{1}^{o} d_{g}, \dots, Δ x_{N}^{o} - n_{N}^{o} d_{g}]}^{T}

. Since each component of this vector has magnitude at most d_{g}/2, the distance satisfies

∥ x_{g}^{n} - x_{o} ∥ \leq \sqrt{\sum_{i = 1}^{N} {(\frac{d_{g}}{2})}^{2}} = \frac{\sqrt{N} d_{g}}{2} .

Thus, if

r \geq \sqrt{N} d_{g} / 2

, then

x_{g}^{n} \in B^{N} (x_{o}, r)

. Hence,

X (d_{g}, x_{g}^{o}) \cap B^{N} (x_{o}, r) \neq \emptyset

holds. □
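The rounding construction in this proof is easy to check numerically. The sketch below is a hypothetical illustration: it draws random ball centers, rounds each coordinate offset to the nearest multiple of d_{g}, and records the largest distance to the resulting grid point. The grid origin, dimension, and sampling range are arbitrary choices.

```python
import math
import random

def nearest_grid_point(x_o, x_g_o, d_g):
    """Round each coordinate offset from the grid origin to a multiple of d_g."""
    return tuple(
        g + round((x - g) / d_g) * d_g
        for x, g in zip(x_o, x_g_o)
    )

random.seed(0)
d_g, n_dim = 10.0, 3
grid_origin = (1.3, -4.2, 7.7)  # arbitrary grid origin x_g^o

worst = 0.0
for _ in range(10_000):
    x_o = tuple(random.uniform(-500.0, 500.0) for _ in range(n_dim))
    x_g_n = nearest_grid_point(x_o, grid_origin, d_g)
    worst = max(worst, math.dist(x_o, x_g_n))

bound = math.sqrt(n_dim) * d_g / 2.0
print(worst <= bound)
```

The worst case approaches the bound only when x_{o} sits near the center of a grid cell, which is why the bound scales with √N.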

Appendix A.3

Proposition A3.

If the waveform function

R_{0} (τ)

satisfies

T_{R} (α) \geq 2 r / v_{s} + Δ τ_{m a x}

, then

B^{N} (x_{s}, r) \subset M (α, x_{s})

.

Proof of Proposition A3.

Based on Equation (4), it follows that

\begin{matrix} | τ_{p} (x) - τ_{p} (x_{s}) | & = | η_{m} (x) - η_{l} (x) - η_{m} (x_{s}) + η_{l} (x_{s}) | \\ & \leq | η_{m} (x) - η_{m} (x_{s}) | + | η_{l} (x_{s}) - η_{l} (x) | \\ & = \frac{| ∥ x - z_{m} ∥ - ∥ x_{s} - z_{m} ∥ | + | ∥ x - z_{l} ∥ - ∥ x_{s} - z_{l} ∥ |}{v_{s}} \\ & \leq 2 ∥ x - x_{s} ∥ / v_{s} . \end{matrix}

Given the steering TDOA uncertainty level

Δ τ_{m a x}

, for each

x \in B^{N} (x_{s}, r)

, the steering TDOA function

τ_{p} (x)

satisfies

\begin{matrix} | τ_{p} (x) - τ_{p}^{0} (x_{s}) | & = | τ_{p} (x) - τ_{p} (x_{s}) + Δ τ_{p} (x_{s}) | \\ & \leq | τ_{p} (x) - τ_{p} (x_{s}) | + | Δ τ_{p} (x_{s}) | \\ & \leq 2 ∥ x - x_{s} ∥ / v_{s} + Δ τ_{m a x} \\ & \leq 2 r / v_{s} + Δ τ_{m a x} . \end{matrix}

Since

T_{R} (α) \geq 2 r / v_{s} + Δ τ_{m a x}

, according to Equation (22), it follows that

R_{p} (τ_{p} (x)) = R_{0} (τ_{p} (x) - τ_{p}^{0} (x_{s})) \geq α

holds for all

c_{p}

. According to Equation (15), then for every

x \in B^{N} (x_{s}, r)

, the inequality

F_{E} (x, x_{s}) \geq α

holds. According to Equation (19),

B^{N} (x_{s}, r) \subseteq M (α, x_{s})

holds. □
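The first chain of inequalities in this proof, |τ_{p}(x) − τ_{p}(x_{s})| ≤ 2∥x − x_{s}∥/v_{s}, can be spot-checked with random geometry. The sketch below is hypothetical: the sampling ranges and function names are arbitrary, and the sign convention follows τ_{p}(x) = η_{m}(x) − η_{l}(x) as used above.

```python
import math
import random

V_S = 345.0  # sound speed in m/s (assumed constant)

def tdoa(x, z_m, z_l):
    """tau_p(x) = (||x - z_m|| - ||x - z_l||) / v_s for the sensor pair (m, l)."""
    return (math.dist(x, z_m) - math.dist(x, z_l)) / V_S

def random_point(n_dim, span=200.0):
    return tuple(random.uniform(0.0, span) for _ in range(n_dim))

random.seed(1)
violations = 0
for _ in range(10_000):
    z_m, z_l = random_point(2), random_point(2)
    x, x_s = random_point(2), random_point(2)
    lhs = abs(tdoa(x, z_m, z_l) - tdoa(x_s, z_m, z_l))
    rhs = 2.0 * math.dist(x, x_s) / V_S
    if lhs > rhs + 1e-12:  # small slack for floating-point rounding
        violations += 1
print(violations)
```

The bound follows from applying the reverse triangle inequality once per sensor, so it holds for any sensor positions.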

Appendix A.4

Proposition A4.

If every two distinct pairs of sensors

c_{i} = {i_{l}, i_{m}}

,

c_{j} = {j_{l}, j_{m}}

in the WASN satisfy that

\forall τ_{i}^{c} \in [- ∥ z_{i_{l}} - z_{i_{m}} ∥, ∥ z_{i_{l}} - z_{i_{m}} ∥] / v_{s}

and

\forall τ_{j}^{c} \in [- ∥ z_{j_{l}} - z_{j_{m}} ∥, ∥ z_{j_{l}} - z_{j_{m}} ∥] / v_{s}

,

Λ_{i} (τ_{i}^{c}, 0) ⊈ Λ_{j} (τ_{j}^{c}, 0)

and

Λ_{i} (τ_{i}^{c}, 0) ⊉ Λ_{j} (τ_{j}^{c}, 0)

, then

max_{{∥ x ∥ = + \infty, ∥ x_{s} ∥ < + \infty}} \{F_{E} (x, x_{s})\} \leq \frac{C_{N}^{2} a_{m} + (C_{M}^{2} - C_{N}^{2}) a_{s}}{C_{M}^{2}}

holds.

Proof of Proposition A4.

For a spatial point

x

such that

∥ x ∥ = \infty

, let

K \in N

be the total number of sensor pairs

c_{p}

such that

x \in Λ_{p} (τ_{p}^{0} (x_{s}), T_{R} (a_{s}))

. According to Equation (15) and Inequality (22), it follows that

F_{E} (x, x_{s}) \leq \frac{K a_{m} + (C_{M}^{2} - K) a_{s}}{C_{M}^{2}} .

(A1)

If

K \geq C_{N}^{2} + 1

, there exists a collection of N linearly independent sensor pairs among those

(C_{N}^{2} + 1)

sensor pairs. Without loss of generality, denote this collection as

{c_{1}, \dots, c_{N}}

. Then for each

x_{d} \in ⋂_{p = 1}^{N} Λ_{p} (τ_{p}^{0} (x_{s}), T_{R} (a_{s}))

, there exists an equation set such that:

\{\begin{matrix} τ_{1} (x_{d}) = τ_{1}^{c}, \\ τ_{2} (x_{d}) = τ_{2}^{c}, \\ \dots \\ τ_{N} (x_{d}) = τ_{N}^{c}, \end{matrix}

where

τ_{p}^{c} \in [τ_{p}^{0} (x_{s}) - T_{R} (a_{s}), τ_{p}^{0} (x_{s}) + T_{R} (a_{s})]

. According to the condition of Proposition A4, and since the sensor pairs are all linearly independent, these N equations are linearly independent. Then it holds that

∥ x_{d} ∥ \neq \infty

which is in contradiction with

∥ x ∥ = \infty

. Thus

K \leq C_{N}^{2}

. According to Inequality (A1), it is easily obtained that

F_{E} (x, x_{s}) \leq (C_{N}^{2} a_{m} + (C_{M}^{2} - C_{N}^{2}) a_{s}) / C_{M}^{2}

. □

References

1. Ajdler, T.; Kozintsev, I.; Lienhart, R.; Vetterli, M. Acoustic Source Localization in Distributed Sensor Networks. In Proceedings of the Conference Record of the Thirty-Eighth Asilomar Conference on Signals, Systems and Computers, 2004, Pacific Grove, CA, USA, 7–10 November 2004; Volume 2, pp. 1328–1332.
2. Liu, Y.; Hu, Y.H.; Pan, Q. Robust Maximum Likelihood Acoustic Source Localization in Wireless Sensor Networks. In Proceedings of the GLOBECOM 2009-2009 IEEE Global Telecommunications Conference, Honolulu, HI, USA, 30 November–4 December 2009; pp. 1–6.
3. Saric, Z.; Kukolj, D.; Teslic, N. Acoustic Source Localization in Wireless Sensor Network. Circuits Syst. Signal Process. 2010, 29, 837–856.
4. Kim, Y.; Ahn, J.; Cha, H. Locating acoustic events based on large-scale sensor networks. Sensors 2009, 9, 9925–9944.
5. Cobos, M.; Antonacci, F.; Alexandridis, A.; Mouchtaris, A.; Lee, B. A survey of sound source localization methods in wireless acoustic sensor networks. Wirel. Commun. Mob. Comput. 2017, 2017.
6. Sheng, X.; Hu, Y.H. Sequential acoustic energy based source localization using particle filter in a distributed sensor network. In Proceedings of the 2004 IEEE International Conference on Acoustics, Speech and Signal Processing, Montreal, QC, Canada, 17–21 May 2004; Volume 3, pp. 972–975.
7. Sheng, X.; Hu, Y.H. Maximum likelihood multiple-source localization using acoustic energy measurements with wireless sensor networks. IEEE Trans. Signal Process. 2005, 53, 44–53.
8. Meng, W.; Xiao, W. Energy-based acoustic source localization methods: A survey. Sensors 2017, 17, 376.
9. Chang, S.; Li, Y.; He, Y.; Wu, Y. RSS-based target localization in underwater acoustic sensor networks via convex relaxation. Sensors 2019, 19, 2323.
10. Berman, Z. A reliable maximum likelihood algorithm for bearing-only target motion analysis. In Proceedings of the 36th IEEE Conference on Decision and Control, San Diego, CA, USA, 12 December 1997; Volume 5, pp. 5012–5017.
11. Doğançay, K. Bearings-only target localization using total least squares. Signal Process. 2005, 85, 1695–1710.
12. Navidi, W.; Murphy, W.; Hereman, W. Statistical Methods in Surveying by Trilateration. Comput. Stat. Data Anal. 1998, 27, 209–227.
13. Chan, Y.; Ho, K. A Simple and Efficient Estimator for Hyperbolic Location. IEEE Trans. Signal Process. 1994, 42, 1905–1915.
14. Gillette, M.; Silverman, H. A Linear Closed-Form Algorithm for Source Localization From Time-Differences of Arrival. IEEE Signal Process. Lett. 2008, 15, 1–4.
15. Bordoy, J.; Schott, D.J.; Xie, J.; Bannoura, A.; Klein, P.; Striet, L.; Hoeflinger, F.; Haering, I.; Reindl, L.; Schindelhauer, C. Acoustic Indoor Localization Augmentation by Self-Calibration and Machine Learning. Sensors 2020, 20, 1177.
16. DiBiase, J.H.; Silverman, H.F.; Brandstein, M.S. Robust localization in reverberant rooms. In Microphone Arrays; Springer: Berlin/Heidelberg, Germany, 2001; pp. 157–180.
17. Do, H.; Silverman, H.; Yu, Y. A Real-Time SRP-PHAT Source Location Implementation using Stochastic Region Contraction (SRC) on a Large-Aperture Microphone Array. In Proceedings of the 2007 IEEE International Conference on Acoustics, Speech and Signal Processing—ICASSP ’07, Honolulu, HI, USA, 15–20 April 2007; Volume 1, pp. 121–124.
18. Cobos, M.; Marti, A.; Lopez, J.J. A Modified SRP-PHAT Functional for Robust Real-Time Sound Source Localization With Scalable Spatial Sampling. IEEE Signal Process. Lett. 2011, 18, 71–74.
19. Marti, A.; Cobos, M.; Lopez, J.; Escolano, J. A steered response power iterative method for high-accuracy acoustic source localization. J. Acoust. Soc. Am. 2013, 134, 2627–2630.
20. Traa, J.; Wingate, D.; Stein, N.; Smaragdis, P. Robust Source Localization and Enhancement With a Probabilistic Steered Response Power Model. IEEE/ACM Trans. Audio Speech Lang. Process. 2015, 24, 1.
21. Cobos, M.; Garcia-Pineda, M.; Arevalillo-Herráez, M. Steered Response Power Localization of Acoustic Pass-Band Signals. IEEE Signal Process. Lett. 2017, 24, 717–721.
22. Ritu; Dhull, S. Iterative Volumetric Reduction (IVR) Steered Response Power Method for Acoustic Source Localization. Int. J. Sens. Wirel. Commun. Control 2020, 10.
23. Knapp, C.; Carter, G. The Generalized Correlation Method for Estimation of Time Delay. IEEE Trans. Acoust. Speech Signal Process. 1976, 24, 320–327.
24. Brutti, A.; Omologo, M.; Svaizer, P. Speaker localization based on oriented global coherence field. In Proceedings of the Ninth International Conference on Spoken Language Processing, Pittsburgh, PA, USA, 17–21 September 2006.
25. Brutti, A.; Omologo, M.; Svaizer, P. Localization of multiple speakers based on a two step acoustic map analysis. In Proceedings of the 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, USA, 30 March–4 April 2008; pp. 4349–4352.
26. Salvati, D.; Drioli, C.; Foresti, G. Exploiting a geometrically sampled grid in the steered response power algorithm for localization improvement. J. Acoust. Soc. Am. 2017, 141, 586–601.
27. Zotkin, D.N.; Duraiswami, R. Accelerated speech source localization via a hierarchical search of steered response power. IEEE Trans. Speech Audio Process. 2004, 12, 499–508.
28. Khanal, S.; Silverman, H.F. Multi-stage rejection sampling (MSRS): A robust SRP-PHAT peak detection algorithm for localization of cocktail-party talkers. In Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, USA, 18–21 October 2015; pp. 1–5.
29. Nunes, L.O.; Martins, W.A.; Lima, M.V.; Biscainho, L.W.; Gonçalves, F.M.; Said, A.; Lee, B. A steered-response power algorithm employing hierarchical search for acoustic source localization using microphone arrays. IEEE Trans. Signal Process. 2014, 62, 5171–5183.
30. Cobos, M.; Antonacci, F.; Comanducci, L.; Sarti, A. Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 1270–1281.
31. Tian, Z.; Liu, W.; Ru, X. Multi-Target Localization and Tracking Using TDOA and AOA Measurements Based on Gibbs-GLMB Filtering. Sensors 2019, 19, 5437.
32. Kaplan, L.; Le, Q.; Molnár, N. Maximum likelihood methods for bearings-only target localization. In Proceedings of the 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, Salt Lake City, UT, USA, 7–11 May 2001; Volume 5, pp. 3001–3004.
33. Griffin, A.; Alexandridis, A.; Pavlidi, D.; Mastorakis, Y.; Mouchtaris, A. Localizing multiple audio sources in a wireless acoustic sensor network. Signal Process. 2015, 107, 54–67.

Figure 1. Comparison of steered response power (SRP)-based source localization in the ideal case and under the unexpected effects (the symbols “o” and “+” represent the source position and the estimated position, respectively): (a) SRP map (3D view); (b) ideal SRP map (2D view); (c) defocus effect from steering time uncertainties; (d) undersampled effect from coarse grid; (e) rippling effect from band-pass generalized cross-correlations (GCCs); (f) combined effect.

Figure 2. Illustration of the level-pass area

M (α, x_{s})

. (Orange:

M (0.3, x_{s})

; yellow green:

M (0.2, x_{s})

; celeste:

M (0.1, x_{s})

).

Figure 3. An example of

R_{0} (τ)

.

Figure 4. An example of refined GCC from field data: (a) GCC-Phase Transform (PHAT); (b) refined GCC.

Figure 5. Simulation comparison in the cumulative distribution function (CDF) of relative distance error. (a) small steering time difference of arrival (TDOA) uncertainty and small grid distance (STSG); (b) large steering TDOA uncertainty and small grid distance (LTSG); (c) small steering TDOA uncertainty and large grid distance (STLG); (d) large steering TDOA uncertainty and large grid distance (LTLG).

Figure 6. The mean absolute errors (MAEs) under different conditions. (a) small steering TDOA uncertainty (ST) (

Δ τ_{m a x}

= 0.1 ms,

d_{g} \in [0.1

m,

50

m]); (b) large steering TDOA uncertainty level (LT) (

Δ τ_{m a x}

= 100 ms,

d_{g} \in [0.1

m,

50

m]); (c) small grid distance (SG) (

d_{g}

= 0.1 m,

Δ τ_{m a x} \in [0.1

ms,

100

ms]); (d) large grid distance (LG) (

d_{g}

= 10 m,

Δ τ_{m a x} \in

[0.1 ms, 100 ms]).

Figure 7. Setup of the field experiment: (a) device; (b) distribution; (c) estimated power spectral density of the sensor signal 30 m away from the source; (d) estimated signal-to-noise ratio.

Figure 8. Experiment results: (a) MAE comparison; (b) CDF of relative error at

d_{g} = 0.1

m; (c) CDF of relative error at

d_{g} = 1

m; (d) CDF of relative error at

d_{g} = 10

m.

Table 1. Mean absolute error (MAE) and 95th percentile under different conditions in the simulation.

MAE (m)
Condition	PS	PES	PM	PEM	WR
STSG	0.81	0.07	1.01	0.07	0.06
LTSG	44.53	29.27	52.04	36.37	13.16
STLG	51.90	15.39	42.97	4.07	4.46
LTLG	77.64	50.74	70.37	22.88	13.65
95th percentile (m)
Condition	PS	PES	PM	PEM	WR
STSG	2.83	0.17	2.99	0.18	0.17
LTSG	123.13	82.61	128.10	118.61	33.43
STLG	147.04	58.81	124.39	7.11	9.24
LTLG	172.37	139.73	163.95	74.07	34.68

Table 2. Mean absolute error (MAE) and 95th percentile under different conditions in the field experiment.

MAE (m)
Condition	TC	CSF	CSB	PES	PEM	WR
no grid	102.2	-	-	-	-	-
$d_{g}$ = 0.1 m	-	79.2	23.5	7.1	18.7	1.4
$d_{g}$ = 1 m	-	83.0	33.0	12.6	27.4	2.0
$d_{g}$ = 10 m	-	93.3	66.0	42.6	46.1	7.2
95th percentile (m)
Condition	TC	CSF	CSB	PES	PEM	WR
no grid	322.8	-	-	-	-	-
$d_{g}$ = 0.1 m	-	146.5	100.8	53.7	105.0	5.4
$d_{g}$ = 1 m	-	150.4	113.1	91.6	105.1	6.0
$d_{g}$ = 10 m	-	171.8	149.0	138.5	104.6	21.0

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
