1 Introduction

In array signal processing, the direction-of-arrival (DOA) estimation accuracy for incident sources improves with the aperture of the antenna array, so an array with a larger aperture is desirable [1]. However, to avoid phase ambiguity in DOA estimation, the spacing between adjacent antennas is generally required not to exceed λ/2, where λ denotes the wavelength of the incident signal [1, 2]. A large-aperture array therefore usually requires more antennas, which increases both the cost and the mutual coupling between antennas. To mitigate this issue, various sparse array configurations and corresponding DOA estimation algorithms have been developed. One type of sparse array is constructed from multiple widely separated sub-arrays [3–5]; the corresponding algorithms, based on the estimation of signal parameters via rotational invariance techniques (ESPRIT), exploit the dual-size or multiple-size invariance within these arrays. Another type is designed to obtain as many degrees of freedom (DOFs) as possible in order to resolve more sources than sensors, such as the minimum-redundancy array [6], the nested array [7], and the co-prime array [8]. The DOA estimation algorithms for these arrays use the high-order statistics of the received data to increase the number of DOFs and thus often require a large computational workload.

Meanwhile, the electromagnetic-vector-sensor (EMVS) [9], along with other polarization antenna arrays [10–16], has recently received extensive attention in array signal processing. An EMVS provides not only the DOA estimate of a signal but also its polarization information. An EMVS usually consists of three orthogonally oriented dipoles and three orthogonally oriented loops that measure the electric field and the magnetic field of the incident source [17]. Unfortunately, because the components are collocated, the mutual coupling across them severely degrades the performance of the estimation algorithm. In 2011, Wong and Yuan [18] proposed a spatially spread EMVS (SS-EMVS), which consists of six orthogonally oriented but spatially non-collocated dipoles and loops. The SS-EMVS reduces the mutual coupling between antenna components, and the algorithm developed for it retains the effectiveness of the vector-cross-product algorithm [9]. Following this, various spatially spread polarization antenna arrays have been proposed [19–23]. Li et al. [24] presented many geometric configurations of the SS-EMVS and a nonlinear-programming-based DOA estimation algorithm. Yuan [25] showed how four or five spatially non-collocated dipoles/loops can be placed to estimate the azimuth/elevation directions and polarizations of multiple sources. The array configuration of the SS-EMVS was further investigated in [11, 26].

Most recently, there has been some research on combining the EMVS with sparse arrays and on the corresponding parameter estimation algorithms. For example, Han et al. [27] developed a nested vector-sensor array, He et al. [28] proposed a nested cross-dipole array, and Rao et al. [29] proposed a new class of sparse vector-sensor arrays. Various compositions of sparse acoustic vector-sensor arrays for estimating the elevation-azimuth angles of coherent sources were presented in [30]. In [21], we proposed a multi-scale sparse array in which each sensor unit is one SS-EMVS and which is capable of estimating the 2D directions and polarization information of the source simultaneously. However, the estimation accuracy for one of the two direction cosines is limited (by the aperture of a single SS-EMVS) since the sparse array is extended along only one axis. Furthermore, the unit of that array is a six-component SS-EMVS, and therefore the cost and redundancy of the whole array are still high.

To overcome the limitations of the sparse array developed in [21], in this paper we propose a new array geometry composed of multi-scale scalar arrays and a single triangular SS-EMVS, and develop the corresponding 2D DOA estimation algorithm. The proposed array consists of an L-shaped scalar array and a triangular SS-EMVS. The two arms of the L-shaped scalar array are connected by the triangular SS-EMVS, which is placed in such a way that the vector-cross-product algorithm can be applied to it for DOA estimation. The scalar sensors in each arm of the L-shaped array can be divided into two uniform linear sub-arrays with different inter-sensor spacings. Owing to the spatially spread geometry of the SS-EMVS and the different inter-sensor spacings of the two sub-arrays, we can obtain multiple estimates of the target parameters. From the SS-EMVS, we obtain one set of unambiguous but low-accuracy estimates and one set of relatively high-accuracy but ambiguous estimates of the incident sources using the vector-cross-product algorithm [18]. In addition, we obtain two sets of high-accuracy but cyclically ambiguous estimates of the desired direction cosines by applying the ESPRIT algorithm to the two sub-arrays in each arm of the L-shaped array. Following this, we develop a three-order disambiguation method to obtain the final high-accuracy and unambiguous estimates of the target DOA.

The proposed array integrates the advantages of sparse (scalar) array and SS-EMVS in reducing mutual coupling and achieving high-accuracy DOA estimation. Moreover, we only use a single SS-EMVS along with the L-shaped scalar array to achieve high-accuracy DOA estimation, and thus the cost, the redundancy of the proposed array, and the computational workload of the corresponding DOA estimation algorithm decrease significantly.

The rest of this paper is organized as follows. Section 2 describes the proposed array geometry. Section 3 develops the proposed algorithm for DOA estimation. In Section 4, numerical examples are provided to show the effectiveness and advantages of the proposed array and algorithm. Section 5 concludes the paper.

2 Array geometry

2.1 Triangular spatially-spread electromagnetic-vector-sensor

Figure 1 depicts the array configuration for the triangular SS-EMVS used in our paper, where one dipole ey is placed at the origin (of the Cartesian coordinate system) and the other two dipoles are placed along x-axis and y-axis. The distance between ex and ey is Δx,y, and the distance between ey and ez is Δy,z. The loops of the SS-EMVS are placed in such a way that the vector-cross-product algorithm can be adopted for DOA estimation, i.e., \(\overrightarrow {e_{y}e_{x}}=-\overrightarrow {h_{y}h_{x}}\) and \(\overrightarrow {e_{y}e_{z}}=-\overrightarrow {h_{y}h_{z}}\) [11], where \(\overrightarrow {xy}\) denotes a vector from point x to point y and hy is located at (xh,yh,zh). The positions of the three dipoles and the three loops form two right-angled triangles, and thus we name it as the triangular SS-EMVS. It is worth noting that both Δx,y and Δy,z can be larger than a half-wavelength of the signal. Therefore, the SS-EMVS itself is a sparse array.

Fig. 1 Configuration of the triangular SS-EMVS [11]. The source is located at elevation angle θ and azimuth angle ϕ

In contrast, the configuration of the SS-EMVS used in [21] is based on two parallel lines. It can only expand in one direction, so the estimation accuracy for the other direction cosine is limited. The triangular SS-EMVS depicted in Fig. 1 extends in two directions. Therefore, this configuration can provide relatively higher-accuracy estimates of the two direction cosines along the x- and y-axes, and thus higher-accuracy estimates of θ (elevation angle) and ϕ (azimuth angle) through the vector-cross-product algorithm. It is therefore reasonable to use this configuration of the SS-EMVS to extend the aperture of the array by constructing a 2D L-shaped array.

Consider a far-field source, located at elevation angle θ∈[0,π] and azimuth angle ϕ∈[0,2π), with polarization parameters (γ,η), where γ refers to the auxiliary polarization angle and η represents the polarization phase difference. The array manifold of the triangular SS-EMVS in Fig. 1, a, can be denoted by the electric-field vector e=[ex,ey,ez]T and the magnetic-field vector h=[hx,hy,hz]T by taking account of the inter-dipole/loop spacings {Δx,y,Δy,z},

$$ \boldsymbol{a} = \left[\begin{array}{c} 1\\ e^{j\frac{2\pi}{\lambda}\Delta_{x,y} v}\\ e^{j\frac{2\pi}{\lambda}(\Delta_{x,y} v - \Delta_{y,z} u)}\\ e^{-j\frac{2\pi}{\lambda}(x_{h}u + y_{h} v+ z_{h} w - 2\Delta_{x,y} v)}\\ e^{-j\frac{2\pi}{\lambda}[(x_{h}u + y_{h} v+ z_{h} w) - \Delta_{x,y} v]}\\ e^{-j\frac{2\pi}{\lambda}[(x_{h}u + y_{h} v+ z_{h} w) - (\Delta_{x,y} v + \Delta_{y,z} u)]} \end{array} \right]\odot\left[\begin{array}{c} e_{x}\\ e_{y}\\ e_{z}\\ h_{x}\\ h_{y}\\ h_{z} \end{array}\right], $$
(1)

where

$$ \left[\begin{array}{c} e_{x}\\ e_{y}\\ e_{z}\\ h_{x}\\ h_{y}\\ h_{z} \end{array}\right] = \left[\begin{array}{cc} \cos\phi\cos\theta & -\sin\phi\\ \sin\phi\cos\theta & \cos\phi\\ -\sin\theta & 0\\ -\sin\phi & -\cos\phi\cos\theta\\ \cos\phi & -\sin\phi\cos\theta\\ 0 & \sin\theta \end{array}\right] \left[\begin{array}{c} \sin\gamma e^{j\eta}\\\cos\gamma \end{array}\right], $$
(2)

and λ represents the wavelength of the signal, the superscript (.)T is the transposition operator, ⊙ denotes Hadamard (element-wise) product, \(j=\sqrt {-1}\), and

$$\begin{array}{@{}rcl@{}} \left\{ \begin{array}{ccl} u &=& \sin\theta\cos\phi\\ v &=& \sin\theta\sin\phi\\ w &=& \cos\theta \end{array}\right. \end{array} $$
(3)

represent the direction cosines along the x-, y-, and z-axis, respectively.
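As an illustration of Eqs. (2) and (3), the following minimal Python sketch evaluates the collocated six-component response and checks numerically that the cross product of e with the conjugate of h returns the direction cosines (u, v, w), which is the property the vector-cross-product algorithm relies on. The function names and the test angles are illustrative assumptions, not part of the original formulation.

```python
import cmath
import math

def direction_cosines(theta, phi):
    """Direction cosines (u, v, w) of Eq. (3); angles in radians."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def emvs_response(theta, phi, gamma, eta):
    """Collocated electric and magnetic responses e and h of Eq. (2)."""
    g1 = math.sin(gamma) * cmath.exp(1j * eta)
    g2 = math.cos(gamma)
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e = [cp * ct * g1 - sp * g2, sp * ct * g1 + cp * g2, -st * g1]
    h = [-sp * g1 - cp * ct * g2, cp * g1 - sp * ct * g2, st * g2]
    return e, h

def cross_conj(e, h):
    """Cross product of e with the complex conjugate of h."""
    hc = [x.conjugate() for x in h]
    return [e[1] * hc[2] - e[2] * hc[1],
            e[2] * hc[0] - e[0] * hc[2],
            e[0] * hc[1] - e[1] * hc[0]]

# Illustrative angles (assumed): theta = 42 deg, phi = 55 deg, gamma = 36 deg, eta = 80 deg
theta, phi = math.radians(42), math.radians(55)
e, h = emvs_response(theta, phi, math.radians(36), math.radians(80))
p = cross_conj(e, h)
u, v, w = direction_cosines(theta, phi)
```

Here `p` agrees with `(u, v, w)` up to rounding because the two-element polarization vector in Eq. (2) has unit norm.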

2.2 Design of proposed array

Figure 2 shows our proposed array configuration, composed of an L-shaped sparse scalar array and a single triangular SS-EMVS. The triangular SS-EMVS is located at the origin, and two scalar arrays are placed along the x-axis and y-axis, respectively. The antennas on the two arms of the L-shaped array are oriented differently, i.e., along ex and ez, respectively. Each arm of the L-shaped array consists of two sub-arrays. Taking the arm along the y-axis as an example, the first sub-array, which consists of the first n1 dipoles (including the ex in the triangular SS-EMVS located at the origin), is placed with inter-sensor spacing D1>Δx,y≫λ/2; the second sub-array, which consists of the last n2 dipoles, is placed with an even larger inter-sensor spacing D2=m1D1, where m1 is an integer. Furthermore, the first sub-array and the triangular SS-EMVS share the same ex. Similarly, the arm of the L-shaped array along the x-axis consists of two sub-arrays with inter-sensor spacings D3>Δy,z≫λ/2 and D4=m2D3, respectively, where m2 is an integer; the first sub-array and the triangular SS-EMVS share the same ez. Note that the dipoles placed along the x-axis and the y-axis have different orientations, each matching the dipole of the triangular SS-EMVS on the corresponding axis.

Fig. 2 The proposed array configuration. The triangular SS-EMVS in Fig. 1 is located at the origin. The scalar array whose unit is ex is extended along the y-axis. The inter-sensor spacings in sub-array 1 and sub-array 2 are D1 and D2, respectively, where D2=m1D1 and D1>Δx,y≫λ/2. Similarly, the scalar array whose unit is ez is extended along the x-axis. The inter-sensor spacings in sub-array 3 and sub-array 4 are D3 and D4, respectively, where D4=m2D3 and D3>Δy,z≫λ/2

We again take the scalar array placed along the y-axis as an example to illustrate the design idea of the proposed array. The triangular SS-EMVS provides a coarse estimate of v through the vector-cross-product algorithm, and this estimate serves as a reference for resolving the ambiguity of the v estimate from the first sub-array of the scalar array. Therefore, the inter-sensor spacing of the first sub-array can be larger than λ/2. The aperture of the first sub-array is much larger than the inter-dipole/loop spacings of the triangular SS-EMVS, and thus the first sub-array yields a finer estimate of v. Likewise, the disambiguated estimate from the first sub-array serves as the reference for the second sub-array, which finally yields a high-accuracy estimate of v. The same method applies to the scalar array placed along the x-axis to obtain a high-accuracy estimate of u. Finally, high-accuracy angle estimates are calculated from the high-accuracy estimates of u and v. In addition, the inter-sensor spacings of the two scalar arrays are much larger than λ/2, and thus the aperture and the angle estimation accuracy of the proposed array will be better than those of the L-shaped array with λ/2 inter-sensor spacing [31] and of the L-shaped nested array [32] with λ/2 inter-sensor spacing in the first sub-array. These properties will be verified in Section 4 through extensive simulation experiments. Moreover, because the scalar arrays are extended along the x-axis and y-axis at the same time, 2D high-accuracy DOA estimates can be obtained simultaneously. This cannot be achieved by the multi-scale EMVS array proposed in [21], where the multi-scale aperture extension is along one axis only. Furthermore, only a single SS-EMVS (along with scalar sensors), instead of many SS-EMVSs, is adopted in the proposed array, and thus the cost and redundancy of the array are reduced dramatically.

2.3 Array manifold and signal model

The array manifold of the scalar array placed along the y-axis is

$$\begin{array}{@{}rcl@{}} \boldsymbol{a}_{y} &\,=\,& \left[\begin{array}{c} \left. \begin{array}{c} 1\\ e^{-j\frac{2\pi}{\lambda}D_{1} v}\\ \vdots\\ e^{-j\frac{2\pi}{\lambda}(n_{1}-1)D_{1} v} \end{array} \right\}n_{1}\\ \left. \begin{array}{c} e^{-j\frac{2\pi}{\lambda}n_{1}D_{1} v}\\ e^{-j\frac{2\pi}{\lambda}(n_{1}D_{1} + D_{2}) v}\\ \vdots\\ e^{-j\frac{2\pi}{\lambda}[n_{1}D_{1} + (n_{2}-1)D_{2}] v} \end{array} \right\}n_{2}\\ \end{array} \right] \!\otimes\! \boldsymbol{a} [1], \end{array} $$
(4)

where ⊗ denotes the Kronecker product, a is defined in Eq. (1), a[1] is the first row of a, and thus \(\boldsymbol {a}_{y} \in {\mathbb C}^{N_{1}\times 1}\) with N1=n1+n2.
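The sensor layout implied by Eq. (4) can be sketched as follows. This Python fragment only builds the sensor offsets of one arm and the corresponding spatial phase factors; the common scalar factor a[1] is omitted, all lengths are expressed in wavelengths, and the helper names are assumptions introduced for illustration. The numerical values (six sensors per sub-array, D1=35λ, D2=245λ) are those used later in Section 4.1.

```python
import cmath
import math

def arm_positions(n_a, n_b, d_a, d_b):
    """Sensor offsets (in wavelengths) along one arm, following Eq. (4):
    n_a sensors at spacing d_a, then n_b sensors at spacing d_b,
    with the second sub-array starting at offset n_a * d_a."""
    first = [i * d_a for i in range(n_a)]
    second = [n_a * d_a + i * d_b for i in range(n_b)]
    return first + second

def arm_phase_vector(positions, direction_cosine):
    """Spatial phase factors exp(-j*2*pi*d*v), with d in wavelengths."""
    return [cmath.exp(-2j * math.pi * d * direction_cosine) for d in positions]

pos_y = arm_positions(6, 6, 35.0, 245.0)
v = math.sin(math.radians(42)) * math.sin(math.radians(55))
phases_y = arm_phase_vector(pos_y, v)
```

With these values the arm spans 1435 wavelengths using only 12 sensors, which illustrates how sparse the layout is compared with a λ/2-spaced array of the same aperture.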

Similarly, the array manifold of the scalar array placed along the x-axis is

$$\begin{array}{@{}rcl@{}} \boldsymbol{a}_{x} &\!\!\!\,=\,\!\!\!& \left[\begin{array}{c} \left. \begin{array}{c} 1\\ e^{-j\frac{2\pi}{\lambda}D_{3} u}\\ \vdots\\ e^{-j\frac{2\pi}{\lambda}(n_{3}-1)D_{3} u} \end{array} \right\}n_{3}\\ \left. \begin{array}{c} e^{-j\frac{2\pi}{\lambda}n_{3}D_{3} u}\\ e^{-j\frac{2\pi}{\lambda}(n_{3}D_{3} + D_{4}) u}\\ \vdots\\ e^{-j\frac{2\pi}{\lambda}[n_{3}D_{3} + (n_{4}-1)D_{4}] u} \end{array} \right\}n_{4}\\ \end{array} \right] \!\otimes\! \boldsymbol{a} [3], \end{array} $$
(5)

where a[3] is the third row of a, and \(\boldsymbol {a}_{x} \in {\mathbb C}^{N_{2}\times 1}\) with N2=n3+n4.

Following this, the array manifold of the proposed array is

$$ \boldsymbol{b} = \left[\begin{array}{l} \boldsymbol{a} \\ \boldsymbol{a}_{y}[2:N_{1}]\\ \boldsymbol{a}_{x}[2:N_{2}] \end{array}\right], $$
(6)

where ay[2:N1] consists of (N1−1) rows of ay, i.e., from the second row to the last row of ay, and ax[2:N2] consists of (N2−1) rows of ax, i.e., from the second row to the last row of ax. Therefore, \(\boldsymbol {b} \in {\mathbb C}^{N\times 1}\) with N=N1+N2+4.

In a multiple sources scenario with K incident signals, the received data of the proposed sparse array at time t is

$$ \boldsymbol{x}(t) = \sum\limits_{k=1}^{K} \boldsymbol{b}_{k} s_{k}(t) + \boldsymbol{n}(t) = \mathbf{B} \boldsymbol{s}(t) + \boldsymbol{n}(t), $$
(7)

where \(\boldsymbol {b}_{k} \in {\mathbb C}^{N\times 1}\) represents the array manifold of the kth signal and \(\mathbf {B} = [\boldsymbol {b}_{1},\boldsymbol {b}_{2}, \dots, \boldsymbol {b}_{K}] \in {\mathbb C}^{N\times K}\). s(t)=[s1(t),s2(t),…,sK(t)]T denotes the incident signal vector, and n(t) signifies the additive white Gaussian noise.

Considering L time snapshots, we can form the received data matrix

$$ \mathbf{X} = [\boldsymbol{x}(t_{1}),\boldsymbol{x}(t_{2}), \dots, \boldsymbol{x}(t_{L})]. $$
(8)

The following task is to estimate the DOA of these K sources from \(\mathbf {X} \in {\mathbb C}^{N \times L}\), which will be described in detail below.

3 Procedure of multi-scale DOA estimation algorithm

As described in Section 2.2, we can obtain multiple estimates of the direction cosines along y-axis and x-axis by the received data of the triangular SS-EMVS and the two arms of the L-shaped array. However, some of the estimates are cyclically ambiguous and we will use the coarse estimates to disambiguate the ambiguous estimates step by step. The procedure of the entire algorithm is shown in Algorithm 1.

In the following, we give the detailed derivation and procedure of the DOA estimation algorithm.

3.1 ESPRIT-based method to estimate the two sets of high-accuracy but cyclically ambiguous v and two sets of high-accuracy but cyclically ambiguous u

The array covariance matrix can be calculated by the maximum likelihood estimation

$$ \hat{\mathbf{R}} = {\frac{1}{L}}\mathbf{X} \mathbf{X}^{H}, $$
(9)

where the superscript H denotes the conjugate (Hermitian) transpose. Following [4], let \(\mathbf {E}_{s} \in {\mathbb C}^{N \times K}\) be the signal subspace matrix composed of the K eigenvectors corresponding to the K largest eigenvalues of \(\hat {\mathbf {R}}\). The columns of Es span the same subspace as the columns of the manifold matrix B, and thus

$$ \mathbf{E}_{s} = \mathbf{B} \mathbf{T}, $$
(10)

where T denotes an unknown K×K non-singular matrix. According to the composition of the proposed array, we divide the manifold matrix B into three parts, i.e., B1, By, and Bx, where \(\mathbf {B}_{1} \in {\mathbb C}^{6 \times K}\) is composed of the top six rows of B (corresponding to the triangular SS-EMVS), \(\mathbf {B}_{y} \in {\mathbb C}^{N_{1} \times K}\) is composed of the first row of B and the (N1−1) rows starting from the seventh row (corresponding to the sensors on the y-axis), and \(\mathbf {B}_{x} \in {\mathbb C}^{N_{2} \times K}\) is composed of the third row of B and the (N2−1) rows starting from the (N1+6)th row (corresponding to the sensors on the x-axis). In this way, B1, By, and Bx signify the manifold matrices of the SS-EMVS and the two scalar arrays, respectively. Similarly, we can divide the signal subspace matrix Es into three parts in the same way, i.e., \(\mathbf {E}_{s_{1}}\), \(\mathbf {E}_{s_{y}}\), and \(\mathbf {E}_{s_{x}}\). Thus, according to the relationship between the array manifold matrix and the signal subspace [33] described in Eq. (10), we have

$$\begin{array}{@{}rcl@{}} \mathbf{E}_{s_{1}} &=& \mathbf{B}_{1} \mathbf{T}, \end{array} $$
(11)
$$\begin{array}{@{}rcl@{}} \mathbf{E}_{s_{y}} &=& \mathbf{B}_{y} \mathbf{T}, \end{array} $$
(12)
$$\begin{array}{@{}rcl@{}} \mathbf{E}_{s_{x}} &=& \mathbf{B}_{x} \mathbf{T}. \end{array} $$
(13)

After this, we process \(\mathbf {E}_{s_{y}}\) and \(\mathbf {E}_{s_{x}}\) separately to get two sets of high-accuracy but cyclically ambiguous estimates of v and of u. Let us take \(\mathbf {E}_{s_{y}}\) as an example to demonstrate the derivation. According to the different inter-sensor spacings of the scalar array whose unit is ex, we divide \(\mathbf {E}_{s_{y}}\) into two parts, i.e., \(\mathbf {E}_{s_{y,1}}\) and \(\mathbf {E}_{s_{y,2}}\), where \(\mathbf {E}_{s_{y,1}} \in {\mathbb C}^{n_{1} \times K}\) and \(\mathbf {E}_{s_{y,2}} \in {\mathbb C}^{n_{2} \times K}\) correspond to sub-array 1 and sub-array 2, respectively. Recalling Fig. 2, both sub-array 1 and sub-array 2 are uniform arrays, so the ESPRIT algorithm can be applied to \(\mathbf {E}_{s_{y,1}}\) and \(\mathbf {E}_{s_{y,2}}\) separately. The process is consistent with that developed in [21]. Since the inter-sensor spacings D1 and D2 are both larger than λ/2, the two resulting sets of y-axis direction-cosine estimates, \(\hat {v}_{k}^{\text {fine,1}}\) and \(\hat {v}_{k}^{\text {fine,2}}\), are high-accuracy but cyclically ambiguous.
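To make the cyclic ambiguity concrete, the sketch below applies the shift-invariance idea behind ESPRIT to a single noiseless source on one uniform sub-array. It is a deliberate simplification, not the multi-source ESPRIT of [21]: there is no eigendecomposition, and the phase step between adjacent sensors is obtained by a least-squares correlation. The function name and the value of v are assumptions made for illustration.

```python
import cmath
import math

def shift_invariance_estimate(column, spacing):
    """Least-squares phase step between adjacent sensors of a uniform
    sub-array, mapped to a direction cosine within half an ambiguity
    period, i.e., |estimate| <= 1/(2*spacing). 'spacing' is in
    wavelengths; the phase model is exp(-j*2*pi*spacing*v)."""
    acc = sum(a.conjugate() * b for a, b in zip(column[:-1], column[1:]))
    return -cmath.phase(acc) / (2 * math.pi * spacing)

v_true = 0.5481   # assumed direction cosine
d1 = 35.0         # sub-array 1 spacing in wavelengths (value from Section 4.1)
column = [cmath.exp(-2j * math.pi * d1 * i * v_true) for i in range(6)]
v_hat = shift_invariance_estimate(column, d1)
cycles = (v_true - v_hat) * d1  # number of ambiguity periods separating v_hat from v_true
```

The estimate `v_hat` differs from `v_true` by an integer multiple of λ/D1 (here `cycles` is an integer up to rounding), which is exactly the ambiguity that the later disambiguation steps resolve.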

In addition, it can be seen from [21, 34] that due to the same column permutation of T, these two sets of v estimations \(\{\hat {v}_{k}^{\text {fine, 1}}\}_{k=1}^{K}\) and \(\{\hat {v}_{k}^{\text {fine, 2}}\}_{k=1}^{K}\) are paired automatically.

Besides, we can obtain two sets of high-accuracy but cyclically ambiguous u estimations, \(\hat {u}_{k}^{\text {fine,1}}\) and \(\hat {u}_{k}^{\text {fine,2}}\) by applying similar process to \(\mathbf {E}_{s_{x}}\). And the u estimations, \(\hat {u}_{k}^{\text {fine,1}}\) and \(\hat {u}_{k}^{\text {fine,2}}\), are also paired automatically.

3.2 Vector-cross-product algorithm to estimate the unambiguous but low-accuracy v and u and the relatively high-accuracy but ambiguous v and u

According to [18, 24], we need an estimate of the array manifold in order to apply the vector-cross-product algorithm to the data of the triangular SS-EMVS. Recalling Eq. (11), we can estimate the manifold matrix of the SS-EMVS with

$$ \hat{\mathbf{B}}_{1} = \mathbf{E}_{s_{1}} \mathbf{T}^{-1}, $$
(14)

where \(\hat {\mathbf {B}}_{1} = \left [\hat {\boldsymbol {a}}_{1}, \dots, \hat {\boldsymbol {a}}_{K}\right ]\) and \(\hat {\boldsymbol {a}}_{k}\) is the estimation of array manifold of kth source at the triangular SS-EMVS.

The following step is to apply the vector-cross-product algorithm to \(\hat {\boldsymbol {a}}_{k}\). For convenience, we set θ∈[0,π/2], ϕ∈[0,π/2), and we omit the source index k and recall Eq. (1), where we have \(\boldsymbol {a} = \left [ {\tilde {\boldsymbol {e}}}^{T}, {\tilde {\boldsymbol {h}}}^{T}\right ]^{T}\) with

$$\begin{array}{@{}rcl@{}} \tilde{\boldsymbol{e}} \!&=& \left[\begin{array}{r} e_{x}\\ e^{j\frac{2\pi}{\lambda}\Delta_{x,y}v} e_{y}\\ e^{j\frac{2\pi}{\lambda}(\Delta_{x,y}v - \Delta_{y,z}u)} e_{z} \end{array} \right], \end{array} $$
(15)
$$\begin{array}{@{}rcl@{}} \tilde{\boldsymbol{h}}\! \!&=&\!\! \left[\begin{array}{r} e^{-j\frac{2\pi}{\lambda}(x_{h}u \,+\, y_{h} v\,+\, z_{h} w - 2\Delta_{x,y}v)} h_{x}\\ e^{-j\frac{2\pi}{\lambda}[(x_{h}u \!+ \!y_{h} v\,+\, z_{h} w) - \Delta_{x,y}v]} h_{y}\\ e^{-j\frac{2\pi}{\lambda}[(x_{h}\!u \,+\, y_{h} \!v\,+\, z_{h}\! w) - (\Delta_{x,y}v \,+\, \Delta_{y,z}u)]}h_{z} \end{array} \right]. \end{array} $$
(16)

According to the vector-cross product algorithm of the triangular SS-EMVS [11], we have,

$$\begin{array}{@{}rcl@{}} \boldsymbol{p} &\!\!\,=\,\!\!& \frac{(\tilde{\boldsymbol{e}})\times (\tilde{\boldsymbol{h}})^{*}}{\|(\tilde{\boldsymbol{e}})\times (\tilde{\boldsymbol{h}})^{*}\|}\\ &\!\!\,=\,\!\!& e^{j\frac{2\pi}{\lambda}(x_{h} u \,+\, y_{h}v \,+\, z_{h} w)}\!\!\!\left[\begin{array}{l} u e^{-j\frac{2\pi}{\lambda}\Delta_{y,z}u}\\ v e^{-j\frac{2\pi}{\lambda}(\Delta_{x,y}v+\Delta_{y,z}u)}\\ w e^{-j\frac{2\pi}{\lambda}\Delta_{x,y}v} \end{array}\right] \end{array} $$
(17)

where × denotes the vector-cross product and p is calculated from \(\hat {\boldsymbol {a}}\).

From the Poynting vector of kth source pk derived in Eq. (17), we can obtain the unambiguous but low-accuracy estimations of {uk,vk,wk} by

$$\begin{array}{@{}rcl@{}} \left\{\begin{array}{ccc} u_{k}^{\text{coarse}} & =& |[\boldsymbol{p}_{k}]_{1}|,\\ v_{k}^{\text{coarse}} & =& |[\boldsymbol{p}_{k}]_{2}|,\\ w_{k}^{\text{coarse}} & =& |[\boldsymbol{p}_{k}]_{3}|, \end{array}\right. \end{array} $$
(18)

where [ ]i extracts ith element of the vector inside [ ], and | | denotes the absolute value of the entity inside | |.

In the following, we estimate the relatively high-accuracy estimation of u and v from the displacement of the dipoles/loops within the triangular SS-EMVS, i.e., Δx,y and Δy,z. From p, we can get

$$ \boldsymbol{p}^{o} = \boldsymbol{p}\odot e^{-j\angle[\boldsymbol{p}]_{2}} = \left[\begin{array}{l} u e^{j\frac{2\pi}{\lambda}\Delta_{x,y}v}\\ v \\ w e^{j\frac{2\pi}{\lambda}\Delta_{y,z}u} \end{array}\right], $$
(19)

where ⊙ denotes the Hadamard (element-wise) product. Based on Eq. (19), we have one set of relatively high-accuracy but ambiguous estimations of u and v by

$$\begin{array}{@{}rcl@{}} \hat{u}_{k}^{\text{fine, 0}} &=& \frac{\lambda}{2\pi} \frac{1}{\Delta_{y,z}}\angle\left\{[\boldsymbol{p}_{k}^{o}]_{3}\right\}, \end{array} $$
(20)
$$\begin{array}{@{}rcl@{}} \hat{v}_{k}^{\text{fine, 0}} &=& \frac{\lambda}{2\pi} \frac{1}{\Delta_{x,y}}\angle\left\{[\boldsymbol{p}_{k}^{o}]_{1}\right\}. \end{array} $$
(21)

It is worth mentioning that the unambiguous but low-accuracy estimations \(\{u_{k}^{\text {coarse}}\}_{k=1}^{K}\) and the relatively high-accuracy but ambiguous estimations \(\{\hat {u}_{k}^{\text {fine, 0}}\}_{k=1}^{K}\) are paired automatically, and due to the same T in Eq. (14), all u estimations have been paired, the same for v. Moreover, for θ and ϕ in other angular ranges, the changes are the plus or minus signs in Eqs. (18) and (19) [11].
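The steps of this subsection can be sketched in Python for a single noiseless source in the first octant. The code builds the spatially spread manifold of Eq. (1), forms the normalized cross product of Eq. (17), and extracts the coarse estimates of Eq. (18) and the fine but ambiguous estimates of Eqs. (19) to (21). Function names are illustrative assumptions; lengths are in wavelengths; the parameter values are taken from Section 4.

```python
import cmath
import math

def ss_emvs_manifold(theta, phi, gamma, eta, dxy, dyz, hy_pos):
    """Spatially spread manifold of Eq. (1); lengths in wavelengths."""
    u = math.sin(theta) * math.cos(phi)
    v = math.sin(theta) * math.sin(phi)
    w = math.cos(theta)
    g1 = math.sin(gamma) * cmath.exp(1j * eta)
    g2 = math.cos(gamma)
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    e = [cp * ct * g1 - sp * g2, sp * ct * g1 + cp * g2, -st * g1]
    h = [-sp * g1 - cp * ct * g2, cp * g1 - sp * ct * g2, st * g2]
    k = 2 * math.pi
    psi = k * (hy_pos[0] * u + hy_pos[1] * v + hy_pos[2] * w)
    a, b = k * dxy * v, k * dyz * u
    e_t = [e[0], cmath.exp(1j * a) * e[1], cmath.exp(1j * (a - b)) * e[2]]
    h_t = [cmath.exp(-1j * (psi - 2 * a)) * h[0],
           cmath.exp(-1j * (psi - a)) * h[1],
           cmath.exp(-1j * (psi - a - b)) * h[2]]
    return e_t, h_t, (u, v, w)

def vector_cross_product(e_t, h_t, dxy, dyz):
    """Coarse (Eq. 18) and fine but ambiguous (Eqs. 19-21) estimates."""
    hc = [x.conjugate() for x in h_t]
    p = [e_t[1] * hc[2] - e_t[2] * hc[1],
         e_t[2] * hc[0] - e_t[0] * hc[2],
         e_t[0] * hc[1] - e_t[1] * hc[0]]
    norm = math.sqrt(sum(abs(x) ** 2 for x in p))
    p = [x / norm for x in p]
    coarse = tuple(abs(x) for x in p)
    rot = cmath.exp(-1j * cmath.phase(p[1]))  # remove the phase of the second entry
    p_o = [x * rot for x in p]
    u_fine0 = cmath.phase(p_o[2]) / (2 * math.pi * dyz)
    v_fine0 = cmath.phase(p_o[0]) / (2 * math.pi * dxy)
    return coarse, u_fine0, v_fine0

e_t, h_t, (u, v, w) = ss_emvs_manifold(
    math.radians(42), math.radians(55), math.radians(36), math.radians(80),
    5.0, 5.0, (7.5, 7.5, 5.0))
coarse, u_fine0, v_fine0 = vector_cross_product(e_t, h_t, 5.0, 5.0)
```

In this noiseless case `coarse` reproduces (u, v, w), while `u_fine0` and `v_fine0` match u and v only up to integer multiples of λ/Δy,z and λ/Δx,y, respectively.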

3.3 Disambiguate the estimations of u and v and calculate the final estimates of θ and ϕ

As can be seen from the above Sections 3.1 and 3.2, for both u and v, there are three sets of high-accuracy but ambiguous estimations and one set of unambiguous but low-accuracy estimation. The three sets of ambiguous estimations correspond to different levels of ambiguity, and a three-order disambiguation method is utilized here.

We take v as an example to demonstrate the derivation; the process for u is similar. Recalling Fig. 2, the ambiguities of \(\hat {v}_{k}^{\text {fine,0}}\), \(\hat {v}_{k}^{\text {fine,1}}\), and \(\hat {v}_{k}^{\text {fine,2}}\) correspond to Δx,y, D1, and D2, respectively. Because D2>D1>Δx,y, the ambiguities should be resolved step by step in the order \(\hat {v}_{k}^{\text {fine,0}}\), \(\hat {v}_{k}^{\text {fine,1}}\), \(\hat {v}_{k}^{\text {fine,2}}\).

3.3.1 Disambiguate \(\hat {v}_{k}^{\text {fine,0}}\) with \(v_{k}^{\text {coarse}}\)

With \(v_{k}^{\text {coarse}}\) as the reference value, the ambiguity of \(\hat {v}_{k}^{\text {fine,0}}\) is solved, and the result can be obtained by

$$\begin{array}{@{}rcl@{}} v_{k}^{\text{fine, 0}} &=& \hat{v}_{k}^{\text{fine, 0}} + \hat{l}_{1} \frac{\lambda}{\Delta_{x,y}}, \end{array} $$
(22)
$$\begin{array}{@{}rcl@{}} \hat{l}_{1} &=& \arg\!\min_{l_{1}}\left|v_{k}^{\text{coarse}} \,-\, \hat{v}_{k}^{\text{fine,0}} \,-\, l_{1}\frac{\lambda}{\Delta_{x,y}}\right|, \end{array} $$
(23)

where \(\left \lceil \left (-1-\hat {v}_{k}^{\text {fine,0}}\right)\frac {\Delta _{x,y}}{\lambda } \right \rceil \le l_{1} \le \left \lfloor \left (1-\hat {v}_{k}^{\text {fine,0}}\right) \frac {\Delta _{x,y}}{\lambda }\right \rfloor \) with ⌈ε⌉ denoting the smallest integer not less than ε and ⌊ε⌋ referring to the largest integer not more than ε [35].

3.3.2 Disambiguate \(\hat {v}_{k}^{\text {fine,1}}\) with \(v_{k}^{\text {fine, 0}}\)

With \(v_{k}^{\text {fine, 0}}\) as the reference value, the ambiguity of \(\hat {v}_{k}^{\text {fine,1}}\) is solved, and the result can be obtained by

$$\begin{array}{@{}rcl@{}} v_{k}^{\text{fine, 1}} &=& \hat{v}_{k}^{\text{fine, 1}} + \hat{l}_{2} \frac{\lambda}{D_{1}}, \end{array} $$
(24)
$$\begin{array}{@{}rcl@{}} \hat{l}_{2} &=& \arg\!\min_{l_{2}}\left|v_{k}^{\text{fine, 0}} - \hat{v}_{k}^{\text{fine,1}} - l_{2}\frac{\lambda}{D_{1}}\right|, \end{array} $$
(25)

where \(\left \lceil \left (-1-\hat {v}_{k}^{\text {fine,1}}\right)\frac {D_{1}}{\lambda } \right \rceil \le l_{2} \le \left \lfloor \left (1-\hat {v}_{k}^{\text {fine,1}}\right) \frac {D_{1}}{\lambda }\right \rfloor \).

3.3.3 Disambiguate \(\hat {v}_{k}^{\text {fine,2}}\) with \(v_{k}^{\text {fine, 1}}\)

Finally, we can disambiguate \(\hat {v}_{k}^{\text {fine,2}}\) with \(v_{k}^{\text {fine, 1}}\) derived above to estimate the final high-accuracy and unambiguous estimation of \(v_{k}^{\text {final}}\):

$$\begin{array}{@{}rcl@{}} v_{k}^{\text{final}} &=& \hat{v}_{k}^{\text{fine, 2}} + \hat{l}_{3} \frac{\lambda}{D_{2}}, \end{array} $$
(26)
$$\begin{array}{@{}rcl@{}} \hat{l}_{3} &=& \arg\!\min_{l_{3}}\left|v_{k}^{\text{fine, 1}} - \hat{v}_{k}^{\text{fine,2}} - l_{3}\frac{\lambda}{D_{2}}\right|, \end{array} $$
(27)

where \(\left \lceil \left (-1-\hat {v}_{k}^{\text {fine,2}}\right)\frac {D_{2}}{\lambda } \right \rceil \le l_{3} \le \left \lfloor \left (1-\hat {v}_{k}^{\text {fine,2}}\right) \frac {D_{2}}{\lambda }\right \rfloor \).

Similar to the above three steps, we can get the final high-accuracy and unambiguous estimation of \(u_{k}^{\text {final}}\) by replacing {Δx,y,D1,D2} with {Δy,z,D3,D4}, respectively.
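The three disambiguation steps share one operation, which the following Python sketch chains together. The `disambiguate` helper follows Eqs. (22) to (27), including the integer search range; the `wrap` helper, the error magnitudes, and the coarse estimate are hypothetical values chosen only to produce a deterministic demonstration. Spacings are in wavelengths.

```python
import math

def disambiguate(est, ref, spacing):
    """One step of Eqs. (22)-(27): add the integer multiple of 1/spacing
    (spacing in wavelengths) that brings 'est' closest to 'ref', searching
    only shifts that keep the direction cosine inside [-1, 1]."""
    lo = math.ceil((-1.0 - est) * spacing)
    hi = math.floor((1.0 - est) * spacing)
    best = min(range(lo, hi + 1), key=lambda l: abs(ref - est - l / spacing))
    return est + best / spacing

def wrap(value, spacing):
    """Fold a direction cosine into one ambiguity period (demo helper only)."""
    period = 1.0 / spacing
    return value - period * round(value / period)

v_true = 0.5481
spacings = (5.0, 35.0, 245.0)   # Delta_xy, D1, D2 from Section 4.1
errors = (2e-3, 3e-4, 2e-5)     # hypothetical errors of the three fine estimates
v_ref = v_true + 3e-2           # hypothetical coarse estimate
for spacing, err in zip(spacings, errors):
    v_ref = disambiguate(wrap(v_true + err, spacing), v_ref, spacing)
v_final = v_ref
```

Each stage succeeds here because the reference error stays below half the ambiguity period of the next stage; the final value inherits only the small error of the last, widest sub-array.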

After obtaining the unambiguous and high-accuracy estimates of {u,v}, we can calculate the high-accuracy DOA estimate of the kth source from Eq. (3), which gives

$$ \left\{\begin{array}{ccl} \hat{\theta}_{k} &=& \arcsin\left(\sqrt{\left(u_{k}^{\text{final}}\right)^{2} + \left(v_{k}^{\text{final}}\right)^{2}}\right),\\ \hat{\phi}_{k} &=& \arctan\left(\frac{v_{k}^{\text{final}}}{u_{k}^{\text{final}}}\right). \end{array}\right. $$
(28)
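Eq. (28) translates directly into code. In the sketch below, `atan2` is used instead of arctan(v/u) and the argument of arcsin is clipped to 1; both are assumptions added for numerical robustness and coincide with Eq. (28) in the range θ∈[0,π/2], ϕ∈[0,π/2) with u>0.

```python
import math

def angles_from_cosines(u, v):
    """Elevation and azimuth (radians) from direction cosines, Eq. (28).
    atan2 replaces arctan(v/u); the two agree for u > 0."""
    s = min(1.0, math.hypot(u, v))  # guard against noise pushing s above 1
    return math.asin(s), math.atan2(v, u)

# Round trip with illustrative angles (assumed): theta = 42 deg, phi = 55 deg
theta, phi = math.radians(42), math.radians(55)
u = math.sin(theta) * math.cos(phi)
v = math.sin(theta) * math.sin(phi)
theta_hat, phi_hat = angles_from_cosines(u, v)
```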

3.4 Analysis of the three inter-sensor spacings

Larger inter-sensor spacing brings in larger aperture and further leads to higher direction estimation accuracy. At the same time, it makes the disambiguation more difficult. There is a threshold in the process of disambiguation [36]. When the inter-sensor spacing value is larger than the threshold, the probability of successful disambiguation will break down. Therefore, we analyze the threshold of the inter-sensor spacing by analyzing the success probability of the disambiguation process.

Let us take v as an example to demonstrate the derivation again. According to the proposed array configuration shown in Fig. 2, there are three scales, i.e., {Δx,y,D1,D2} for v. Thus, we utilize the three-order disambiguation process in Section 3.3 to obtain vfinal. Take the Δx,y as an example, recalling Eq. (23), only by satisfying the following equation

$$ \left|v_{k}^{\text{ref}} - v_{k}^{\text{coarse}}\right| < \frac{\lambda}{2\Delta_{x,y}}, $$
(29)

can the disambiguation process be successful. The value of \(\left |v_{k}^{\text {ref}} - v_{k}^{\text {coarse}}\right |\) is the estimation error of \(v_{k}^{\text {coarse}}\). We hereby assume that the estimation error follows a Gaussian distribution [37]. According to the normal distribution function [38], the probability that the error falls within ±3σ is about 99.73%, where σ is the standard deviation of the error. Thus, when the root mean square error (RMSE) of \(v_{k}^{\text {coarse}}\) satisfies

$$ \sigma_{v_{k}^{\text{coarse}}} \le \frac{\lambda}{6\Delta_{x,y}}, $$
(30)

we consider that the disambiguation process is successful. Therefore, we can calculate the threshold of Δx,y by

$$ \Delta_{x,y}^{threshold}=\frac{\lambda}{6\sigma_{v_{k}^{\text{coarse}}}}. $$
(31)
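The threshold of Eq. (31) and the success condition of Eq. (29) can be evaluated numerically as sketched below, assuming a zero-mean Gaussian error. The RMSE value is hypothetical, spacings are in wavelengths, and the function names are introduced only for this illustration.

```python
import math

def spacing_threshold(sigma):
    """Largest spacing (in wavelengths) allowed by the 3-sigma rule, Eq. (31)."""
    return 1.0 / (6.0 * sigma)

def success_probability(spacing, sigma):
    """P(|error| < 1/(2*spacing)) for a zero-mean Gaussian error, Eq. (29)."""
    return math.erf(1.0 / (2.0 * spacing * sigma * math.sqrt(2.0)))

sigma_coarse = 0.02                       # hypothetical RMSE of the coarse estimate
limit = spacing_threshold(sigma_coarse)   # about 8.33 wavelengths
p_at_limit = success_probability(limit, sigma_coarse)
p_beyond = success_probability(2 * limit, sigma_coarse)
```

Doubling the spacing beyond the threshold halves the tolerated error, so `p_beyond` is noticeably lower than `p_at_limit`; this is the trade-off between aperture and reliable disambiguation discussed above.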

We can obtain the thresholds of D1 and D2 in a similar way. Furthermore, in practical applications, only the Cramér-Rao bound (CRB) of each parameter is available rather than the RMSE. Thus, we can substitute the CRBs of \(v_{k}^{\text {fine,0}}\) and \(v_{k}^{\text {fine,1}}\) for their RMSEs to calculate the thresholds of D1 and D2. However, because the CRB is much smaller than the RMSE, the thresholds of D1 and D2 calculated in this way will be far larger than the actual values. This will be verified in Section 4.

Similar to v, we can obtain the corresponding thresholds of {Δy,z,D3,D4} for u.

Because the RMSE depends on the signal-to-noise ratio (SNR), the number of snapshots, and the source direction, we will analyze the influence of these factors in Section 4.

The derivation of the CRB for the new array is similar to that in [21], and we will use the corresponding equations therein to derive the CRB in the following simulations.

4 Simulation results and discussion

In this section, we conduct simulations to verify the effectiveness and performance of the proposed array geometry and algorithm. For simplicity, we set θ∈[0,π/2], ϕ∈[0,π/2). The coordinate of the hy of the SS-EMVS is (xh,yh,zh)=(7.5λ,7.5λ,5λ). The RMSE of parameter estimation is defined as

$$ \text{RMSE}=\sqrt{\frac{1}{M}\sum\limits_{m=1}^{M}{(\hat{\alpha}_{m}-\alpha)^{2}}}, $$
(32)

where \(\hat {\alpha }_{m}\) is the estimate of parameter α in the mth trial, and M is the number of Monte Carlo trials. We assume that the number of sources is known a priori in the following simulations.
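For concreteness, Eq. (32) can be evaluated as in the following minimal Python sketch. The function name `rmse` and the toy estimator used to generate the trials are assumptions for illustration only.

```python
import math
import random


def rmse(estimates, true_value):
    """Eq. (32): RMSE of the estimates of one parameter over M Monte Carlo trials."""
    m = len(estimates)
    if m == 0:
        raise ValueError("at least one trial is required")
    return math.sqrt(sum((a_hat - true_value) ** 2 for a_hat in estimates) / m)


# Toy usage: 200 trials of a hypothetical unbiased estimator with 0.5 degree spread.
rng = random.Random(0)
alpha = 35.0
trials = [alpha + rng.gauss(0.0, 0.5) for _ in range(200)]
print(f"RMSE = {rmse(trials, alpha):.3f} degrees")
```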

4.1 Parameter estimation results

In the first example, we consider N1=12 ex’s placed along the y-axis and N2=12 ez’s placed along the x-axis. The first six ex’s compose sub-array 1 with inter-sensor spacing D1=35λ; the remaining ex’s constitute sub-array 2 with inter-sensor spacing D2=7D1=245λ. Similarly, the first six ez’s compose sub-array 3 with inter-sensor spacing D3=35λ; the remaining ez’s constitute sub-array 4 with inter-sensor spacing D4=7D3=245λ. For the triangular SS-EMVS, Δx,y=Δy,z=5λ. There are K=2 pure-tone sources with unit power impinging on the array, with numerical frequencies f=(0.537,0.233), elevations θ=(42°,35°), azimuths ϕ=(55°,52°), auxiliary polarization angles γ=(36°,60°), and polarization phase differences η=(80°,70°). The number of snapshots is L=200 and SNR = 10 dB. The noise is a complex white Gaussian vector with zero mean and covariance matrix σ²I. Figure 3 shows the estimation results of the proposed algorithm over 200 Monte Carlo trials. We can see that the spatial parameters of all targets are correctly paired and estimated.
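The noise model in this setup can be sketched in Python as follows. The sketch assumes that the SNR is defined as the ratio of the source power to the per-sensor noise variance σ² and that the noise is circularly symmetric; neither definition is restated in this section, and the function names and the sensor count in the usage lines are hypothetical.

```python
import random


def noise_variance(snr_db: float, signal_power: float = 1.0) -> float:
    """Noise variance sigma^2 that yields the requested SNR (assumed definition)."""
    return signal_power / (10.0 ** (snr_db / 10.0))


def complex_white_noise(n_sensors, n_snapshots, sigma2, rng):
    """Zero-mean circular complex Gaussian noise with covariance sigma2 * I."""
    scale = (sigma2 / 2.0) ** 0.5  # real and imaginary parts each carry sigma2 / 2
    return [[complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))
             for _ in range(n_snapshots)]
            for _ in range(n_sensors)]


rng = random.Random(0)
sigma2 = noise_variance(10.0)                      # unit-power source at 10 dB
noise = complex_white_noise(4, 200, sigma2, rng)   # 4 sensors, L = 200 snapshots
power = sum(abs(z) ** 2 for row in noise for z in row) / (4 * 200)
print(f"sigma^2 = {sigma2:.3f}, sample noise power = {power:.3f}")
```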

Fig. 3 The estimation results of the DOA of two incident sources

4.2 Parameter estimation performance

To further explore the performance of the proposed array, we conduct simulations with different array and source parameters.

4.2.1 Performance versus SNR

In the first example, we consider the performance of parameter estimation versus SNR. Figure 4a shows the RMSEs of all estimates of u produced by the proposed array, i.e., ufine,0, ufine,1, and ufinal, versus SNR, compared with ucoarse and the CRB. Figure 4b shows the corresponding RMSEs for v, i.e., vfine,0, vfine,1, and vfinal, compared with vcoarse and the CRB. It can be observed that both ufinal and vfinal improve significantly over their coarse estimates, ucoarse and vcoarse, respectively, and both approach their CRBs. Moreover, the disambiguation described in Section 3.3 is similar to that of dual-size ESPRIT [4], so there exists an SNR threshold in the disambiguation process [39]. The parameter estimation performance degrades significantly if the SNR is lower than this threshold. When the SNR exceeds the threshold, the performance improves dramatically and continues to improve as the SNR increases. From Fig. 4, we can see that the SNR thresholds of u and v are 7 dB and 6 dB, respectively.
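The threshold behavior can be illustrated with a generic disambiguation rule in the style of dual-size ESPRIT. The Python sketch below is not a reproduction of Section 3.3; it assumes that a fine direction-cosine estimate obtained with spacing Δ is ambiguous by integer multiples of λ/Δ and that the ambiguity is resolved by picking the candidate closest to an unambiguous reference. All names and numerical values are hypothetical.

```python
def wrap_to_period(value: float, period: float) -> float:
    """Map value into [-period/2, period/2), mimicking a cyclically ambiguous estimate."""
    return (value + period / 2.0) % period - period / 2.0


def disambiguate(fine: float, reference: float, spacing_in_wavelengths: float) -> float:
    """Shift the ambiguous fine estimate by the integer number of periods
    that brings it closest to the unambiguous reference estimate."""
    period = 1.0 / spacing_in_wavelengths  # lambda / spacing, with lambda = 1
    n = round((reference - fine) / period)
    return fine + n * period


true_v = 0.43
spacing = 5.0                                  # spacing of 5 wavelengths, period 0.2
fine = wrap_to_period(true_v, 1.0 / spacing)   # accurate but ambiguous estimate
print(disambiguate(fine, 0.40, spacing))       # reference error 0.03 < 0.1: recovers about 0.43
print(disambiguate(fine, 0.30, spacing))       # reference error 0.13 > 0.1: lands on about 0.23
```

Once the reference error exceeds half a period, the rule selects the wrong candidate and the error jumps by a full period, which is consistent with the abrupt degradation below the SNR threshold.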

Fig. 4 The RMSE of (a) ucoarse, ufine,0, ufine,1, and ufinal compared with CRB, and (b) vcoarse, vfine,0, vfine,1, and vfinal compared with CRB using the proposed array

In addition, we compare the proposed array with the array configuration in [32], which has the same number of scalar sensors, and the array configuration in [21], which has the same number of SS-EMVSs along the y-axis. Figure 5a and b show the RMSEs of the u and v estimates versus SNR for all three arrays, respectively. Compared with the 2D nested scalar array in [32], the proposed array has a much larger aperture extension and lower mutual coupling; compared with the linear multi-scale SS-EMVS array in [21], the proposed array has a much larger aperture extension along the x-axis. We can observe from Fig. 5a that the u estimation performance of the proposed array is better than that of the other two arrays when the SNR is above the threshold. This is because the aperture of the proposed array along the x-axis is much larger than those of the other two arrays. From Fig. 5b, we can observe that the v estimation performance of the proposed array is slightly worse than that of the array configuration in [21]. However, the SNR threshold of the proposed array is far lower (7 dB).

Fig. 5 RMSE of u and v estimations of all three arrays versus SNR. (a) RMSE of u estimations. (b) RMSE of v estimations

Moreover, we consider another configuration of the proposed array in which the array is extended along the y-axis and z-axis, and the triangular SS-EMVS is likewise placed along the y-axis and z-axis. The DOA estimation process for this array is similar to that of the proposed array, except that the corresponding direction cosines change from u and v to v and w. Using the same simulation conditions as in Section 4.1, we compare the parameter estimation performance of this configuration with that of the proposed array. The results are given in Fig. 6. We can observe that the RMSEs of u of the two configurations are similar, and the same holds for v of the proposed array and w of the other configuration. Nevertheless, the accuracy of the proposed array is marginally better than that of the other configuration when the SNR is large enough, i.e., > 8 dB.

Fig. 6 RMSE of parameter estimations of the two array configurations versus SNR

As the arrival angle estimates are determined by u and v jointly, Fig. 7 shows the RMSEs of the estimated θ and ϕ of all array configurations versus SNR, together with the CRB of the proposed array. It can be seen that the SNR thresholds of θ and ϕ of the proposed array are both 7 dB, which equals the larger of the thresholds of u and v. Moreover, the proposed array performs best among all array configurations. Therefore, our proposed array offers a good trade-off among mutual coupling, estimation accuracy, and robustness to noise (lower SNR threshold).
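The joint dependence of θ and ϕ on u and v can be made explicit with a short Python sketch. It assumes the common convention u = sin θ cos ϕ and v = sin θ sin ϕ, with θ measured from the z-axis; this convention is not restated in this section, so it should be checked against the signal model before reuse. The function names are illustrative.

```python
import math


def direction_cosines(theta_deg: float, phi_deg: float):
    """Return (u, v) under the assumed convention u = sin(theta)cos(phi), v = sin(theta)sin(phi)."""
    t, p = math.radians(theta_deg), math.radians(phi_deg)
    return math.sin(t) * math.cos(p), math.sin(t) * math.sin(p)


def angles_from_direction_cosines(u: float, v: float):
    """Return (theta, phi) in degrees for theta in [0, 90] under the same convention."""
    s = min(1.0, math.hypot(u, v))  # clip, since noisy estimates may slightly exceed 1
    theta = math.degrees(math.asin(s))
    phi = math.degrees(math.atan2(v, u))
    return theta, phi


# Round trip for the two source directions used in Section 4.1.
for theta, phi in [(42.0, 55.0), (35.0, 52.0)]:
    u, v = direction_cosines(theta, phi)
    print((theta, phi), "->", (round(u, 4), round(v, 4)),
          "->", angles_from_direction_cosines(u, v))
```

Because both angles are computed from the same pair (u, v), an ambiguity error in either direction cosine corrupts both θ and ϕ.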

Fig. 7 RMSE of θ and ϕ estimations versus SNR. (a) RMSE of θ estimations. (b) RMSE of ϕ estimations

4.2.2 Performance versus snapshot number

In the next example, we consider the performance of DOA estimation versus the snapshot number L. Figure 8a and b show the RMSEs of the θ and ϕ estimates of all array configurations versus L at SNR = 10 dB, respectively. We can see that the parameter estimation performance of the proposed array improves as the number of snapshots increases, and the proposed array again performs best among all array configurations.

Fig. 8 RMSE of θ and ϕ estimations versus L. (a) RMSE of θ estimations. (b) RMSE of ϕ estimations

4.2.3 Performance versus inter-sensor spacing

In the third example, we consider the performance of parameter estimation versus the inter-sensor spacings. We take one target as an example and set Δx,y=Δy,z in the SS-EMVS with SNR = 10 dB. The target has elevation θ=35°, azimuth ϕ=52°, auxiliary polarization angle γ=36°, and polarization phase difference η=80°. As mentioned in Section 3.4, there is a threshold of inter-sensor spacing. Figure 9 shows the RMSEs of ufine,0 and vfine,0 of the proposed array versus Δx,y. Using Eq. (31), the calculated threshold of Δy,z at SNR = 10 dB is \(\Delta _{y,z}^{t}=7.15\lambda \). From Fig. 9, the threshold of Δy,z is approximately 6λ. Thus, according to the obtained threshold and practical considerations, we set Δy,z=5λ. Applying the same method to Δx,y gives \(\Delta _{x,y}^{t}=8.04\lambda \), and the threshold of Δx,y observed in Fig. 9 is approximately 8λ. Therefore, the method derived in Section 3.4 for calculating the thresholds of the inter-sensor spacings is effective. Similar to Δy,z, we set Δx,y=5λ.
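As a quick consistency check, Eq. (31) can be inverted to see which standard deviations of the coarse estimates the calculated thresholds correspond to. The values below are implied by the quoted thresholds rather than reported simulation results, and direction cosines are dimensionless:

$$ \sigma=\frac{\lambda}{6\Delta^{t}}: \qquad \Delta_{y,z}^{t}=7.15\lambda \;\Rightarrow\; \sigma_{u^{\text{coarse}}}\approx\frac{1}{42.9}\approx 0.0233, \qquad \Delta_{x,y}^{t}=8.04\lambda \;\Rightarrow\; \sigma_{v^{\text{coarse}}}\approx\frac{1}{48.24}\approx 0.0207. $$

By the same relation, the selected spacing of 5λ tolerates a coarse standard deviation of up to 1/30 ≈ 0.0333 under the 3σ rule, which leaves some margin relative to both implied values.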

Fig. 9 Threshold of Δx,y and Δy,z

In the second simulation, we set D1=D3, and the RMSEs of ufine,1 and vfine,1 of the proposed array versus D1 are shown in Fig. 10. Similar to Δy,z, we can calculate the thresholds of D1 and D3. The only difference is that, as mentioned in Section 3.4, we use the CRBs of ufine,0 and vfine,0 instead of their RMSEs to calculate the thresholds. The calculated values are \(D_{1}^{t}=161.5\lambda \) and \(D_{3}^{t}=197.5\lambda \), whereas the thresholds observed in Fig. 10 are \(D_{1}^{t}=76\lambda \) and \(D_{3}^{t}=72\lambda \). Because the CRB is much smaller than the RMSE, D1 and D3 should be set much smaller than the calculated threshold values. We therefore set D1=D3=35λ.

Fig. 10 Threshold of D1 and D3

In the third simulation, we set D2=D4 and plot the RMSEs of ufinal and vfinal of the proposed array versus D2 in Fig. 11. Similar to D1, we can calculate the thresholds of D2 and D4 from the CRB, obtaining \(D_{2}^{t}=8047.6\lambda \) and \(D_{4}^{t}=7857.9\lambda \). From Fig. 11, the observed thresholds are \(D_{2}^{t}=2500\lambda \) and \(D_{4}^{t}=2300\lambda \). Again, D2 and D4 should be set much smaller than the calculated threshold values; considering practical applications, we set D2=D4=245λ.
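To put the gap between the CRB-based and the observed thresholds into numbers, the following Python sketch tabulates their ratios using only the values quoted in this subsection. The dictionary and function names are illustrative.

```python
# Threshold values quoted in the text, in units of the wavelength.
calculated = {"D1": 161.5, "D3": 197.5, "D2": 8047.6, "D4": 7857.9}  # from the CRB via Eq. (31)
observed = {"D1": 76.0, "D3": 72.0, "D2": 2500.0, "D4": 2300.0}      # read from Figs. 10 and 11
chosen = {"D1": 35.0, "D3": 35.0, "D2": 245.0, "D4": 245.0}          # spacings used in Section 4.1


def ratios(numerator, denominator):
    """Element-wise ratio of two dictionaries that share the same keys."""
    return {key: numerator[key] / denominator[key] for key in numerator}


overestimate = ratios(calculated, observed)  # how optimistic the CRB-based value is
margin = ratios(observed, chosen)            # safety margin of the selected spacing

for key in calculated:
    print(f"{key}: calculated/observed = {overestimate[key]:.2f}, "
          f"observed/chosen = {margin[key]:.2f}")
```

For these values, the CRB-based thresholds exceed the observed ones by a factor of roughly 2 to 3.5, while the selected spacings stay below the observed thresholds by a factor of about 2 for D1 and D3 and about 10 for D2 and D4.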

Fig. 11 Threshold of D2 and D4

4.2.4 Threshold of inter-sensor spacing versus SNR

We investigate the threshold of inter-sensor spacing versus SNR. Taking one target as an example, we set its elevation to θ=35°, azimuth to ϕ=52°, auxiliary polarization angle to γ=36°, and polarization phase difference to η=80°. Other simulation conditions remain the same as in Section 4.1. Figure 12 shows the thresholds of Δy,z and Δx,y versus SNR. It is seen that both thresholds increase as the SNR increases. The thresholds of {D1,D3} and {D2,D4} are shown in Figs. 13 and 14, respectively, and exhibit a similar trend.

Fig. 12 Threshold of Δy,z and Δx,y versus SNR

Fig. 13 Threshold of D3 and D1 versus SNR

Fig. 14 Threshold of D4 and D2 versus SNR

4.2.5 Threshold of SNR versus arriving angle

Lastly, we consider the SNR threshold of the disambiguation process versus the signal arrival angle. Taking one target as an example, we set its auxiliary polarization angle to γ=36° and its polarization phase difference to η=80°. When analyzing one angle, we fix the other at 45°. Other simulation conditions remain the same as in Section 4.1. Figure 15 shows the SNR thresholds of u and v versus θ and ϕ. We can see that the SNR threshold is approximately symmetric about 90° for θ and about 0° for ϕ. As we set θ∈[0,π/2] and ϕ∈[0,π/2), the SNR threshold lies in a lower range when the target is located in θ∈[20°,70°] and ϕ∈[20°,70°].

Fig. 15 Threshold of SNR of u and v versus angle. (a) Threshold of SNR versus θ. (b) Threshold of SNR versus ϕ

5 Conclusions

In this paper, a new array configuration composed of multiple sparse scalar arrays and a single triangular electromagnetic-vector-sensor is proposed, which combines the advantages of both the spatially spread electromagnetic-vector-sensor (SS-EMVS) and the sparse array. The new array provides four direction cosine estimates with gradually improved accuracies along each of the x-axis and y-axis. Based on this, we developed a direction-of-arrival estimation algorithm that uses a three-order disambiguation approach. We analyzed the thresholds of the inter-sensor spacings in the four uniform scalar sub-arrays and conducted extensive simulations to validate them. We also compared the direction cosine estimation performance of our array with that of the 2D nested scalar array and the linear multi-scale SS-EMVS array. The results demonstrate that the proposed array geometry offers a good trade-off among estimation accuracy, mutual coupling, and robustness to noise. Moreover, since only a single SS-EMVS is used together with scalar sensors, the proposed array achieves good performance with small redundancy, fewer elements, and low cost.