A method for detecting continuous gravitational wave signals from an ensemble of known pulsars

M Buono; R De Rosa; L D’Onofrio; L Errico; C Palomba; O J Piccinni; V Sequino

doi:10.1088/1361-6382/abf1c0

1. Introduction

Gravitational wave astronomy began on September 14th, 2015 when LIGO Hanford (LHO) and LIGO Livingston observatories (LLO) detected the first gravitational wave signal—GW150914—produced by the merger of two black holes [1]. GW150914 demonstrated the existence of binary stellar-mass black hole systems and was also the first observation of a binary black hole merger [2]. The first confirmed multi-messenger counterpart to a gravitational wave observation came instead with GW170817 [3]. GW170817 originated from a binary neutron star coalescence which was accompanied by detections across the electromagnetic spectrum.

So far, detected signals come from coalescences of compact objects and are transient signals confined in a short time window. Continuous gravitational waves (CWs) are a different class of signals, potentially observable by the LIGO and Virgo detectors.

Isolated spinning neutron stars with asymmetric mass distribution, with respect to their rotation axis, are possible sources of CWs. The gravitational signal from this type of astrophysical sources is periodic and the frequency of the signal is linked to the source's rotational frequency. If the neutron star is a pulsar, accurate ephemerides may be available from electromagnetic observations. Hence, matched filtering techniques can be employed using waveform templates that cover the entire observing period.

Recently, the LIGO–Virgo collaboration presented a search [4] for gravitational waves from 221 pulsars with rotational frequencies above 10 Hz using Advanced LIGO data from its first and second observing runs. This search found no evidence for gravitational wave emission from any pulsar at either the rotation frequency or twice the rotation frequency. For 22 pulsars, the experimental upper limits on the amplitude were below the theoretical spin-down limits. The analysis of LIGO and Virgo data from the O3a run (April 1st, 2019–October 1st, 2019) made it possible, for the first time, to beat the spin-down limit for a millisecond pulsar [5].

In this context, this work aims to improve the detection efficiency for CWs by combining information from several weak sources, such as known pulsars, that may not be individually detectable.

This search assumes that the gravitational frequency is exactly twice the star's rotational frequency. Methods to detect gravitational waves from an ensemble of known pulsars have already been presented by Fan, Chen and Messenger, combining F-statistic values [6], and by Pitkin, Fan and Messenger, using a hierarchical Bayesian method [7].

In this paper, we define a new statistic t for the ensemble as the linear combination of the single pulsar statistics, expressed through the five-vector method [8]. The resulting search pipeline is computationally very efficient. It can be extended in a straightforward way to narrow-band searches, where the assumption that the CW emission frequency is exactly twice the star's rotational frequency is relaxed. The pipeline is based on the band-sampled data (BSD) format [9], which significantly reduces the computational cost compared to other ensemble methods.

This paper is organized as follows. In section 2, we provide a brief background on the CW signal and the five-vector method. In section 3, we introduce the new ensemble statistic. In section 4, through Monte Carlo simulations, we characterize the ensemble statistic, find a suitable analytical approximation and compute receiver operating characteristics (ROC) curves. In section 5, we test the performance of the analysis method by considering several ensembles containing a different number of simulated signals and by varying the signal amplitude. In section 6, we compare and test three reasonable criteria to rank pulsars in the ensemble using simulated signals and O2 data [10]. Conclusions are presented in section 7.

2. The gravitational wave signal

The CW signal emitted by a triaxial neutron star, rotating at frequency f_rot around one of its principal axes of inertia, can be written at the detector as [11]:

$\begin{equation}h\left(t\right)=\frac{1}{2}{h}_{0}\left(1+{\mathrm{cos}}^{2}\iota \right)\mathrm{cos}\enspace {\Phi}\left(t\right){F}_{+}\left(t,\psi \right)+{h}_{0}\enspace \mathrm{cos}\enspace \iota \enspace \mathrm{sin}\enspace {\Phi}\left(t\right){F}_{{\times}}\left(t,\psi \right),\end{equation} \tag{ 1 }$

where F₊(t, ψ) and F_×(t, ψ) are the two detector beam-pattern functions, which depend on the polarization angle ψ. ι is the angle between the star rotation axis and the line of sight and

$\begin{equation}{h}_{0}=\frac{16{\pi }^{2}G}{{c}^{4}}\frac{I{\epsilon}{f}_{\mathrm{r}\mathrm{o}\mathrm{t}}^{2}}{d},\end{equation} \tag{ 2 }$

is the gravitational wave amplitude. I is the star moment of inertia with respect to the rotation axis, ε the equatorial ellipticity and d the distance from the Earth.
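As a quick numerical illustration of equation (2), the following minimal Python sketch evaluates h₀ in SI units. The function name `strain_amplitude` and the input values (I = 10³⁸ kg m², ε = 10⁻⁶, f_rot = 100 Hz, d = 1 kpc) are illustrative assumptions, not values taken from this work.

```python
import math

G = 6.67430e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8     # speed of light, m/s
KPC = 3.0857e19      # one kiloparsec in metres

def strain_amplitude(inertia, ellipticity, f_rot, distance):
    """Equation (2): h0 = 16 pi^2 G I eps f_rot^2 / (c^4 d), all in SI units."""
    return 16 * math.pi**2 * G * inertia * ellipticity * f_rot**2 / (C**4 * distance)

# Illustrative (assumed) source parameters, not taken from the paper.
h0_example = strain_amplitude(1e38, 1e-6, 100.0, 1.0 * KPC)
print(f"h0 = {h0_example:.2e}")
```

The quadratic dependence on f_rot means that doubling the rotation frequency quadruples the amplitude, all else being equal.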

Φ(t), in equation (1), defines the gravitational wave phase evolution of the star, inferred from electromagnetic observations (typically in the radio, gamma or x-ray band). The signal phase is affected by the Doppler effect due to the detector motion, the intrinsic source spin-down and relativistic effects, like the Einstein delay.

In a different formalism, introduced in [8], and used in several targeted and narrow-band searches of CWs from known pulsars, e.g. [4, 5, 12–14], the emitted waveform is described in terms of ψ and η, the ratio of the polarization ellipse semi-minor to semi-major axes. η varies in the range [−1, 1] (η = 0 for a linearly polarized wave and η = ±1 for a circularly polarized wave) and is linked to ι by the following relation:

$\begin{equation}\eta =-\frac{2\enspace \mathrm{cos}\enspace \iota }{1+{\mathrm{cos}}^{2}\iota }.\end{equation} \tag{ 3 }$

In this formalism, the complex form of equation (1) can then be expressed as:

$\begin{equation}h\left(t\right)={H}_{0}\left({H}_{+}{A}_{+}\left(t\right)+{H}_{{\times}}{A}_{{\times}}\left(t\right)\right){\mathrm{e}}^{\mathrm{i}{\Phi}\left(t\right)},\end{equation} \tag{ 4 }$

where

$\begin{equation}{H}_{+}=\frac{\mathrm{cos}\left(2\psi \right)-\mathrm{i}\eta \enspace \mathrm{sin}\left(2\psi \right)}{\sqrt{1+{\eta }^{2}}}\hspace{25.0pt}{H}_{{\times}}=\frac{\mathrm{sin}\left(2\psi \right)+\mathrm{i}\eta \enspace \mathrm{cos}\left(2\psi \right)}{\sqrt{1+{\eta }^{2}}},\end{equation} \tag{ 5 }$

$\begin{equation}{H}_{0}={h}_{0}\sqrt{\frac{1+6\enspace {\mathrm{cos}}^{2}\iota +{\mathrm{cos}}^{4}\iota }{4}}.\end{equation} \tag{ 6 }$
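Equations (3), (5) and (6) can be checked numerically with a short sketch such as the one below. The function names are hypothetical and angles are assumed to be in radians; the check that |H₊|² + |H_×|² = 1 follows directly from (5).

```python
import numpy as np

def eta_from_iota(iota):
    """Equation (3): polarization ellipse axis ratio from the inclination."""
    return -2 * np.cos(iota) / (1 + np.cos(iota) ** 2)

def polarization_amplitudes(eta, psi):
    """Equation (5): complex amplitudes H_plus and H_cross."""
    norm = np.sqrt(1 + eta**2)
    h_plus = (np.cos(2 * psi) - 1j * eta * np.sin(2 * psi)) / norm
    h_cross = (np.sin(2 * psi) + 1j * eta * np.cos(2 * psi)) / norm
    return h_plus, h_cross

def big_h0(h0, iota):
    """Equation (6): amplitude H0 in the five-vector formalism."""
    c2 = np.cos(iota) ** 2
    return h0 * np.sqrt((1 + 6 * c2 + c2**2) / 4)

# Example with arbitrary (assumed) angles.
eta = eta_from_iota(np.pi / 3)
h_plus, h_cross = polarization_amplitudes(eta, psi=0.4)
print(abs(h_plus) ** 2 + abs(h_cross) ** 2)
```

For ι = 0 the wave is circularly polarized (η = −1) and H₀ = √2 h₀, while for ι = π/2 it is linearly polarized (η = 0) and H₀ = h₀/2.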

A₊ and A_× are periodic functions of the Earth sidereal angular frequency Ω_⊕. This dependence is the basis of the five-vector method, which we use in this work.

After correcting for the Earth-induced Doppler effect, the Einstein delay and the source spin-down (using heterodyne demodulation [9]), the signal at the detector becomes monochromatic apart from an amplitude and phase sidereal modulation:

$\begin{equation}h\left(t\right)={H}_{0}\left({H}_{+}{A}_{+}\left(t\right)+{H}_{{\times}}{A}_{{\times}}\left(t\right)\right){\mathrm{e}}^{\mathrm{i}\left({\omega }_{0}t+\gamma \right)}.\end{equation} \tag{ 7 }$

This expression is the product of a faster periodic term, with frequency f_gw = ω₀/2π = 2f_rot and phase γ, and a slower term given by a linear combination of sines and cosines with arguments Ω_⊕ and 2Ω_⊕ [8].

Then, the signal at the detector is completely described by its Fourier components at the five angular frequencies ω₀, ω₀ ± Ω_⊕, ω₀ ± 2Ω_⊕, which constitute the signal five-vector. For a generic time series x(t), the five-vector is defined as

$\begin{equation}\mathbf{X}={\int }_{{T}_{\mathrm{o}\mathrm{b}\mathrm{s}}}x\left(t\right){\mathrm{e}}^{-\mathrm{i}\mathbf{k}{{\Omega}}_{\oplus }t}{\mathrm{e}}^{-\mathrm{i}{\omega }_{0}t}\mathrm{d}t\quad \text{with}\quad \mathbf{k}=\left(0,{\pm}1,{\pm}2\right),\end{equation} \tag{ 8 }$

where T_obs is the observation time. The method consists of computing two matched filters, in the frequency domain, between the data five-vector, X, and the signal template five-vectors A₊, A_×:

$\begin{equation}{\hat{H}}_{+}=\frac{\mathbf{X}\cdot {\mathbf{A}}_{+}}{\vert {\mathbf{A}}_{+}{\vert }^{2}}\quad \text{and}\quad {\hat{H}}_{{\times}}=\frac{\mathbf{X}\cdot {\mathbf{A}}_{{\times}}}{\vert {\mathbf{A}}_{{\times}}{\vert }^{2}}.\end{equation} \tag{ 9 }$

A₊ and A_× are obtained applying the definition (8) by replacing the time series x(t) with the two functions A₊(t) and A_×(t). The signal template five-vectors depend exclusively on known parameters (source position, detector position and sidereal response of the detector).

It can be shown [8] that the two quantities in (9) are estimators of H₀e^iγ H₊ and H₀e^iγ H_×, respectively, and that they can be used to estimate the unknown parameters (H₀, ψ, η).
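The five-vector definition (8) and the matched filters (9) can be illustrated with the following minimal NumPy sketch. It is a toy example, not the pipeline used in this work. The templates `a_plus_t` and `a_cross_t` are hypothetical combinations of sidereal harmonics chosen to be mutually orthogonal, the data are noise-free and already heterodyned (ω₀ = 0), and the scalar product X·A is assumed to be Σ_k X_k A_k*.

```python
import numpy as np

OMEGA_EARTH = 2 * np.pi / 86164.0905   # Earth sidereal angular frequency, rad/s
K = np.array([-2, -1, 0, 1, 2])        # harmonics entering the five-vector

def five_vector(x, t, omega0=0.0):
    """Discrete version of equation (8) on a uniform time grid t."""
    dt = t[1] - t[0]
    kernel = np.exp(-1j * (omega0 + K[:, None] * OMEGA_EARTH) * t[None, :])
    return (kernel * x[None, :]).sum(axis=1) * dt

def matched_filter(X, A):
    """Equation (9), with X.A taken as sum_k X_k conj(A_k) (assumed convention)."""
    return np.vdot(A, X) / np.vdot(A, A).real

# Uniform grid spanning an integer number of sidereal days.
n_days, n_samples = 10, 20000
t = np.arange(n_samples) * (n_days * 2 * np.pi / OMEGA_EARTH / n_samples)

# Toy templates (hypothetical); real ones follow from the detector response.
a_plus_t = 0.3 + 0.5 * np.cos(OMEGA_EARTH * t) + 0.2 * np.cos(2 * OMEGA_EARTH * t)
a_cross_t = 0.4 * np.sin(OMEGA_EARTH * t) + 0.6 * np.sin(2 * OMEGA_EARTH * t)
A_plus, A_cross = five_vector(a_plus_t, t), five_vector(a_cross_t, t)

# Noise-free signal with known complex amplitudes.
amp_plus, amp_cross = 0.8 - 0.1j, 0.2 + 0.5j
X = five_vector(amp_plus * a_plus_t + amp_cross * a_cross_t, t)

est_plus = matched_filter(X, A_plus)
est_cross = matched_filter(X, A_cross)
print(est_plus, est_cross)
```

Because the toy templates are orthogonal, the two estimators recover the injected complex amplitudes; with noise added they would fluctuate around those values.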

3. Single signal and ensemble statistics

The detection statistic for a single pulsar can be chosen as a linear combination of the squared moduli of the two complex estimators in (9):

$\begin{equation}S={c}_{+}\vert {\hat{H}}_{+}{\vert }^{2}+{c}_{{\times}}\vert {\hat{H}}_{{\times}}{\vert }^{2}\equiv {c}_{+}{S}_{+}+{c}_{{\times}}{S}_{{\times}},\end{equation} \tag{ 10 }$

where the S₊ and S_× distributions are known (see [15]). Assuming Gaussian noise with zero mean value and variance σ², we study the coefficients c₊ and c_× maximizing the critical ratio (CR) defined as:

$\begin{equation}\mathrm{C}\mathrm{R}=\frac{{\left({\mu }_{\mathrm{s}\mathrm{i}\mathrm{g}}-{\mu }_{\mathrm{n}}\right)}^{2}}{{{\Theta}}_{\mathrm{n}}^{2}},\end{equation} \tag{ 11 }$

where μ_sig is the mean of the distribution of the statistic S when a signal is present in the data, while μ_n and ${{\Theta}}_{\mathrm{n}}^{2}$ are respectively the mean and variance of S when there is noise only.

Maximizing the CR with respect to the two coefficients, the resulting equations are not independent (see appendix A). There is a straight line in the plane (c₊, c_×) where the function CR has the maximum value. A possible choice that satisfies this condition is:

$\begin{equation}{\bar{c}}_{+}=\vert {H}_{+}{\vert }^{2}\vert {\mathbf{A}}_{+}{\vert }^{4}\quad \text{and}\quad {\bar{c}}_{{\times}}=\vert {H}_{{\times}}{\vert }^{2}\vert {\mathbf{A}}_{{\times}}{\vert }^{4}.\end{equation} \tag{ 12 }$

To construct a detection statistic for an ensemble of signals, we introduce the following general expression:

$\begin{equation}t=\sum\limits _{i=1}^{{N}_{\mathrm{s}}}\enspace {a}_{i}{S}_{i},\end{equation} \tag{ 13 }$

where N_s is the number of considered signals and S_i is the single statistic with coefficients in (12). As in the previous case, it is possible to infer the theoretically best choice for the coefficient a_i of the ensemble statistic t, considering the CR:

$\begin{equation}\mathrm{C}\mathrm{R}=\frac{{\left[{\sum }_{j=1}^{{N}_{\mathrm{s}}}\enspace {a}_{j}\left({\mu }_{\mathrm{s}\mathrm{i}\mathrm{g},j}-{\mu }_{\mathrm{n},j}\right)\right]}^{2}}{{\sum }_{i=1}^{{N}_{\mathrm{s}}}\enspace {a}_{i}^{2}{{\Theta}}_{\mathrm{n},i}^{2}}\equiv \frac{{\left[{\sum }_{j=1}^{{N}_{\mathrm{s}}}\enspace {a}_{j}{\lambda }_{j}\right]}^{2}}{{\sum }_{i=1}^{{N}_{\mathrm{s}}}\enspace {a}_{i}^{2}{{\Theta}}_{\mathrm{n},i}^{\enspace 2}}.\end{equation} \tag{ 14 }$

Maximizing the CR as a function of the coefficients, there is a hyperplane where the function $\mathrm{C}\mathrm{R}\left({a}_{1},\dots ,{a}_{{N}_{\mathrm{s}}}\right)$ has the maximum value. A particularly simple choice of the coefficients, that maximizes the CR, is:

$\begin{equation}{\overline{a}}_{k}=\frac{{\lambda }_{k}}{{{\Theta}}_{\mathrm{n},k}^{\enspace 2}}=\frac{{H}_{0,k}^{2}}{{\sigma }_{k}^{4}\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}^{2}},\end{equation} \tag{ 15 }$

where ${\sigma }_{k}^{2}$ is the variance of the Gaussian data distribution around the expected frequency for the kth pulsar.
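The maximization of (14) can be verified numerically. The sketch below uses made-up values for λ_k and Θ²_n,k (assumptions for illustration only) and compares the weights a_k = λ_k/Θ²_n,k from the first equality in (15) with equal weights. By the Cauchy–Schwarz inequality, the maximum value of the CR is Σ_k λ_k²/Θ²_n,k.

```python
import numpy as np

def critical_ratio(a, lam, theta2):
    """Equation (14): squared weighted signal offset over weighted noise variance."""
    a, lam, theta2 = (np.asarray(v, dtype=float) for v in (a, lam, theta2))
    return np.sum(a * lam) ** 2 / np.sum(a**2 * theta2)

# Hypothetical per-pulsar values of lambda_k and Theta_{n,k}^2.
lam = np.array([0.5, 1.2, 0.1, 0.8])
theta2 = np.array([1.0, 2.0, 0.5, 4.0])

a_opt = lam / theta2                               # equation (15), first equality
cr_opt = critical_ratio(a_opt, lam, theta2)        # equals sum(lam**2 / theta2)
cr_equal = critical_ratio(np.ones_like(lam), lam, theta2)
print(cr_opt, cr_equal)
```

The CR is unchanged when all coefficients are multiplied by a common factor, so only the relative weights matter.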

We will indicate by $\bar{S}$ and $\bar{t}$ , the statistic of single signal and ensemble with coefficients respectively in (12) and (15).

For a real signal, ${\bar{c}}_{+},{\bar{c}}_{{\times}}$ and ${\bar{a}}_{k}$ are the theoretical coefficients that maximize the CR, but they are related to unknown quantities. Indeed, |H_+,×|² depend on the generally unknown polarization parameters ψ, η and the amplitude H₀ is also unknown.

For this reason, we need approximate expressions, calculable from the data, for the coefficients in (12) and (15).

4. Approximate expressions for $\bar{S}$ and $\bar{t}$ coefficients

To compare different possible choices for the coefficients, we construct the statistic distributions and analyze the ROC curves. ROC curves for $\bar{S}$ and $\bar{t}$ are theoretical limits that we want to approach as much as possible with an appropriate choice for the coefficients.
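For reference, an empirical ROC curve can be built from two sets of statistic values, one for noise only and one with signals, as in the minimal sketch below. The function name `roc_curve` and the toy samples are assumptions for illustration; in this work the distributions come from software injections and from the Gamma approximation described later in this section.

```python
import numpy as np

def roc_curve(noise_samples, signal_samples):
    """Return (false alarm probability, detection probability) over all thresholds."""
    noise = np.sort(np.asarray(noise_samples, dtype=float))
    signal = np.sort(np.asarray(signal_samples, dtype=float))
    # Scan thresholds from the largest observed value to the smallest.
    thresholds = np.unique(np.concatenate([noise, signal]))[::-1]
    # Fraction of samples greater than or equal to each threshold.
    p_fa = (noise.size - np.searchsorted(noise, thresholds, side="left")) / noise.size
    p_d = (signal.size - np.searchsorted(signal, thresholds, side="left")) / signal.size
    return p_fa, p_d

# Toy samples (assumed), only to show the call.
p_fa, p_d = roc_curve([0, 1, 2, 3], [2, 3, 4, 5])
print(p_fa, p_d)
```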

We build the statistic distributions using software injections of simulated pulsar signals. For each signal, we fix η, ψ and the injected amplitude H, defined as:

$\begin{equation}H=\alpha \cdot {h}_{\mathrm{m}\mathrm{i}\mathrm{n}},\end{equation} \tag{ 16 }$

with 0 < α < 1, where h_min is the minimum detectable strain. In this way, the simulated pulsars cannot be detected individually since injected amplitudes are below the minimum detectable value. For a targeted search [16]:

$\begin{equation}{h}_{\mathrm{m}\mathrm{i}\mathrm{n}}\approx 11\sqrt{\frac{S\left({f}_{\mathrm{g}\mathrm{w}}\right)}{{T}_{\mathrm{o}\mathrm{b}\mathrm{s}}}},\end{equation} \tag{ 17 }$

where S(f_gw) is the one-sided power spectral density at the expected signal frequency f_gw. The minimum detectable strain is defined as the minimum signal amplitude detectable over the observation time T_obs, assuming a false alarm probability of 1% and a detection probability of 90%.
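Equations (16) and (17) translate into a few lines of Python. In this sketch the function names and the example power spectral density (10⁻⁴⁶ Hz⁻¹) and observation time (10⁶ s) are illustrative assumptions.

```python
import math

def minimum_detectable_strain(psd, t_obs):
    """Equation (17): h_min ~ 11 * sqrt(S(f_gw) / T_obs), psd in 1/Hz, t_obs in s."""
    return 11.0 * math.sqrt(psd / t_obs)

def injected_amplitude(alpha, psd, t_obs):
    """Equation (16): H = alpha * h_min with 0 < alpha < 1."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in the open interval (0, 1)")
    return alpha * minimum_detectable_strain(psd, t_obs)

# Assumed values, for illustration only.
h_min_example = minimum_detectable_strain(1e-46, 1e6)
H_example = injected_amplitude(0.5, 1e-46, 1e6)
print(h_min_example, H_example)
```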

The detection statistic distribution can be built from the data itself considering a range of off-source frequencies, close but different from the one where the signal is expected [15]. In this way, we can use the same data to simulate a different noise background for each fixed signal. In these off-source frequencies, we inject a signal with the same amplitude and the same selected source parameters. Since software injections provide the expected Doppler and Einstein delay and the spin-down frequency variation to the signal, it is necessary to make the appropriate corrections for each simulation (that is for each injection). Instead, in the case of noise, it suffices to evaluate the statistic at the selected off-source frequencies without signal injections.

The software injections are performed using a collection of BSD files [9], each covering eight months of LIGO Livingston (LLO) data (from 4 January 2017 to the end of the O2 run) and a 10 Hz frequency band.

To compute ROC curves, we need to build the distribution of S in the presence of signals. We consider three possible pairs for the coefficients c_+/×: c_+/× = 1/2; c_+/× = |A_+/×|⁴; c_+/× = |A_+/×|². In all cases, the signal distribution is a weighted sum of non-central χ² random variables, and in general, there is no simple analytical expression for this distribution.

Empirically, we find that a Gamma distribution can fit the experimental distribution to a good approximation for all the analyzed combinations of c₊ and c_×. The shape and scale parameters of this Gamma distribution are inferred from the mean and variance of S.
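One natural way to infer the Gamma parameters from the mean and variance is the method of moments, sketched below; whether this matches the exact procedure used here is an assumption. The toy samples are drawn as a weighted sum of two non-central χ² variables with two degrees of freedom each, with arbitrary weights and non-centrality parameters.

```python
import numpy as np

def gamma_from_moments(mean, variance):
    """Method-of-moments Gamma parameters: mean = k * theta, variance = k * theta**2."""
    shape = mean**2 / variance
    scale = variance / mean
    return shape, scale

# Toy signal distribution (assumed weights and non-centralities).
rng = np.random.default_rng(0)
samples = (1.0 * rng.noncentral_chisquare(2, 3.0, 50000)
           + 0.5 * rng.noncentral_chisquare(2, 1.0, 50000))

shape, scale = gamma_from_moments(samples.mean(), samples.var())
print(shape, scale)
```

The resulting Gamma density can then be overlaid on the histogram of the samples to judge the quality of the approximation.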

This approximation does not depend on the fixed parameters (source parameters, η, ψ) or on the injected amplitude H. We notice that the approximation gets slightly worse when the spin-down value ${\dot {f}}_{\hspace{-2.5pt}\mathrm{g}\mathrm{w}}$ increases while the other parameters are left unchanged (figure 1). In any case, high spin-down values (∼−10⁻⁹ Hz s⁻¹ or more) are not usual for known pulsars.

**Figure 1.** Experimental signal distribution (blue histogram) for the single pulsar statistic with c_+/× = |A_+/×|⁴ for two different spin-down values ( ${\dot {f}}_{\hspace{-2.5pt}\mathrm{g}\mathrm{w}}=-7{\times}1{0}^{-9}\enspace \mathrm{H}\mathrm{z}\enspace {\mathrm{s}}^{-1}$ for the left plot, ${\dot {f}}_{\hspace{-2.5pt}\mathrm{g}\mathrm{w}}=-7{\times}1{0}^{-11}\enspace \mathrm{H}\mathrm{z}\enspace {\mathrm{s}}^{-1}$ for the right plot) and leaving other parameters unchanged. The black line is the fitting Gamma distribution.

For the single pulsar statistic S, in agreement with [8], the analysis of the ROC curves with different combinations of c₊ and c_× confirms that the best choice is:

$\begin{equation}{c}_{+}=\vert {\mathbf{A}}_{+}{\vert }^{4}\quad \text{and}\quad {c}_{{\times}}=\vert {\mathbf{A}}_{{\times}}{\vert }^{4}.\end{equation} \tag{ 18 }$
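With the coefficients in (18), the single pulsar statistic (10) reduces to S = |X·A₊|² + |X·A_×|², since |A|⁴|Ĥ|² = |X·A|². The sketch below checks this identity on random five-component vectors; the function name and the convention X·A = Σ_k X_k A_k* are assumptions.

```python
import numpy as np

def single_pulsar_statistic(X, A_plus, A_cross):
    """Equation (10) with the coefficients of equation (18)."""
    norm_p = np.vdot(A_plus, A_plus).real        # |A_plus|^2
    norm_c = np.vdot(A_cross, A_cross).real      # |A_cross|^2
    h_plus = np.vdot(A_plus, X) / norm_p         # equation (9)
    h_cross = np.vdot(A_cross, X) / norm_c
    return norm_p**2 * abs(h_plus) ** 2 + norm_c**2 * abs(h_cross) ** 2

# Random complex five-vectors, for illustration only.
rng = np.random.default_rng(42)
X, A_plus, A_cross = (rng.normal(size=5) + 1j * rng.normal(size=5) for _ in range(3))
S = single_pulsar_statistic(X, A_plus, A_cross)
print(S)
```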

For the ensemble statistic t, we analyze three choices for the coefficients a_i in (13): one that takes into account the sensitivity of the detector, ${a}_{i}={S}^{-1}\left({f}_{\mathrm{g}\mathrm{w},i}\right)\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}^{-1}\enspace$ ; one based on the squared sensitivity, ${a}_{i}={S}^{-2}\left({f}_{\mathrm{g}\mathrm{w},i}\right)\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}^{-1}\enspace$ ; and one that includes both the sensitivity and the sidereal response of the detector, ${a}_{i}={\left(\vert {\mathbf{A}}_{+,i}{\vert }^{2}+\vert {\mathbf{A}}_{{\times},i}{\vert }^{2}\right)}^{2}\cdot {S}^{-1}\left({f}_{\mathrm{g}\mathrm{w},i}\right)\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}^{-1}\enspace$ .

For the signal distribution of t, the Gamma distribution is again a good approximation for all the selected combinations of a_i (see [17] and figure 2 for a particular case).

**Figure 2.** Ensemble t statistic distributions with ${a}_{i}={S}^{-1}\left({f}_{\mathrm{g}\mathrm{w},i}\right)\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}^{-1}$ for the four simulated pulsars in table B1. For the noise distribution, the black line is the expected theoretical distribution. For the signal distribution, the blue line is the gamma distribution with the expected parameters.

By analyzing many different sets of simulated pulsars and varying the injected amplitudes, we find that the best choice of the coefficients is:

$\begin{equation}{a}_{i}=\frac{1}{S\left({f}_{\mathrm{g}\mathrm{w},i}\right)\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}}.\end{equation} \tag{ 19 }$

This choice guarantees the ROC curve closest to the ROC curve of the theoretical statistic $\bar{t}$ . Since the coefficients in (19) do not depend on the source and signal parameters, this is the best choice for every value of η and ψ.

An example is shown in figure 3, obtained by considering the ensemble of signals with parameters given in table B2 and considering Gaussian noise. The left plot shows the detection probability as a function of the false alarm probability for the whole set of simulated signals. The right plot shows the detection probability as a function of the number of pulsars in the ensemble, ranked by decreasing values of the parameter α. For all the choices of the coefficients a_i, the detection probability increases with an increasing number of signals in the ensemble up to a maximum and then starts to decrease. As discussed in the next section, this is because we are adding smaller and smaller signals which do not contribute to the ensemble signal but, rather, to the noise.

Measurement errors on the pulsar parameters, like distance or position, could affect the level of the dashed lines in figure 3. Nevertheless, since we cannot use the coefficients in (15) in real analysis, we do not expect a strong impact from source parameters uncertainties.

Table B1. Parameters of the software-injected signals used to evaluate the noise and signal distributions of the statistic t. The sky position of the injected sources is specified by the right ascension a and by the declination δ in degrees. The polarization angle ψ is also in degrees. The injected amplitude H is equal to H = α ⋅ h_min with 0 < α < 1, where h_min is the minimum detectable strain amplitude given by (17).

Inj. name	(a, δ)	f₀ (Hz)	${\dot {f}}_{0}\enspace \left(\mathrm{H}\mathrm{z}\enspace {\mathrm{s}}^{-1}\right)$	α	η	ψ
PulsarA	(178.37, −33.43)	106.71	0	0.50	0.28	42
PulsarB	(150.33, −10.76)	106.24	7 × 10⁻¹¹	0.50	0.44	22
PulsarC	(120.32, 10.45)	106.57	7 × 10⁻¹⁸	0.70	0.16	28
PulsarD	(90.33, 30.76)	106.83	7 × 10⁻¹¹	0.70	0.13	27

Table B2. Table of the 31 simulated signals parameters used in this work. h_sd is the spin-down amplitude [11, 14]. Source parameters (right ascension a, declination δ, distance, frequency and frequency evolution) are taken from the Australia Telescope National Facility (ATNF) Pulsar Catalogue [20, 21]. The injected amplitude H, equal to H = α ⋅ h_min with 0 < α < 1, satisfies the condition H < 0.50 ⋅ h_sd.

a	δ	f_gw (Hz)	h_sd	α
30^m	4°51'	411.06	3.90 × 10⁻²⁷	0.04
04^h37^m	47°15'	347.38	1.60 × 10⁻²⁶	0.10
05^h34^m	22°00'	59.33	5.00 × 10⁻²⁶	0.50
08^h35^m	45°10'	22.39	2.90 × 10⁻²⁵	0.50
09^h40^m	54°28'	22.84	1.20 × 10⁻²⁵	0.10
10^h28^m	58°19'	21.88	1.20 × 10⁻²⁵	0.10
11^h05^m	61°07'	31.65	6.00 × 10⁻²⁶	0.50
11^h12^m	61°03'	30.78	1.90 × 10⁻²⁶	0.10
13^h00^m	12°40'	321.62	5.80 × 10⁻²⁷	0.10
14^h20^m	60°48'	29.64	1.60 × 10⁻²⁵	0.50
15^h09^m	58°50'	22.49	6.70 × 10⁻²⁶	0.10
15^h31^m	56°10'	23.75	1.10 × 10⁻²⁵	0.10
15^h37^m	11°55'	52.76	6.10 × 10⁻²⁷	0.10
18^h09^m	19°17'	24.17	1.40 × 10⁻²⁵	0.50
18^h09^m	23°32'	13.62	4.43 × 10⁻²⁵	0.04
18^h13^m	12°46'	41.60	1.90 × 10⁻²⁵	0.50
18^h26^m	12°56'	18.14	6.92 × 10⁻²⁵	0.50
18^h28^m	11°01'	27.75	5.00 × 10⁻²⁶	0.10
18^h31^m	09°52'	29.73	7.70 × 10⁻²⁶	0.50
18^h33^m	08°27'	23.45	6.20 × 10⁻²⁶	0.10
18^h37^m	06°04'	20.77	8.90 × 10⁻²⁶	0.10
18^h41^m	01°30'	67.18	4.20 × 10⁻²⁷	0.04
18^h56^m	02°45'	24.72	6.90 × 10⁻²⁶	0.10
19^h13^m	10°11'	55.70	5.40 × 10⁻²⁶	0.50
19^h25^m	17°20'	26.43	3.10 × 10⁻²⁶	0.10
19^h28^m	17°46'	29.10	4.33 × 10⁻²⁶	0.10
19^h35^m	20°25'	24.96	8.10 × 10⁻²⁶	0.10
19^h52^m	32°52'	50.59	1.00 × 10⁻²⁵	0.50
20^h43^m	27°40'	20.80	6.30 × 10⁻²⁶	0.10
21^h24^m	33°58'	405.59	4.10 × 10⁻²⁷	0.04
22^h29^m	61°14'	38.71	3.30 × 10⁻²⁵	0.50

5. Scaling with the number and amplitude of signals

Using the set of pulsars in table B2, we analyze how the ensemble detection efficiency depends on the number of pulsars and on their injected amplitudes, that is on the chosen values of α. For this analysis, we consider just the LIGO Livingston detector.

First, we use the $\bar{t}$ statistic to check how the detection probability changes on theoretical ground. Then, we evaluate the performance of the statistic with the experimental coefficients in (19), that can be used in real analysis.

For this study, we use Gaussian noise for our simulations and the Gamma distribution approximation to the ensemble statistic. The approximation allows us to reduce the computational cost of the analysis, avoiding the need for software injections for each pulsar.

For each pulsar, we fix the value of α in the set {0.04, 0.1, 0.5}, provided that the resulting injected amplitude is below 50% of the theoretical spin-down limit for that pulsar even if the maximum value were used. We indicate, for example, by H4-M11-L4 the ensemble composed of 4 signals with 'high' α = 0.5, 11 signals with 'medium' α = 0.1 and 4 signals with 'low' α = 0.04.

The choice of just three values for the parameter α is arbitrary but sufficient to describe the main features, potentialities and limitations of the proposed analysis method. A more sophisticated approach based, for instance, on the use of a continuous probability distribution for the parameter α, would be more meaningful only apparently. In this case, the results would depend on the choice of the (unknown) distribution and a large number of signals should be considered with a consequent increase of the computational load.

As shown in figure 4 for $\bar{t}$ , the detection efficiency for a chosen false alarm probability increases when a larger set is considered, even if the added pulsars have 'low' amplitudes.

This happens because the coefficient a_i of $\bar{t}$ is proportional to the ith pulsar squared amplitude. In other terms, the single pulsar statistic is weighted by the injected amplitude.

In real cases, pulsars' amplitudes are unknown. For this reason, the choice in (19) does not take into account the real 'strength' of the signals.

Figure 5 shows that, in this case, an increasing number of pulsars does not always provide an improvement of the detection efficiency. For instance, whereas combining four pulsars with α = 0.5 implies an increase of almost 50% compared to the single pulsar case, adding pulsars with lower amplitude (α = 0.04, 0.1) reduces the detection probability.

In practice, if the signals in the ensemble are, individually, significantly below the detection level, the performance of the ensemble analysis is degraded (as in the case of figure 3-right). As will become clearer in the next section, the performance improves when more detectors are added.

6. Method for real analysis

In this section, we describe the procedure for applying the ensemble method to a set of real pulsars. In order to improve the detection probability for the ensemble method, we need a way to rank real pulsars trying to estimate their signal 'strength' (that in section 5 is fixed by the α factor in (16)).

Indeed, as discussed in section 5, the amplitude of the signals emitted by the individual sources in the ensemble plays a role in the detection efficiency. Consequently, we decide to run different analyses considering an increasing number of sources in order to maximize the detection probability. In practice, we cannot estimate the 'strength' of the expected pulsar signals since the gravitational wave parameters (η, ψ and H₀) are unknown. P-value and coherence are independent statistical parameters that could be used to rank the sources in the ensemble. Considering a multi-detector analysis, we evaluate three reasonable criteria (multi-detector coherence and two multi-detector p-value based criteria) described in the following. We test these criteria using simulated signals injected in O2 data, jointly considering both LLO and LHO data and the set of pulsars in table B2.

The first used criterion is the coherence defined in [8] as:

$\begin{equation}c=\frac{\vert \hat{h}\mathbf{A}{\vert }^{2}}{\vert \mathbf{X}{\vert }^{2}}\quad \text{with}\quad \mathbf{A}={\hat{H}}_{+}{\mathbf{A}}_{+}+{\hat{H}}_{{\times}}{\mathbf{A}}_{{\times}},\end{equation} \tag{ 20 }$

where $\hat{h}$ is the estimated complex amplitude $\hat{h}=\hat{{h}_{0}{\mathrm{e}}^{\mathrm{i}\gamma }}=\frac{\mathbf{X}\cdot \mathbf{A}}{\vert \mathbf{A}{\vert }^{2}}$ .

The coherence in (20) is a number between 0 and 1 that measures the resemblance between the shape of the expected signal and the data. In fact, it does not depend on scaling factors on the signal but just on its shape.
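The coherence in (20) can be computed directly from the data and template five-vectors, as in the sketch below. The function name, the convention X·A = Σ_k X_k A_k* and the toy orthogonal templates are assumptions; the example checks that noise-free data matching the template give c = 1 and that rescaling the data leaves c unchanged.

```python
import numpy as np

def coherence(X, A_plus, A_cross):
    """Equation (20): |h_hat A|^2 / |X|^2 with A built from the two estimators."""
    h_plus = np.vdot(A_plus, X) / np.vdot(A_plus, A_plus).real
    h_cross = np.vdot(A_cross, X) / np.vdot(A_cross, A_cross).real
    A = h_plus * A_plus + h_cross * A_cross
    norm_A = np.vdot(A, A).real          # zero only if X is orthogonal to both templates
    h_hat = np.vdot(A, X) / norm_A
    return abs(h_hat) ** 2 * norm_A / np.vdot(X, X).real

# Toy orthogonal templates (hypothetical).
A_plus = np.array([1, 1, 0, 0, 0], dtype=complex)
A_cross = np.array([0, 0, 1, 1j, 0], dtype=complex)

X_signal = 2.0 * A_plus + (1 - 1j) * A_cross      # noise-free data
c_signal = coherence(X_signal, A_plus, A_cross)

rng = np.random.default_rng(3)
X_noise = rng.normal(size=5) + 1j * rng.normal(size=5)
c_noise = coherence(X_noise, A_plus, A_cross)
print(c_signal, c_noise)
```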

As shown in figure 6-left in the case of noise, the probability of obtaining a value of coherence larger than a given value is a decreasing function of the number of datasets (or detectors). Figure 6-right shows the experimental coherence distribution, obtained injecting three sets of 2000 simulated signals in LLO data with, respectively, α = 0.04, 0.5, 1. It is clear that for α = 0.04, the distribution is not distinguishable from noise. Considering more detectors may help to obtain values of coherence less compatible with noise for a given signal amplitude.

An example is shown in figure 7, where we plot the distribution of the multi-detector coherence of the LHO–LLO network for different α values. In this case, at least qualitatively, already for α = 0.2 the distribution starts to be distinguishable from noise.

In our analysis, we use multi-detector coherence to estimate the signal 'strength'.

As further criteria, we also use the arithmetic μ_a and the geometric μ_g mean of p-values for single pulsar statistic in each detector (μ_a = (p_L + p_H)/2 and ${\mu }_{\mathrm{g}}=\sqrt{{p}_{\mathrm{L}}\cdot {p}_{\mathrm{H}}}$ where p_L and p_H are p-values for S in LLO and LHO respectively). Since detectors have different sensitivities, the geometric mean is chosen to give more weight to low p-values.

We rank sources in table B2 in three different ways using decreasing values of multi-detector coherence and increasing values of μ_a and μ_g.
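The two p-value based rankings can be sketched as follows. The function name and the p-values are made up for illustration; the example is chosen so that the arithmetic and geometric means produce different orderings.

```python
import numpy as np

def rank_by_p_values(p_l, p_h):
    """Return pulsar indices sorted by increasing arithmetic and geometric mean p-value."""
    p_l, p_h = np.asarray(p_l, dtype=float), np.asarray(p_h, dtype=float)
    mu_a = 0.5 * (p_l + p_h)          # arithmetic mean
    mu_g = np.sqrt(p_l * p_h)         # geometric mean
    return np.argsort(mu_a), np.argsort(mu_g)

# Hypothetical single pulsar p-values in LLO and LHO for three pulsars.
p_l = [0.01, 0.30, 0.20]
p_h = [0.64, 0.30, 0.05]
order_a, order_g = rank_by_p_values(p_l, p_h)
print(order_a, order_g)
```

In this example the first pulsar has a very low p-value in one detector only: the geometric mean ranks it first, while the arithmetic mean ranks it last. A ranking by decreasing multi-detector coherence can be obtained in the same way with `np.argsort` applied to the negated coherence values.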

Since LLO O2 data are rather non-stationary and non-Gaussian in the first months of the run at low frequencies [19], we use LLO data only from 13 April 2017 to the end of the O2 run for pulsars with expected GW frequency below 30 Hz. To compute the p-values of single pulsars, we use the theoretical noise distribution of S, based on the Gaussian assumption. Evidently, the robustness of this assumption depends on the pulsar gravitational wave frequency and the detector data.

We define a joint ensemble statistic as:

$\begin{equation}t\left(n\right)=\sum\limits _{i=1}^{n}\enspace \left({a}_{i,\mathrm{L}}\cdot {S}_{i,\mathrm{L}}+{a}_{i,\mathrm{H}}\cdot {S}_{i,\mathrm{H}}\right),\end{equation} \tag{ 21 }$

where the subscript i indicates the ith pulsar in the ensemble, a_i are the coefficients in (19), while the subscripts L, H indicate LLO and LHO detectors respectively.
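Given the single pulsar statistics in each detector and a ranking of the sources, the joint statistic (21) for every n can be obtained with a cumulative sum, as in this minimal sketch. The function names, argument layout and numerical values are assumptions for illustration; the observation time can be a scalar or a per-pulsar array, since different time spans may be used for different pulsars.

```python
import numpy as np

def weights(psd, t_obs):
    """Equation (19): a_i = 1 / (S(f_gw,i) * T_obs)."""
    return 1.0 / (np.asarray(psd, dtype=float) * t_obs)

def joint_ensemble_statistic(S_L, S_H, psd_L, psd_H, t_obs_L, t_obs_H, order):
    """Equation (21): t(n) for n = 1..N, with pulsars added in the given order."""
    terms = (weights(psd_L, t_obs_L) * np.asarray(S_L, dtype=float)
             + weights(psd_H, t_obs_H) * np.asarray(S_H, dtype=float))
    return np.cumsum(terms[np.asarray(order)])

# Toy numbers (assumed), three pulsars ranked as 2, 0, 1.
t_n = joint_ensemble_statistic(
    S_L=[1.0, 2.0, 3.0], S_H=[4.0, 5.0, 6.0],
    psd_L=[1.0, 2.0, 4.0], psd_H=[1.0, 2.0, 4.0],
    t_obs_L=1.0, t_obs_H=1.0, order=[2, 0, 1],
)
print(t_n)
```

Each entry of `t_n` would then be compared with the noise distribution of the corresponding n-pulsar statistic to obtain a p-value as a function of n.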

We consider three different cases to test the three ranking methods: a noise analysis with no injections and two ensembles, M21-L10 and H5-M20-L6, in which signal parameters are given in table B2. Results are shown in figures 8 and 9, where the p-value for the ensemble statistic t(n) is reported as a function of the number n of signals in the ensembles. In each case, the three indicators behave similarly, with the coherence that on average provides slightly larger p-values when smaller signals are added.

**Figure 8.** P-values computed using the analytical noise distribution of statistic t considering different sets of real pulsars (table B2) ranked by the three different criteria described in the text for LLO and LHO O2 data.

For the first ensemble, consisting of 'medium' and 'low' signals, see figure 9-left, the minimum p-value is ∼0.4%, so not fully consistent with noise compared to figure 8. If this result came out in real analysis, some follow-up steps would be taken in order to increase confidence in the detection. Indeed, a noise fluctuation or some detector artifacts could produce a low p-value for a single (or a few) pulsar in a given detector.

In the presence of even a small number of stronger signals, see figure 9-right, the resulting p-value of ensemble is inconsistent with noise for most of the subsets of sources within the ensemble.

In this paper, we do not try to assess the 'classical' sensitivity of the pipeline, in the form of a curve representing the minimum detectable strain amplitude as a function of frequency, because, as discussed before, this would require assuming an arbitrary model for the source parameter distribution. Instead, we are mainly interested in showing, with the limits and caveats discussed so far, that the chances of detection can significantly increase by considering an ensemble of weak, individually undetectable, signals.

7. Conclusion

In this paper, we propose a novel detection statistic for gravitational wave signals from an ensemble of known pulsars. This approach aims to improve the detection efficiency by combining single pulsar statistics based on the five-vector concept. The use of the five-vector formalism, especially in the context of the BSD framework, significantly reduces the computational cost of the analysis with respect to other CW ensemble analyses.

In contrast to [6, 7], the method described in this paper uses a frequentist approach. The significance of the ensemble analysis is expressed through a p-value, which is a measure of how compatible the data are with pure noise. The p-value of the ensemble is obtained by comparing the value of the detection statistic t found in the analysis with the theoretical noise distribution of t, computed assuming Gaussian noise.
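As an illustration of how such a p-value can be evaluated, the following minimal sketch estimates it by Monte Carlo rather than from the analytical noise distribution used in the paper. It assumes, as in appendix A, that in Gaussian noise each component entering t is exponentially distributed, and additionally that the components are independent. The function name `ensemble_p_value` and its arguments (`weights` for the overall coefficient multiplying each component, `means` for the noise-only mean of each component) are hypothetical and introduced only for this example.

```python
import math
import random


def ensemble_p_value(t_obs, weights, means, n_draws=100_000, seed=0):
    """Monte Carlo p-value for t = sum_i weights[i] * X_i.

    Assumption: under the noise-only hypothesis each X_i is an independent
    exponential variable with mean means[i].
    """
    rng = random.Random(seed)
    # random.expovariate takes the rate, i.e. the inverse of the mean.
    rates = [1.0 / m for m in means]
    exceed = 0
    for _ in range(n_draws):
        t_noise = sum(w * rng.expovariate(r) for w, r in zip(weights, rates))
        if t_noise >= t_obs:
            exceed += 1
    # Fraction of noise-only realisations at least as large as the observed t.
    return exceed / n_draws
```

For a single exponential component the estimate can be compared with the exact tail probability exp(-t_obs/mean), which gives a quick sanity check of the sampling. The resolution of the estimate is limited to roughly 1/n_draws, so very small p-values would require the analytical distribution.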

In this paper, we infer the theoretical optimal coefficients for the ensemble statistic. Using simulated pulsars to construct ROC curves, we find an approximate expression which takes into account the detector sensitivity at the expected signal frequency of each pulsar.

In agreement with previous work [6], the detection efficiency increases when a few 'strong' sources are combined and then decreases as more, weaker sources are added to the ensemble. We introduce three reasonable criteria to evaluate the signal strength and to select the most promising sources in a given set; the three criteria provide similar results. The whole analysis pipeline has been tested by injecting simulated signals both in Gaussian data and in LIGO O2 data, showing that it is able to detect the signal emitted by an ensemble of sources too weak to be individually detectable.

For the full analysis of detector data, we need to consider the possible frequency glitches for some of the pulsars in the ensemble. Moreover, we need to establish a reliable procedure to compute upper limits in the case of no detection. Once this is accomplished, we plan to analyze the most recent LIGO and Virgo runs.

Our method can be extended in a straightforward way to a multi-detector narrow-band search, in order to improve robustness with respect to uncertainties in pulsar parameters or to take into account a possible mismatch among electromagnetic and gravitational wave parameters.

It is important to highlight that the detection of a gravitational wave signal from an ensemble cannot provide information on the single pulsars' parameters. However, a detection would be clear evidence of the presence of CW sources in the Galaxy.

Data availability statement

The data that support the findings of this study are openly available at the following URL/DOI: https://gw-openscience.org/O2/.

Acknowledgments

This research has made use of data obtained from the Gravitational Wave Open Science Center (https://gw-openscience.org), a service of LIGO Laboratory, the LIGO Scientific Collaboration and the Virgo Collaboration. LIGO is funded by the US National Science Foundation. Virgo is funded by the French Centre National de Recherche Scientifique (CNRS), the Italian Istituto Nazionale della Fisica Nucleare (INFN) and the Dutch Nikhef, with contributions by Polish and Hungarian institutes. The authors would like to acknowledge the Amaldi Research Center for support.

Appendix A.: Theoretical coefficients

In the case of Gaussian noise with zero mean value and variance σ² and no signal, S₊ and S_× in (10) have exponential distributions with mean values [15]:

$\begin{equation}{\mu }_{+,{\times}}=\frac{{\sigma }^{2}\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}}{\vert {\mathbf{A}}^{+,{\times}}{\vert }^{2}}=\frac{2}{{k}_{+,{\times}}}\quad \text{with}\quad {k}_{+/{\times}}=2\cdot \frac{\vert {\mathbf{A}}^{+/{\times}}{\vert }^{2}}{{\sigma }^{2}\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}},\end{equation} \tag{ A.1 }$

where T_obs is the effective observation time, which takes into account the detector duty cycle. It follows that the mean and variance of S in equation (11) are:

$\begin{equation}{\mu }_{\mathrm{n}}={c}_{+}\frac{2}{{k}_{+}}+{c}_{{\times}}\frac{2}{{k}_{{\times}}}\hspace{25.0pt}{{\Theta}}_{\mathrm{n}}^{\enspace 2}={c}_{+}^{\enspace 2}\frac{4}{{k}_{+}^{\enspace 2}}+{c}_{{\times}}^{\enspace 2}\frac{4}{{k}_{{\times}}^{\enspace 2}}.\end{equation} \tag{ A.2 }$
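The moments in (A.2) can be checked numerically. The sketch below is an illustration only: it assumes that S₊ and S_× are independent exponential variables with means 2/k₊ and 2/k_× (independence is what makes the variance additive), and the function names and the numerical values used to exercise them are hypothetical.

```python
import random
import statistics


def noise_moments(c_plus, c_cross, k_plus, k_cross):
    """Mean and variance of S = c_plus*S_plus + c_cross*S_cross, as in (A.2)."""
    mu_n = c_plus * 2.0 / k_plus + c_cross * 2.0 / k_cross
    var_n = c_plus**2 * 4.0 / k_plus**2 + c_cross**2 * 4.0 / k_cross**2
    return mu_n, var_n


def simulate_noise_S(c_plus, c_cross, k_plus, k_cross, n_draws=100_000, seed=1):
    """Draw noise-only realisations of S (assumes independent components)."""
    rng = random.Random(seed)
    # An exponential variable with mean 2/k has rate k/2.
    return [
        c_plus * rng.expovariate(k_plus / 2.0)
        + c_cross * rng.expovariate(k_cross / 2.0)
        for _ in range(n_draws)
    ]
```

Comparing `statistics.fmean` and `statistics.variance` of the simulated sample with the output of `noise_moments` should show agreement within the sampling error.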

If a signal is present in the data, S₊ and S_× have non-central χ² distributions (up to the factor k_+/×) with two degrees of freedom. Although there is no simple expression for the distribution of S, the mean value in this case is:

$\begin{equation}{\mu }_{\mathrm{s}\mathrm{i}\mathrm{g}}={c}_{+}\frac{2+{\beta }_{+}}{{k}_{+}}+{c}_{{\times}}\frac{2+{\beta }_{{\times}}}{{k}_{{\times}}}\end{equation} \tag{ A.3 }$

where

$\begin{equation}{\beta }_{+/{\times}}=2\frac{{H}_{0}^{2}\vert {\mathrm{e}}^{j{{\Phi}}_{0}}{H}_{+/{\times}}{\mathbf{A}}^{+/{\times}}{\vert }^{2}}{{\sigma }^{2}\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}}={H}_{0}^{2}\vert {H}_{+/{\times}}{\vert }^{2}{k}_{+/{\times}}.\end{equation} \tag{ A.4 }$

The critical ratio (CR) for the single pulsar statistic is:

$\begin{equation}\mathrm{C}\mathrm{R}=\frac{{\left({c}_{+}\frac{{\beta }_{+}}{{k}_{+}}+{c}_{{\times}}\frac{{\beta }_{{\times}}}{{k}_{{\times}}}\right)}^{2}}{\left({c}_{+}^{\enspace 2}\frac{4}{{k}_{+}^{\enspace 2}}+{c}_{{\times}}^{\enspace 2}\frac{4}{{k}_{{\times}}^{\enspace 2}}\right)}.\end{equation} \tag{ A.5 }$

We need to solve the following system to maximize the CR:

$\begin{equation}\begin{cases}\frac{\partial \left(\mathrm{C}\mathrm{R}\right)}{\partial {c}_{+}}=\frac{2\frac{{\beta }_{+}}{{k}_{+}}\left({c}_{+}^{\enspace 2}\frac{4}{{k}_{+}^{\enspace 2}}+{c}_{{\times}}^{\enspace 2}\frac{4}{{k}_{{\times}}^{\enspace 2}}\right)-2{c}_{+}\frac{4}{{k}_{+}^{\enspace 2}}\left({c}_{+}\frac{{\beta }_{+}}{{k}_{+}}+{c}_{{\times}}\frac{{\beta }_{{\times}}}{{k}_{{\times}}}\right)}{{\left({c}_{+}^{\enspace 2}\frac{4}{{k}_{+}^{\enspace 2}}+{c}_{{\times}}^{\enspace 2}\frac{4}{{k}_{{\times}}^{\enspace 2}}\right)}^{2}}=0\quad \\ \frac{\partial \left(\mathrm{C}\mathrm{R}\right)}{\partial {c}_{{\times}}}=\frac{2\frac{{\beta }_{{\times}}}{{k}_{{\times}}}\left({c}_{+}^{\enspace 2}\frac{4}{{k}_{+}^{\enspace 2}}+{c}_{{\times}}^{\enspace 2}\frac{4}{{k}_{{\times}}^{\enspace 2}}\right)-2{c}_{{\times}}\frac{4}{{k}_{{\times}}^{\enspace 2}}\left({c}_{+}\frac{{\beta }_{+}}{{k}_{+}}+{c}_{{\times}}\frac{{\beta }_{{\times}}}{{k}_{{\times}}}\right)}{{\left({c}_{+}^{\enspace 2}\frac{4}{{k}_{+}^{\enspace 2}}+{c}_{{\times}}^{\enspace 2}\frac{4}{{k}_{{\times}}^{\enspace 2}}\right)}^{2}}=0.\quad \end{cases}\end{equation} \tag{ A.6 }$

The two equations are not independent. There is a straight line in the plane (c₊, c_×) where the function CR has the maximum value. This line is identified by the equation:

$\begin{equation}{c}_{+}={\beta }_{+}{k}_{+}\frac{{c}_{{\times}}}{{\beta }_{{\times}}{k}_{{\times}}}=\vert {H}_{+}{\vert }^{2}\vert {\mathbf{A}}_{+}{\vert }^{4}\frac{{c}_{{\times}}}{\vert {H}_{{\times}}{\vert }^{2}\vert {\mathbf{A}}_{{\times}}{\vert }^{4}}.\end{equation} \tag{ A.7 }$

Since the CR is invariant under a common rescaling of the coefficients, one can arbitrarily choose:

$\begin{equation}{\bar{c}}_{+}=\vert {H}_{+}{\vert }^{2}\vert {\mathbf{A}}_{+}{\vert }^{4}\quad \text{and}\quad {\bar{c}}_{{\times}}=\vert {H}_{{\times}}{\vert }^{2}\vert {\mathbf{A}}_{{\times}}{\vert }^{4}.\end{equation} \tag{ A.8 }$
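A quick numerical check of this result is sketched below. It evaluates the CR of (A.5) and confirms that the direction c₊ ∝ β₊k₊, c_× ∝ β_×k_× of (A.7) is not exceeded by any other direction in the (c₊, c_×) plane. The function names and the test values of β and k are hypothetical, chosen only for illustration.

```python
import math


def critical_ratio(c_plus, c_cross, beta_plus, beta_cross, k_plus, k_cross):
    """Critical ratio of the single pulsar statistic, equation (A.5)."""
    num = (c_plus * beta_plus / k_plus + c_cross * beta_cross / k_cross) ** 2
    den = c_plus**2 * 4.0 / k_plus**2 + c_cross**2 * 4.0 / k_cross**2
    return num / den


def optimal_direction(beta_plus, beta_cross, k_plus, k_cross):
    """One point on the line (A.7); any common rescaling gives the same CR."""
    return beta_plus * k_plus, beta_cross * k_cross


def scan_max_cr(beta_plus, beta_cross, k_plus, k_cross, n_angles=3600):
    """Brute-force maximum of the CR over directions in the (c_plus, c_cross) plane."""
    best = 0.0
    for i in range(n_angles):
        theta = math.pi * i / n_angles
        cr = critical_ratio(math.cos(theta), math.sin(theta),
                            beta_plus, beta_cross, k_plus, k_cross)
        best = max(best, cr)
    return best
```

Substituting the optimal direction into (A.5) gives CR = (β₊² + β_ײ)/4, which the brute-force scan should approach from below.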

In the same way, it is possible to infer the choice of the coefficients a_i that maximizes the CR of the ensemble statistic t, defined as:

$\begin{equation}t=\sum\limits _{i=1}^{{N}_{\mathrm{s}}}\enspace {a}_{i}{S}_{i},\end{equation} \tag{ A.9 }$

where N_s is the number of pulsars and S_i is the single pulsar statistic defined as:

$\begin{equation}{S}_{i}={b}_{+,i}{S}_{+,i}+{b}_{{\times},i}{S}_{{\times},i}\quad \text{with}\quad {b}_{+,i}\equiv {\bar{c}}_{+}\quad {b}_{{\times},i}\equiv {\bar{c}}_{{\times}}.\end{equation} \tag{ A.10 }$

In this case, the CR is:

$\begin{equation}\mathrm{C}\mathrm{R}=\frac{{\left[{\sum }_{j=1}^{{N}_{\mathrm{s}}}\enspace {a}_{j}\left({b}_{+,j}\frac{{\beta }_{+,j}}{{k}_{+,j}}+{b}_{{\times},j}\frac{{\beta }_{{\times},j}}{{k}_{{\times},j}}\right)\right]}^{2}}{{\sum }_{i=1}^{{N}_{\mathrm{s}}}\enspace {a}_{i}^{2}\left({b}_{+,i}^{\enspace 2}\frac{4}{{k}_{+,i}^{\enspace 2}}+{b}_{{\times},i}^{\enspace 2}\frac{4}{{k}_{{\times},i}^{\enspace 2}}\right)}=\frac{{\left[{\sum }_{j=1}^{{N}_{\mathrm{s}}}\enspace {a}_{j}{\lambda }_{j}\right]}^{2}}{{\sum }_{i=1}^{{N}_{\mathrm{s}}}\enspace {a}_{i}^{2}{{\Sigma}}_{i}^{\enspace 2}},\end{equation} \tag{ A.11 }$

with

$\begin{equation}{\lambda }_{j}\equiv {b}_{+,j}\frac{{\beta }_{+,j}}{{k}_{+,j}}+{b}_{{\times},j}\frac{{\beta }_{{\times},j}}{{k}_{{\times},j}}\quad \text{and}\quad {{\Sigma}}_{j}^{\enspace 2}\equiv {b}_{+,j}^{\enspace 2}\frac{4}{{k}_{+,j}^{\enspace 2}}+{b}_{{\times},j}^{\enspace 2}\frac{4}{{k}_{{\times},j}^{\enspace 2}}.\end{equation} \tag{ A.12 }$

Maximizing the CR for the coefficients a_k, we find:

$\begin{equation}\frac{\partial \left(\mathrm{C}\mathrm{R}\right)}{\partial {a}_{k}}=\frac{2{\lambda }_{k}\left({\sum }_{j=1}^{{N}_{\mathrm{s}}}\enspace {a}_{j}{\lambda }_{j}\right)}{{\sum }_{i=1}^{{N}_{\mathrm{s}}}\enspace {a}_{i}^{2}{{\Sigma}}_{i}^{2}}-\frac{2{a}_{k}{{\Sigma}}_{k}^{2}{\left({\sum }_{j=1}^{{N}_{\mathrm{s}}}\enspace {a}_{j}{\lambda }_{j}\right)}^{2}}{{\left({\sum }_{i=1}^{{N}_{\mathrm{s}}}\enspace {a}_{i}^{2}{{\Sigma}}_{i}^{2}\right)}^{2}}=0.\end{equation} \tag{ A.13 }$

As in the case of a single pulsar, there is a hyperplane where the function $\mathrm{C}\mathrm{R}\left({a}_{1},\dots ,{a}_{{N}_{\mathrm{s}}}\right)$ has the maximum value,

$\begin{equation}{a}_{k}=\frac{{\lambda }_{k}}{{{\Sigma}}_{k}^{2}}\frac{\left({\sum }_{i=1,i\ne k}^{{N}_{\mathrm{s}}}\enspace {a}_{i}^{2}{{\Sigma}}_{i}^{2}\right)}{\left({\sum }_{j=1,j\ne k}^{{N}_{\mathrm{s}}}\enspace {a}_{j}{\lambda }_{j}\right)},\end{equation} \tag{ A.14 }$

and a particular, simple choice is:

$\begin{equation}{\bar{a}}_{k}=\frac{{\lambda }_{k}}{{{\Sigma}}_{k}^{\enspace 2}}=\frac{{H}_{0,k}^{2}}{{\sigma }_{k}^{4}\cdot {T}_{\mathrm{o}\mathrm{b}\mathrm{s}}^{2}}.\end{equation} \tag{ A.15 }$
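The choice (A.15) can be verified numerically against the ensemble CR of (A.11). The following sketch is only an illustration under assumed inputs: the function names and the values of λ_k and Σ_k² are hypothetical and do not correspond to any pulsar in the paper.

```python
def ensemble_critical_ratio(a, lam, sigma2):
    """Ensemble critical ratio of (A.11) for weights a, given lambda_k and Sigma_k^2."""
    num = sum(ai * li for ai, li in zip(a, lam)) ** 2
    den = sum(ai**2 * si for ai, si in zip(a, sigma2))
    return num / den


def optimal_weights(lam, sigma2):
    """Weights of (A.15): a_k = lambda_k / Sigma_k^2."""
    return [li / si for li, si in zip(lam, sigma2)]


# Hypothetical values for three pulsars, used only to exercise the functions.
lam = [0.5, 1.2, 0.1]
sigma2 = [1.0, 4.0, 0.25]

cr_optimal = ensemble_critical_ratio(optimal_weights(lam, sigma2), lam, sigma2)
cr_equal = ensemble_critical_ratio([1.0, 1.0, 1.0], lam, sigma2)
```

With these weights the CR equals the sum of λ_k²/Σ_k² over the ensemble, which by the Cauchy-Schwarz inequality is an upper bound for any other choice of a_k. In this example the optimal weights give 0.65, while equal weights give about 0.617.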

Appendix B.: Pulsars parameters

(See tables B1 and B2.)
