1 Introduction

The ability of stellar activity to frustrate the recovery of planetary signals was quickly investigated following the first announcement of a planet orbiting a solar-like star by Mayor and Queloz [38]. Saar and Donahue [49] estimated starspot and convective contributions to the measured stellar radial velocities (RVs) in stars with different rotation and activity levels. The relationship between activity indicators such as Ca II H&K emission and projected RV variability has also been explored in a number of studies (e.g. [36, 56]). For the most active stars, our ability to recover planet induced RVs can be severely restricted [32]. The impact of starspots on the shape of stellar absorption lines was investigated by Santos et al. [50] and Desort et al. [21]. Further studies have also focused on correcting for starspot induced RV activity; Boisse et al. [12] built on early work by Desort et al. [21], showing that with well sampled observations, line bisectors can be used to mitigate stellar RV contributions. More sophisticated Gaussian Process modelling has been shown to be a powerful tool for disentangling activity signals from RVs [34, 45].

For quiet stars, stellar oscillation and granulation effects become important factors that limit precision. These effects can be partially alleviated through careful observation strategies [24]. Using high precision photometric Kepler data, the study by Bastien et al. [7] (B13) identified a correlation between the asteroseismic 8 h variability or ‘flicker’ and the surface gravity of a star. The lowest activity targets, identified via the amplitude of 90-day variability, show a clear sequence with the 8 h granulation variability, dubbed the flicker-floor. The subsequent study by Cegla et al. [17] (C14) obtained the expected relationship between RV uncertainty and the 8 h flicker, F8, for quiet stars. Importantly, these flicker-related RV uncertainties, typically of magnitude 1–2 m s−1, were shown to dominate over the sub-m s−1 spot induced RV uncertainties, as previously indicated by Dumusque et al. [24, 25].

The time required to obtain exoplanet masses for a sample of targets at a given precision with ground-based RV facilities depends on a number of factors. In practice, each system must be treated individually since stellar rotation, activity and interactions with planet periodicities in individual systems are generally not known in advance. While instrument stability is a limiting factor for low-activity stars, the time commitment also depends on host star magnitude, which will limit precision for feasible exposure times on fainter targets. Despite the many variables, it is nevertheless possible to combine knowledge of instrumental characteristics and prior knowledge of typical stellar activity factors to obtain an estimate of the time commitment for a dedicated survey. A detailed simulation that quantified the observation effort required for the RV characterisation of planets found with TESS [48] was made by Cloutier et al. [19] (C18). This study considered the various noise sources discussed above to obtain estimates of the RV uncertainty for simulations using white noise and correlated noise models.

We aim here to assess the time required to recover masses for a large sample of targets to be studied by the European Space Agency’s M4 space mission, Ariel (footnote 1) [26]. Since Ariel aims to characterise the atmospheres of ~1000 targets, many system parameters will need to be determined in advance. For the purposes of a follow-up RV survey of transiting exoplanets, we are aided by knowing the planetary orbital period and transit times a priori. Considerable efforts to improve orbital periods and maintain ephemerides are being made, including the use of small telescopes operated by citizen scientists [57]. This will be essential for ultimate scheduling of observations with Ariel, and will also aid complementary RV observations. With good ephemerides, we do not need to perform a completely blind period search and do not have to worry about the usual issues of identifying the correct planet orbital period where periodogram peak aliases of similar significance are present. RV observing strategies can thus be optimised accordingly.

We use the sample of potential Ariel targets identified and simulated by Edwards et al. [27] (E19). Specifically, we used a revised list of ~2100 targets (footnote 2). E19 contains a full description of the target sample and has been used extensively in the design stage for the Ariel mission (footnote 3). In §2, we begin by simulating astrophysical noise and estimating the instrumental and photon-noise-limited uncertainties. The method of simulating RVs incorporating the planet induced velocities for selected instrument and astrophysical noise combinations is presented in §3, where we describe the adopted optimised observing strategy and signal recovery procedure, resulting in mass uncertainty estimates for the simulated planets. In §4 the results are presented and summarised. The predictive power of the simulations and optimisation efficiencies are explored in §5 before concluding remarks and discussion in §6.

2 Noise sources

The total RV error budget can be modelled as σtot = \( \sqrt{\sigma_{\mathrm{floor}}^2+{\sigma}_{\mathrm{RV}}^2+{\sigma}_{\mathrm{act}}^2+{\sigma}_{\mathrm{planets}}^2} \) (C18), where σfloor is the instrumental limiting precision, σRV is the photon noise limited precision, σact the activity contribution, and σplanets the additional effect of unidentified planets in the system. Many of the close-orbiting giant planet systems are single-planet systems owing to migration mechanisms [40]. Surviving terrestrial planets, when they do exist, tend to contribute only a small fraction of the RV signal (e.g. [55]). For the remainder of our simulations, we do not consider σplanets. Nevertheless, for the 2-planet systems in E19, we simulate and model recovery of both planets. Instrumental precision and astrophysical noise are the main sources of measurement uncertainty in RV observations. For fainter targets, the limiting instrumental precision, σfloor, is not achieved, and photon noise, σRV, begins to dominate the instrumental error budget. We consider σact, σfloor and σRV below.

2.1 Astrophysical noise models

Activity sources are usually not uniformly spatially distributed on the stellar surface, thus leading to time dependent modulation of the RVs. These RV variations due to astrophysical noise are commonly modulated on timescales related to the rotation period of the star and the possible magnetic activity cycles exhibited by the star. Various contributing factors have been studied in detail for the Sun by Meunier et al. [39]. For the Sun, the greatest variability in RVs was found to be caused by changes in convective blueshift on the relatively long time scale of a solar cycle, with effects up to ~10 m s−1. On solar rotation timescales, the effects of spots and plage have a similar magnitude on the Sun. However, they partially compensate for each other, with the net effect being that spots and plage together induce a rotational modulation of amplitude almost equal to that of either component individually [39]. C14 found that RV uncertainties due to stellar oscillation and granulation are of order 1–2 m s−1 for quiet main-sequence stars, in agreement with the investigations of Dumusque et al. [24]. Estimates by C14 for the Sun indicated RV uncertainties of order 1 m s−1, somewhat higher than the starspot-only induced estimates of 0.4–0.5 m s−1 [25, 39]. This finding shows that the stellar oscillation and granulation effects dominate and must be accounted for when searching for the lowest mass planets, where the planet induced velocity semi-amplitude is of comparable magnitude.

C18 use parametric models for the 8 h flicker, F8 (B13), and log R’HK to predict the astrophysical noise, \( {\sigma}_{\mathrm{act}}^2 \). Here, we model cool spot distributions for assessing and modelling astrophysical noise. The solar spot distribution models derived from observations by Bogdan et al. [11] were adapted by Solanki [52] to obtain spot distributions for higher activity levels. We use these models to obtain appropriate spot-contaminated absorption line profiles. Spot distributions are assumed to be fixed during the span of observations. Forward modelling from image to line profiles was performed using the Doppler imaging code, DoTS [20]. Noise models were initially considered for the three lowest activity cases modelled by Solanki [52], equivalent to solar minimum, solar maximum and super-solar maximum. These models have spot coverage of 0.02%, 0.3% and 2% and are referred to as Models 1, 2, and 3 respectively in Fig. 1.

Fig. 1

Spot induced noise, σact, spot, for different levels of activity. Model 1 (solar minimum), Model 2 (solar maximum), Model 3 (extreme solar maximum). For illustration, stars with three different masses are simulated, each with two rotation periods: Prot = 10 d and 25 d. The corresponding equatorial rotation velocities (which also limit precision) are indicated in the key. The HARPS and a conservative ESPRESSO limiting stabilities are indicated, along with \( {\sigma}_{\mathrm{act},{F}_8} \) ~ 1.6 ms−1

The resulting r.m.s. spot induced RV variation as a function of stellar temperature and rotation period Prot and corresponding v sin i is shown in Fig. 1. Three spectral types are considered with different spot contrasts: G2V (1.0 Mʘ, Teff = 5750 K, Tspot = 4000 K), K2V (0.75 Mʘ, Teff = 5000 K, Tspot = 3750 K) and M2V (0.5 Mʘ, Teff = 3750 K, Tspot = 3000 K). Spot temperature contrasts are estimated from Berdyugina [9]. For the G2V and K2V stars, spot distributions are simulated as solar-like, appearing at low latitudes, while spot distributions are spread at all latitudes for the M2V case (see Barnes and Collier Cameron [5], Barnes et al. [6] and reviews by Berdyugina [9] and Strassmeier [54]).

Figure 1 shows that for solar minimum, the spot contribution, σact, spot, is <1 m s−1 for all models. For activity equivalent to solar maximum and greater (Models 2 & 3), the astrophysical noise will be more significant for G2V and K2V cases, contributing a similar degree of noise to, or more noise than, the instrumental/photon noise contributions. However, the lower contrast and more distributed spot patterns for the M2V model reveal that noise from spots is still relatively low for Model 2. For improved stability on new generation instruments such as ESPRESSO, with stability down to 0.1 m s−1, and even conservative 0.5 m s−1 precision estimates [41], the spot noise is more important for stars with activity similar to solar minimum levels.

Since our solar minimum model is likely representative of a quiet star, we must take into account the RV uncertainty due to the 8 h flicker floor (C14). Although the Sun appears to be quieter than typical main sequence stars (B13), with \( {\sigma}_{\mathrm{act},{F}_8} \) ~ 1.3 m s−1 (C14), we estimate a more conservative, typical value of \( {\sigma}_{\mathrm{act},{F}_8} \) ~ 1.6 m s−1. Our complete noise model, σact = \( \sqrt{\sigma_{\mathrm{act},\mathrm{spot}}^2+{\sigma}_{\mathrm{act},{F}_8}^2} \), thus comprises a rotationally modulated starspot term (with the rotation period fixed at 25 d) and a white noise, F8 term. Although F8 does not change significantly for the Sun during its activity cycle, variability on longer timescales does change. This is also seen for more active stars above the flicker floor (B13). Our spot models inform longer timescale variability and include the modulation associated with stellar rotation. We use a fixed value of \( {\sigma}_{\mathrm{act},{F}_8} \) ~ 1.6 m s−1 for all stars that we simulate.

2.2 Instrumental noise

The limiting instrumental precision from bright sources is fixed by the instrument design, while photon noise limits are discussed by Bouchy et al. [14]. Exposure time is an important consideration requiring a compromise to enable RV measurements with as small uncertainty as possible on a reasonable timescale. Our simulations consider observations with HARPS and ESPRESSO, though ultimately, a variety of instruments will be employed for RV follow-up observations.

It has been shown that optimal exposure durations for typical cool G and K main-sequence stars are expected to be ~4–20 min [18]. For the purposes of our study, we assume minimum exposure times of 900 s. For all targets with visual magnitude, V < 9, we expect to achieve the σfloor = 1 m s−1 limit with exposure times of 900 s. We can still reach the σfloor = 1 m s−1 for stars up to V ~ 10 with longer exposure times. The HARPS and ESPRESSO exposure time calculators (footnote 4) enable an estimate of the precision for a specified exposure time. Table 1 details the exposure times assumed and adopted precisions. By V = 13, exposures of 3600 s (1 h) are expected to yield effective photon noise limited precision of \( {\sigma}_{\mathrm{obs},\mathrm{inst}}=\sqrt{\sigma_{\mathrm{floor}}^2+{\sigma}_{\mathrm{RV}}^2} \) ~ 3.4 m s−1. This rises to 4.1 m s−1 for V = 14, but a total combined exposure time of 7200 s is required; a significant commitment for a single observation.

Table 1 Instrumental and photon noise precision limits used in the simulations. The apparent visual magnitude, V, adopted exposure time and resulting precision from the HARPS and ESPRESSO exposure time calculators are listed

For 1 m s−1 precision on a 4 m class telescope, the RV noise and hence mass uncertainties will be dominated by σobs, inst or \( {\sigma}_{\mathrm{act},{F}_8} \) in the solar minimum (Model 1) case; the limits in Table 1 are greater than the solar minimum noise shown in Fig. 1. With ESPRESSO, the astrophysical spot induced noise for the solar minimum models becomes important for the brighter targets, though \( {\sigma}_{\mathrm{act},{F}_8} \) will hamper gains from the improved instrument stability. Once activity increases to solar maximum levels (Model 2), the spot-induced RV jitter is much more significant. Still higher levels of activity (Model 3), though not used in the following simulations, would mean that spot noise becomes the dominant source of noise for most targets. Only for an M dwarf with a solar-like rotation period is the effect moderate.

The chromospheric S-index values of over 4400 stars compiled by Boro Saikia et al. [13] indicate that most stars have solar minimum activity levels, with a significant proportion of stars in a solar maximum state. The recalibration of this sample by Doherty [23] reduces the proportion with activity levels similar to, or exceeding, the solar maximum. The Boro Saikia sample is derived largely from RV survey observations, and thus some higher activity stellar hosts may be excluded, but in the simulations that follow we consider only the solar minimum and maximum cases (Models 1 and 2).

3 The simulations

We have carried out simulations for the representative sample of planet hosting stars published by E19. This large, diverse sample contains known and simulated planets with atmospheric signals that could be characterised by Ariel. We simulated E19 targets with V < 14.

3.1 Observing strategy

We assume that observations are made over a single span of time. RV campaigns that search for and measure the masses of low mass planets have had considerable success with intensive monitoring campaigns to augment historical multi-year campaigns. This strategy led to the discovery of Proxima Cen b [1]. Informed by this, a nightly observation per target over a 13 month timespan has been adopted for the Red Dots programme and has subsequently led to the discovery of further planets around some of the nearest M dwarf hosts [22, 37, 47]. Intensive monitoring of higher mass targets has also led to the efficient discovery of hot and warm planets in single and compact multiplanet systems [3, 31, 53]. This approach contrasts with the historically typical strategy of a few observations per observing run with runs spread over multiple years. These intensive campaigns have the benefit of reducing aliasing in period searches while minimising the effects of activity evolution from the host star.

3.2 Optimisation with phased observations

Other studies, such as the simulations of C18, have aimed at assessing the observation commitment required for RV retrieval of planetary parameters for characterisation of TESS planets. The purpose of our study is to investigate efficient recovery of planetary masses for the Ariel mission for systems that are already well characterised. This is possible because the ephemerides and periods of the planets that will be adopted for study by Ariel will be known in advance, though the procedure is applicable elsewhere. In this situation, we have effectively two unknown or poorly constrained parameters if we assume purely circular orbits: the stellar mass and the semi-amplitude of the RV variations. Radial velocity observations for each planet can then be simulated at quadrature points, when the maximum RV excursion of each planet is expected. Our simulations trialled a minimum of 2 quadrature points (phases 0.25 and 0.75). Observations at phases 0.00 and 0.50 were then added, giving 4 observations. This helps to recover the correct injected period more quickly compared with only adding further quadrature observations. The study by Burt et al. [16] found uniform and random sampling to be more effective. We investigate and discuss this issue in more detail for our sample in §5.2. Further points were then similarly added in pairs such that a data set contained Nobs-trial = 2, 4, 6, 8, 10, 12, 16, 20...100 (in steps of 10) and 120–240 (in steps of 20), 300 & 360. For two-planet systems, a minimum of Nobs-trial = 4 is required.
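The trial sequence and phased sampling described above can be sketched as follows. This is an illustrative sketch only, not the code used for the simulations: the function names are hypothetical, phase 0 is assumed to coincide with mid-transit, and the rule of placing each successive pair of observations in a new orbital cycle is an assumption, since the text does not specify how pairs are distributed over cycles.

```python
def nobs_trial_grid():
    """Trial numbers of observations, Nobs-trial, as listed in the text."""
    grid = [2, 4, 6, 8, 10, 12, 16]
    grid += list(range(20, 101, 10))   # 20...100 in steps of 10
    grid += list(range(120, 241, 20))  # 120-240 in steps of 20
    grid += [300, 360]
    return grid


def phased_epochs(t_transit, period, n_obs):
    """Observation times for n_obs phased observations (n_obs even).

    Pairs alternate between quadrature (phases 0.25, 0.75) and
    conjunction (phases 0.00, 0.50). ASSUMPTION: each successive pair
    is placed in the next orbital cycle.
    """
    pairs = [(0.25, 0.75), (0.00, 0.50)]
    times = []
    for k in range(n_obs // 2):
        for phase in pairs[k % 2]:
            times.append(t_transit + (k + phase) * period)
    return times


grid = nobs_trial_grid()
epochs = phased_epochs(t_transit=0.0, period=4.0, n_obs=4)
```

For a 4 d period, the four epochs fall at 1, 3, 4 and 6 d after the reference transit, i.e. at phases 0.25, 0.75, 0.00 and 0.50.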

3.3 Simulated radial velocities and mass recovery

For each planet, astrophysical noise was added to the planetary RVs from the most appropriate solar minimum and solar maximum models described in §2.1, i.e., using G2V, K2V or M2V. Throughout we adopted a stellar rotation period of Prot = 25 days, yielding projected equatorial rotation velocities of 2.0 km s−1, 1.5 km s−1 and 1.0 km s−1 respectively. Additional asteroseismic noise for quiet stars is also added, as described in §2.1. Instrumental noise appropriate for each target, observed with a HARPS-like instrument, was also included (§2.2 and Table 1). For planets with Mp < 10 M⊕, ESPRESSO observations were also considered.

No attempt was made to model and subsequently correct for the rotation induced correlated RV noise due to the starspot features. Thus, we are assessing the full impact of both activity and instrumental noise sources. The mass uncertainty for each sequence with Nobs-trial observations was determined from posterior sampling via MCMC using the Radial Velocity Fitting Toolkit, Radvel [29]. We obtained posterior mass estimates by imposing 10% planetary orbital period and transit time priors. In reality, these parameters will be somewhat better determined for Ariel targets, potentially further improving retrieval efficiencies. All planets were simulated with circular orbits and we used an eccentricity prior for recovery, requiring e < 0.05.

Using an uncertainty in stellar mass of 3%, the uncertainty in mass for each planet and each trial data set with Nobs-trial observations was determined. We ran data sets with increasing Nobs-trial (as defined in §3.2) for each system until ΔMp ~ 20% was achieved. It is possible for MCMC chains to locate the wrong area of parameter space and give a misleading mass estimate, but still with the required precision. We thus required that the recovered planet mass is obtained to the same tolerance as the posterior mass uncertainty estimate; e.g. for a required uncertainty of ΔMp ~ 50%, we also require that the recovered mass matches the simulated mass to ~50%.
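The two-part acceptance test described above can be expressed compactly. The sketch below is illustrative: the function names and the example posterior values are hypothetical, and the fractional uncertainty is assumed to be measured relative to the recovered mass.

```python
def mass_recovered(m_true, m_rec, m_err, tol):
    """Return True if both acceptance conditions are met.

    m_true : simulated (injected) planet mass
    m_rec  : recovered posterior mass
    m_err  : posterior mass uncertainty (same units as m_rec)
    tol    : required fractional precision, e.g. 0.5 or 0.2
    """
    precise_enough = (m_err / m_rec) <= tol               # dMp condition
    close_to_truth = abs(m_rec - m_true) / m_true <= tol  # guards against a wrong mode
    return precise_enough and close_to_truth


def first_successful_nobs(trials, m_true, tol):
    """trials: iterable of (n_obs, m_rec, m_err), ordered by increasing n_obs."""
    for n_obs, m_rec, m_err in trials:
        if mass_recovered(m_true, m_rec, m_err, tol):
            return n_obs
    return None  # not recovered within the trial sequence


# Hypothetical posterior summaries for one injected 8.0 (arbitrary units) planet
trials = [(2, 20.0, 4.0), (4, 9.0, 4.0), (6, 8.5, 1.5)]
n50 = first_successful_nobs(trials, m_true=8.0, tol=0.5)
n20 = first_successful_nobs(trials, m_true=8.0, tol=0.2)
```

In this made-up example the first trial is precise (20% uncertainty) but converges on the wrong mass, so it is rejected by the second condition.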

4 Results

We present estimates of the number of observations and time required to complete the observations for fixed planetary mass precisions. The study by Batalha et al. [8] found that only cloud-free low-metallicity gas giants enable atmospheric characterisation without prior knowledge of the planet’s mass. Atmospheric properties could only be inferred in all other cases, including terrestrial-like planets, when the mass is known a priori to better than 50%. This is a minimum requirement. It was found that to avoid uncertainties in atmospheric properties being dominated by uncertainties in mass, a greater planet mass precision of 20% is needed. The atmospheric properties were then found to be dominated by the spectroscopic data quality. We thus present detailed results for the two cases in our simulations where the criteria of ΔMp ~ 50% and ΔMp ~ 20% are met. We also estimate summary time commitments for other values of ΔMp.

The number of observations required for each simulated target is shown in the plots of Porb vs Mp in Fig. 2. There are a total of 1959 targets in E19 with V < 14. The majority of targets only require a minimum of 2 or 4 observations to obtain masses with ΔMp ~ 50% or 20%. The 23 two-planet systems in the E19 sample are simulated and recovered as two-planet systems (open circles in Fig. 2). Only 9–15 very low mass planets are not recovered and would require an unreasonably large number of observations, Nobs > 360. As expected, Fig. 2 shows that more observations are required to obtain a given percentage mass uncertainty for the less massive planets. Table 2 gives a detailed breakdown of the sample split into magnitude bins corresponding to Table 1 and scaled to the expected Ariel sample of 1000 planets/planet systems.

Fig. 2

The number of observations, Nobs, required to recover each simulated planet from E19. The four upper panels show results for solar minimum (left; Model 1) and solar maximum (right; Model 2) activity models for all targets with V < 14 for recovery with ΔMp ~ 50% (top) and with ΔMp ~ 20% (bottom). Planets in two-planet systems are indicated by open circles; in these cases, Nobs indicates the observations needed for recovery of both planets. The six lower panels highlight the results shown in the four upper panels for single-planet systems with Mp < 10 M⊕ for ΔMp ~ 50% and ΔMp ~ 20%. The left panels show recovery with assumed ESPRESSO-like precision for the solar minimum case. The total number of recovered planets, Nrec, and unrecovered planets, Nnot (i.e., where >360 observations are required), are indicated in each panel. The planets from E19 with Mp < 0.5 M⊕ are not shown

Table 2 Total number of observations, and RV time required in nights for a simulated Ariel sample of 1000 targets. The results are tabulated for mass precisions of ΔMp ~ 50% and ΔMp ~ 20% for solar minimum and solar maximum stellar activity models (Models 1 & 2). The quantities in each magnitude bin (see Table 1 for corresponding instrument precisions) are scaled from the full simulated E19 sample. The final row (bold type) gives the statistics for the scaled 1000 target sample with visual magnitude, V < 14. Tabulated quantities are the number of targets, Ntot, the number of recovered planets (i.e. where Nobs ≤ 360), Nrec, the total number of observations, Nobs,tot, and the total time [n]. One observing night [n] is equivalent to 9 h. The estimated nights assume 300 s overhead for every single observation (i.e. telescope slew, target acquisition and readout times) and allow for 20% time lost due to bad weather

The lower panels of Fig. 2 compare three scenarios for single planets with Mp < 10 M⊕: the left panels show Nobs for the solar minimum case, where observations are made with ESPRESSO, while the middle and right panels assume respectively solar minimum and maximum activity, but with observations made by HARPS (these are simply expanded regions of the upper panels of Fig. 2, but for single planets only).

The time required to recover masses for these targets is summarised in Table 3. Both Tables 2 & 3 include observing overheads and time lost due to bad weather (see Table 2 caption for details).
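The conversion from a list of exposures to observing nights follows directly from the assumptions stated in the Table 2 caption (300 s overhead per observation, 9 h nights, 20% weather loss). A minimal sketch is given below; the function name is hypothetical, and the weather loss is assumed to mean that 20% of allocated nights are unusable, so the clear-night total is divided by 0.8. If the loss was instead applied as a 20% surcharge, the factor would be 1.2.

```python
def nights_required(exposures_s, overhead_s=300.0, night_h=9.0, weather_loss=0.20):
    """Allocated nights needed for a list of exposure times in seconds.

    ASSUMPTION: weather_loss is the fraction of allocated time that is
    lost, so allocated nights = clear nights / (1 - weather_loss).
    """
    total_s = sum(t_exp + overhead_s for t_exp in exposures_s)
    clear_nights = total_s / (night_h * 3600.0)
    return clear_nights / (1.0 - weather_loss)


# 27 observations of 900 s each fill exactly one clear 9 h night
example_nights = nights_required([900.0] * 27)
```

With these assumptions, 27 observations of 900 s occupy one clear night, or 1.25 allocated nights once weather loss is included.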

Table 3 As for the final row of Table 2, but for single-planet systems with Mp < 10 M⊕ within an Ariel sample of 1000 scaled to the same proportion as they occur in the E19 sample. The first row contains results for the simulation using an instrument with the precision of ESPRESSO (see Table 1). The sample comprises 20 targets with 5 ≤ V < 10, 39 targets with 10 ≤ V < 12 and 13 targets with 13 ≤ V < 14

The time commitments are considerable for a complete sample of 1000 targets. At solar minimum, we find that 1.61 yr (ΔMp ~ 50%) and 4.16 yr (ΔMp ~ 20%) of 9 h nights are required (i.e. one day is a 9 h observing night, so here, 1 year signifies 365 × 9 h nights; see Table 2 caption for details). For solar maximum, this rises to 2.96 yr and 8.95 yr. In other words, compared with solar minimum activity, stars with solar maximum activity levels would increase the time requirement by tsolmax/tsolmin ~ 1.8 (ΔMp ~ 50%) – 2.2 (ΔMp ~ 20%). If observations are restricted to stars with V < 13, the total time commitment for the solar minimum activity simulation is reduced by 20%–25%. For solar maximum the saving is 14%–22%. To some extent, the saving in the latter case is lower because some faint targets, already requiring large Nobs for the solar minimum simulation, are not recovered with Nobs ≤ 360 at solar maximum.

For the brightest targets (i.e. the 145 stars with V < 10 in Table 2), the dominant noise sources for the solar minimum case observed with HARPS are σfloor = 1 m s−1 and \( {\sigma}_{\mathrm{act},{F}_8} \) = 1.6 m s−1. The noise sources, including σact, spot, combine in quadrature to give noise of σtot ~ 1.9 m s−1. For solar max, we expect σact, spot = 1.5–2.5 m s−1 (K-G stars) to yield σtot ~ 2.5–3.1 m s−1, a factor of 1.3–1.6 times greater than the solar minimum case. We might expect a roughly linear scaling of the increase in time required in the case for white noise, since ΔMp ~ ΔK. For the brightest stars, simulations show time factors of tsolmax/tsolmin = 3.7 (ΔMp ~ 50%) and 2.9 (ΔMp ~ 20%). It seems that the simulated 25 d periodicity of σact, spot further frustrates recovery of the planet for these stars. For the fainter targets, where σact, spot contributes equally or less than other noise sources to σtot, the respective ratios are tsolmax/tsolmin = 1.7 and 2.1.
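The quadrature sums quoted above for the brightest targets can be checked with a few lines. This sketch neglects the small (< 1 m s−1) solar minimum spot term, which is an assumption made here for simplicity; the helper name is hypothetical.

```python
import math


def quad_sum(*terms):
    """Combine independent noise terms (m/s) in quadrature."""
    return math.sqrt(sum(t * t for t in terms))


sigma_floor = 1.0  # instrumental floor for bright targets (m/s)
sigma_f8 = 1.6     # adopted flicker term (m/s)

# Solar minimum: spot term neglected (ASSUMPTION, it is < 1 m/s in Fig. 1)
sig_min = quad_sum(sigma_floor, sigma_f8)

# Solar maximum: spot term of 1.5-2.5 m/s for K-G stars
sig_max_lo = quad_sum(sigma_floor, sigma_f8, 1.5)
sig_max_hi = quad_sum(sigma_floor, sigma_f8, 2.5)

ratio_lo = sig_max_lo / sig_min
ratio_hi = sig_max_hi / sig_min
```

These give about 1.9 m s−1 for solar minimum and about 2.4–3.1 m s−1 for solar maximum, close to the values quoted in the text, with noise ratios of roughly 1.3–1.7.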

Assuming the same proportion of low mass planets for an Ariel sample as found in the E19 sample indicates recovery of ~69–70 single planet systems with Mp < 10 M⊕ (Table 3) from a total of 72 systems. A comparison with the estimate in Table 2 reveals that a large proportion of time is spent on the ~7% of planets with Mp < 10 M⊕. For solar minimum, 17% of the total time is spent recovering masses to ΔMp ~ 50%, while the fraction rises to 37% for ΔMp ~ 20%. For solar maximum the respective fractions are 26% for ΔMp ~ 50% and 31% for ΔMp ~ 20%.

Although an instrument such as ESPRESSO achieves a somewhat greater precision (Table 1), it is likely that without the kind of strategy investigated by Dumusque et al. [24], stellar noise will dominate, even in quiet stars where \( {\sigma}_{\mathrm{act},{F}_8} \) is typically expected to be ~1–2 m s−1. Despite this limitation, an instrument achieving precision levels similar to ESPRESSO will still be ~1.9 times more efficient at recovering masses to ΔMp ~ 50% in solar minimum activity stars. This rises to ~2.5 times for ΔMp ~ 20%. The strategy of observing a target 2–3 times per night over several hours proposed by Dumusque et al. [24] could improve sensitivity further with ESPRESSO, but would significantly increase the time commitment. Further simulations would be required to determine whether this strategy might be effective for recovery of the very lowest mass targets that otherwise escape detection with HARPS-like sensitivity (e.g. with a feasible number of observations). Despite the clear benefits (especially on fainter targets) of instruments achieving sub-m s−1 precision, the inevitable time pressure will likely limit the targets surveyed to those with the very lowest σact.

Using the full E19 sample of 1959 targets, we made power-law fits of the total time required, tTOT, as a function of ΔMp for each simulation. Figure 3 shows these fits scaled to a 1000 target sample for the solar minimum and solar maximum cases presented in Table 2 and the 72 targets with Mp < 10 M⊕ in Table 3. Figure 3 gives an indication of the total time, tTOT, required with overheads and 20% weather loss for each sample. Since MCMC runs for large numbers of targets are computationally expensive, we truncated the simulation for each target once ΔMp ~ 20% was reached. Nevertheless, for ESPRESSO, we simulated all targets to ΔMp ~ 10%, but all other estimates in the 10% ≤ ΔMp < 20% range are extrapolations.
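A power-law fit of this kind, tTOT = A (ΔMp)^b, reduces to a straight-line fit in log-log space. The sketch below shows one simple way to do this with ordinary least squares; it is not necessarily the fitting procedure used for Fig. 3, and the input values are synthetic numbers generated from a known power law purely to exercise the function, not results from the simulations.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of y = amp * x**index in log-log space."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    index = sxy / sxx
    amp = 10.0 ** (mean_y - index * mean_x)
    return amp, index


# SYNTHETIC test data: an exact power law with amp = 40, index = -1
dm_percent = [10.0, 20.0, 30.0, 50.0]
t_total = [40.0 * d ** -1.0 for d in dm_percent]

amp, index = fit_power_law(dm_percent, t_total)

# Extrapolation to a precision not simulated directly
t_at_15 = amp * 15.0 ** index
```

Once the amplitude and index are known, times for intermediate or more demanding precisions follow by evaluating the fitted law, which is how extrapolated estimates of the kind described above can be obtained.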

Fig. 3

Total time, tTOT, including overheads and 20% weather loss, for various precisions in addition to those presented in Tables 2 & 3, where 1 year is equivalent to 365 × 9 h nights. The points are extracted directly from the simulations while the solid lines show power law fits

5 Discussion

The predictive power of the simulations for individual targets is not easily assessed owing to the optimised observing strategy that we adopted. C18 made a comparison between their calculated number of observations and those published for a number of systems (see Table 2 in C18), showing a very close match between model and published value. A range of systems with different planet masses were characterised using between 4 and 218 observations. Many of these systems required of order 10–40 observations. Very precise masses from a few per cent to a few 10s of per cent were typically recovered. Our requirement of only ΔMp ~ 50% and ΔMp ~ 20% for Ariel is generally more relaxed. Simple scaling approximations show that many of the hot Jupiter masses considered by C18 can easily be recovered to this precision with only a minimum of 2–4 observations. This agrees well with our findings.

5.1 Two examples with low amplitude planet signals

We compare the number of epochs, Nobs, required for two targets where the stellar velocity semi-amplitude, K, is of similar magnitude to the noise.

HD 97658 b orbits an inactive (log R’HK ~ −5) 0.78 Mʘ star with P ~ 9.49 d [35]. It induces a velocity amplitude of K ~ 2.75 m s−1; the estimated minimum mass of HD 97658 b is 8.2 M⊕. The 2.78 m s−1 r.m.s. excess for this target reported by Howard et al. was greater than their assumed 1.5 m s−1 jitter. HD 97658 b was recovered by Howard et al. [35] with 96 Keck/HIRES observations, yielding ΔMp = 15%. We ran a simulation, using the original observation times and σtot ~ 2.78 m s−1, which yielded the same ΔMp ~ 15% as found by Howard et al. Running our solar minimum model with quadrature sampling and a ΔMp ~ 15% requirement indicates a modest improvement, requiring Nobs between 80 and 90.

GJ 1132 b, a 1.62 M⊕ planet orbiting a 0.18 Mʘ star in a 1.63 d orbit, has already been recovered with HARPS by Berta-Thompson et al. [10] with 34% uncertainty. A total of 25 observations were used. Our optimised approach also indicates that 25 observations correctly recover the mass to ΔMp ~ 34% for this system for solar minimum activity. Here, as with HD 97658, K ~ σtot. Moreover, in this instance, we simulated daily observations, finding that 25 observations are required to recover the mass to the same precision. The relatively short period of 1.63 d means that daily sampling and quadrature sampling are likely to build similar numbers of observations before the required mass uncertainty is obtained.

For these targets, with K / σtot ~ 1, it appears that quadrature sampling yields similar results to the real case RV sampling. A potential modest improvement is found for HD 97658 b. We investigate this further in the next section.
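For reference, the semi-amplitudes of these two systems can be estimated from the standard expression for a circular orbit, K = (2πG/P)^{1/3} Mp sin i / (M⋆ + Mp)^{2/3}. The sketch below uses the periods and masses quoted above and assumes sin i = 1; the function name and the rounded physical constants are choices made here for illustration.

```python
import math

G = 6.674e-11       # gravitational constant (m^3 kg^-1 s^-2)
M_SUN = 1.989e30    # solar mass (kg)
M_EARTH = 5.972e24  # Earth mass (kg)
DAY = 86400.0       # seconds per day


def semi_amplitude(period_d, m_star_msun, m_planet_mearth, sin_i=1.0):
    """Stellar RV semi-amplitude K (m/s) for a circular orbit."""
    period_s = period_d * DAY
    m_star = m_star_msun * M_SUN
    m_planet = m_planet_mearth * M_EARTH
    return ((2.0 * math.pi * G / period_s) ** (1.0 / 3.0)
            * m_planet * sin_i / (m_star + m_planet) ** (2.0 / 3.0))


k_hd97658 = semi_amplitude(9.49, 0.78, 8.2)
k_gj1132 = semi_amplitude(1.63, 0.18, 1.62)

# Signal-to-noise per observation for HD 97658 b with sigma_tot ~ 2.78 m/s
snr_hd97658 = k_hd97658 / 2.78
```

Both semi-amplitudes come out slightly below 3 m s−1, so K / σtot is close to unity for HD 97658 b with σtot ~ 2.78 m s−1, consistent with the discussion above.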

5.2 Optimisation efficiency

We find that the optimised approach is most effective where the stellar velocity semi-amplitude, K, is large compared with the noise level in the data. To quantify this, we used a subset of our input catalogue that sampled the complete range of K / σtot to run additional simulations with daily sampling. Firstly, we considered all 142 low mass single planet systems with Mp < 10 M⊕ in the full E19 sample for which 0.25 < K / σtot < 3.2. Of these targets, for the solar minimum model, the 30% with K / σtot < 1 show that quadrature sampling requires 6% and 7% fewer observations than daily sampling to achieve ΔMp ~ 50% and 20%. For those targets with K / σtot > 1 (70%), the benefit is 33% if ΔMp ~ 50% is required, while only a 10% improvement is seen for ΔMp ~ 20%. Considering all low mass targets together, the improvements are 10% and 8% respectively for ΔMp ~ 50% and 20%.

A subset of the brightest targets for which Mp > 10 M⊕, and where the instrument precision, σfloor, is reached, was also considered. These comprise 22.5% of the E19 sample. For these targets, with 1.1 < K / σtot < 194, we find that observations at quadrature require only 59% (ΔMp ~ 50%) and 72% (ΔMp ~ 20%) of the number of observations required with daily sampling.
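A rough intuition for why observing at quadrature is more efficient can be obtained from an idealised linear model. The sketch below is not the simulation used for the results above: it assumes a circular orbit with perfectly known period and phase, white noise of fixed σtot, and K as the only free parameter, so that the uncertainty on K is σtot divided by the root of the summed sin² of the orbital phases. The function names are illustrative. In this toy model the saving is a fixed factor of two regardless of K / σtot, so it does not reproduce the K / σtot dependence or the percentages found in the full simulations.

```python
import numpy as np


def sigma_k(phases, sigma_tot):
    """1-sigma uncertainty on K when K is the only free parameter.

    Model: v_i = K * sin(2*pi*phase_i) + noise, with known phases and
    independent Gaussian noise of standard deviation sigma_tot.
    """
    basis = np.sin(2.0 * np.pi * np.asarray(phases, dtype=float))
    return sigma_tot / np.sqrt(np.sum(basis ** 2))


def quadrature_phases(n_obs):
    """Alternate observations between the two quadratures (phases 0.25, 0.75)."""
    return np.where(np.arange(n_obs) % 2 == 0, 0.25, 0.75)


def uniform_phases(n_obs):
    """Evenly spread phases: an idealised stand-in for sampling that
    ignores the orbital phase."""
    return (np.arange(n_obs) + 0.5) / n_obs


def fractional_mass_uncertainty(phases, k_over_sigma):
    """For fixed stellar mass, period and inclination, dMp/Mp equals dK/K."""
    return sigma_k(phases, 1.0) / k_over_sigma


n_obs = 20
k_over_sigma = 1.0
frac_quad = fractional_mass_uncertainty(quadrature_phases(n_obs), k_over_sigma)
frac_unif = fractional_mass_uncertainty(uniform_phases(n_obs), k_over_sigma)

# The uncertainty scales as 1/sqrt(N), so the ratio of observations needed
# to reach equal precision is the squared ratio of the uncertainties.
ratio_n = (frac_quad / frac_unif) ** 2
print(f"quadrature: {frac_quad:.3f}, uniform: {frac_unif:.3f}, N ratio: {ratio_n:.2f}")
```

The simulated ratios of 59% and 72% quoted above are less favourable than this idealised factor of one half, which is expected given that the toy model ignores every other fitted parameter and any prior.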

It is perhaps not surprising that the savings for the lowest mass planets, or more specifically, those with low K / σtot, are modest. To achieve the desired precision, these targets require ~30 (ΔMp ~ 50%) and ~100 (ΔMp ~ 20%) observations on average, irrespective of the observing strategies tested. Thus, under the condition of well determined ephemerides, recovery of masses via quadrature and daily sampling are roughly equivalent in terms of time commitment. For the brightest planet hosting stars, with the most massive planets, ≤ 10 observations are required on average. Since the vast majority (80%) of these “easier” targets possess K / σtot > 2.3 and require ≤ 6 observations, quadrature sampling will be much easier to schedule and is thus worth considering.

It is likely that, for the lowest mass targets, the advantages of intense monitoring will outweigh the modest benefit of quadrature sampling, since mitigation of astrophysical noise is easier with well sampled data. Conversely, higher mass planets, where quadrature sampling offers considerable efficiency savings, are less sensitive to the systematic effects of astrophysical noise.

6 Concluding remarks

The Ariel Definition Study Report has assessed the current status of mass measurement for targets that should be included in the final sample. For 500+ planets, mass measurements for 80% (i.e. 400+) of the sample have been made to better than 50%, while 65% (i.e. 325+) of the sample have been measured to better than 30%. Thus, ~ 600 targets require observations to reach ΔMp ~ 50%. If we assume that ~300 of these targets have mass estimates to ΔMp ~ 20%, then a further 700 Ariel targets will need RV observations to reach this precision. By simple scaling, obtaining new observations for the 600 targets required to reach ΔMp ~ 50% will require 0.97 yrs. (solar minimum) and 1.8 yrs. (solar maximum) of telescope time (as in previous estimates, 1 year signifies 365 × 9 h nights). For the estimated 700 targets, the respective times to reach ΔMp ~ 20% are 2.9 and 6.3 yrs. If the sample of targets studied by Boro Saikia et al. [13] and Doherty [23] is representative of the Ariel sample, we can expect more stars to exhibit solar minimum activity and thus that the true number of nights will be closer to the lower, solar minimum, estimates.
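The telescope time estimates can be converted into nights and hours using the stated convention of 1 year = 365 nights of 9 h. The short sketch below simply performs that arithmetic; the mean time per target it prints is a derived average obtained by dividing the total by the number of targets, not a figure quoted in the text, and the function name is illustrative.

```python
HOURS_PER_NIGHT = 9     # convention stated in the text
NIGHTS_PER_YEAR = 365   # convention stated in the text


def telescope_time(years, n_targets):
    """Convert telescope years into nights, hours and mean hours per target."""
    nights = years * NIGHTS_PER_YEAR
    hours = nights * HOURS_PER_NIGHT
    return nights, hours, hours / n_targets


# (precision goal, activity level) -> (telescope years, number of targets),
# using the values quoted in the text
estimates = {
    ("50%", "solar minimum"): (0.97, 600),
    ("50%", "solar maximum"): (1.8, 600),
    ("20%", "solar minimum"): (2.9, 700),
    ("20%", "solar maximum"): (6.3, 700),
}

for (goal, activity), (years, n_targets) in estimates.items():
    nights, hours, per_target = telescope_time(years, n_targets)
    print(f"dMp ~ {goal}, {activity}: {nights:.0f} nights, "
          f"{hours:.0f} h, {per_target:.1f} h per target on average")
```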

Although we have estimated that a large sample of planets could be observed with a reasonable time commitment, further efficiency savings could be made. Those planets with very well determined ephemerides will enable stronger constraints to be placed on the period and transit priors, resulting in equivalent posterior mass uncertainties with fewer observations. As already noted, citizen scientist projects aim to maintain good ephemerides for known transiting planets [57]. Although quadrature sampling may not always be easily scheduled in a traditional observing run, large samples of targets, the increasing number of dedicated survey instruments and queue scheduling mean this strategy is now more feasible.

Techniques have been developed to mitigate the effects of activity, thereby improving the efficiency with which planetary RV signals can be recovered. These include the use of simple correlations with absorption line shapes such as the line bisector span [33, 43] or similar methods [28]. Imaging techniques, where good sampling of the stellar orbit is achieved (e.g. [42]; Barnes et al. 2017), have also been shown to aid retrieval of planetary signals in the presence of noise. In addition, photometric and activity observations can be used to inform mitigation techniques: Gaussian processes, which model time varying correlated noise [30, 34, 45], have been demonstrated to be effective [22, 37]. With observations made over short temporal baselines (§3.1), identifying stellar activity signals is easier than for observations spread over many observing seasons or years.

Finally, there are a number of targets in the E19 catalogue, particularly those with V > 14, that we did not simulate simply because they are too faint for a typical 4 m telescope with an instrument like HARPS. These systems are typically the lower mass red dwarfs. While ESPRESSO may be able to recover masses for some of these systems, we will probably require RV instruments extending into the red-optical and near infrared. Although the feasibility of such instruments was demonstrated by Barnes et al. [2], CARMENES [44] has until recently been the only instrument routinely operating in this regime. MAROON-X, operating at 500–920 nm, will bring the benefits of a larger light collecting capability [51]. Although Reiners et al. [46] showed that the highest RV information content and precision is obtained at <1000 nm, CRIRES+ will also likely play a vital role in recovering masses for some of the faintest systems [15]. A list of facilities is tabulated in the Ariel Definition Study Report, ESA/SCI(2020).