1 Introduction

On November 2017, the military Argentinian submarine ARA San Juan was on patrol off. Argentina’s South Atlantic coast and was transiting from Ushuaia to Mar Del Plata naval bases. It was unarmed and was performing simple maneuvering exercises with forty four sailors on board. The last transmission received from the submarine was at 10:19 UTC, on November 15, 2017. A few hours later, “the hydroacoustic component of the International Monitoring Network (IMS) detected an unusual signal near the last known position of missing San Juan submarine” (CTBTO Executive Secretary Tweet of November 23, 2017). After this date, the scheduled communications no longer took place. All forty four crew members of the ARA San Juan lost their life.

The hydroacoustic component of the IMS network consists of a series of five island-based seismic stations and six cabled hydrophone installations located in the Indian, Pacific and Atlantic Oceans. Each hydrophone station hosts a set of three hydrophones deployed at a depth of the SOFAR channel, as a small-aperture (~2 km) horizontal triangular array. The direction of arrival and the apparent velocities of broadband acoustic arrivals can be determined by correlation or beam forming techniques, therefore enhancing the detection and location capabilities of such a sparse network. CTBTO analysts and scientists located the suspicious event from the large amplitude direct arrivals detected by the two hydrophone stations HA10 (Ascension Island, mid-Atlantic Ocean) over 6000 km away, and HA04 (Crozet Island, Indian Ocean) over 8000 km away. The obtained location was provided to the international recovery mission that was quickly organized after contact with the submarine had been lost.

To support the hydroacoustic analysis and the search effort, a calibration grenade was deployed by the Argentinian Navy two weeks after the accident, on December 1, 2017 at 20:04 UTC. The well-known explosion position (45.666°S, 59.240°W) took place near the last known position of the submarine, slightly north of the main search area, at a planned depth of 38 m. Its yield was 108 kg of TNT equivalent (250 lb Torpex). The event was detected by the same hydroacoustic IMS stations, HA10 and HA04, thus confirming the capability of the IMS of locating signals originating from this area of the South Atlantic continental shelf. The provided origin time of the calibration event is notional, and corresponds to the time at which the grenade was dropped from the helicopter. Unfortunately, it cannot be used to constrain the velocity models (average group velocity over the propagation path) at the two stations.

The ARA San Juan was finally localized on November 16, 2018, one year and one day after the sinking. Thanks to their autonomous underwater remote submersible equipped with side-scan sonars, the American seabed exploration company Ocean Infinity has managed to identify the scattered debris field on a small ridge in a ravine, about 920 m deep. The coordinates of the wreckage published on the Argentinian website la nacion (www.lanacion.com.ar) and Wikipedia (https://en.wikipedia.org/wiki/ARA_San_Juan_(S-42)) are 45.950 S, 59.773 W, thus offering the opportunity to confront such ground truth data with locations calculated from signals recorded at the two remote hydroacoustic stations.

In this paper, we describe how fine-tuned frequency-dependent array processing reveals reflected and refracted late arrivals often hidden in the oceanic background noise. The relevant use of a small subset of these arrivals coupled with the joint inversion of both San Juan and calibration event unknown parameters allow to constrain with great precision the location of the wreckage. We also propose a source model consisting of two impulsive sources separated by a small distance in space and time (henceforth referred to also the ‘source pair’) from the interpretation of the first cespstral peaks of all detected arrivals, generally used to characterize underwater explosions.

In the first part of this paper, the propagation conditions of direct paths to HA10 and HA04 are studied for San Juan and calibration events. The shape, duration and spectral features of the high amplitude signals, as well as the array processing results obtained from PMCC (Cansi 1995) analysis, are presented. A first preliminary location is calculated from these two arrivals. In the second part, the lower amplitude late arrivals matching with the multiple reflections/refractions of the acoustic waves on large bathymetric structures are identified thanks to the fine-tuned detector. Only a small subset of these secondary arrivals is used to calculate a refined location, for which the different sources of uncertainties are well controlled. Finally, cepstral analysis of both San Juan and calibration event signals is presented. For the San Juan event, the slightly changing position of the first cepstral peak within very short time scales of a few tens of milliseconds allows to emphasize a clearly different behavior from that of the calibration event. We interpret it as a source effect: we think that two events occurred, corresponding to the ARA San Juan source pair.

This paper presents results in terms of location precision and source mechanism associated to the ARA San Juan loss, from only two very distant hydrophone stations at 6000 and 8000 km. It clearly demonstrates the capability of the hydroacoustic component of the IMS network to enforce the nuclear test ban treaty.

1.1 Analysis of Direct Arrivals from ARA San Juan and Calibration Events

In this first part, direct paths to hydrophone stations H10N (north triplet of Ascension Island station HA10, in mid-Atlantic Ocean) and H04S (south triplet of Crozet Island station HA04, in Indian Ocean) are studied. North triplet of Crozet Island station H04N was not operational at the San Juan event time. South triplet of Ascension Island station detected both events, but array element H10S1 was not working, preventing array processing. Signals from that station, very close to those of H10N, will not be presented in this paper.

The propagation conditions are first presented in Sect. 1.1. Then the shape, duration and spectral content of the signals will be presented and discussed in Sect. 1.2, and the PMCC detection results will be shown in Sect. 1.3. After studying the sources of uncertainties associated to the problem in Sect. 1.4, a preliminary location from direct arrivals is calculated in Sect. 1.5.

1.2 Propagation Conditions to Detecting Hydroacoustic Stations

The bathymetric map and the vertical sections along the geodesic paths to the three closest IMS hydrophone stations (H10N, H04S and H08S) are plotted in Fig. 1. The vertical profiles to H10N (Ascension Island, 6035 km) and H04S (Crozet Island, 7760 km) do not exhibit significant bathymetric structures (islands or seamounts) which could significantly block the direct propagation of H-waves (in-water generated hydroacoustic waves) to these two stations. However, Rio Grande Rise which culminates at 700 m depth in the South Atlantic Ocean partially blocks the propagation to H10N and scatters part of the energy above and around it. 3-D effects (Heaney and Murray 2009) are not studied in this paper, only basic in-the-plane scattering has been modelled. The propagation to station H08S (Diego Garcia, 12,400 km) is totally blocked by South Georgia Island, explaining why this station did not detect the events.

Fig. 1
figure 1

Bathymetric map of the area of interest (a), vertical bathymetric and sound speed profiles along the paths to H10N (b, 6035 km), H04S (c, 7760 km) and H08S (d, 12,400 km). Colors in the vertical profiles represent the sound speed. The SOFAR channel, corresponding to the minimum temperature layers (in blue), plays a very important role for the propagation of hydroacoustic waves over long distances. The 400 and 1000 m isobaths are indicated on the map (a) as black lines, they correspond to the depths at which intersections with bathymetric structures can significantly affect the propagation. While the propagation to stations H10N and H04S is not blocked (in South Atlantic, Rio Grande Rise partially blocks propagation to H10N), the propagation to H08S is totally blocked by South Georgia Island

In Fig. 1, blue colors in three the vertical sound speed profiles bottom panels correspond to depths of minimum temperatures, located around 1000 m deep in the mid-latitudes (path to H10N, and end of the path to H08S) while they sharply transition to the sea surface at the circumpolar front, in the southern latitudes (whole path to H04S and beginning of the path H08S, when tangential to Antarctica). These minimum temperature layers, known as the SOFAR (SOund Fixing and Ranging) channel, trap efficiently the acoustic energy and play the role of very efficient ducts in which incoming hydroacoustic waves can propagate over very long distances with low attenuation. Sound propagates via a series of upward- and downward-turning refractions having little interaction with the seafloor or sea surface, where most loss occurs. The attenuation due to geometric spreading is cylindrical (~1/r), making transmission significantly more efficient relative to solid-Earth phases that undergo spherical spreading loss (~1/r2). For San Juan and calibration events, the range-dependent variation of the sound speed, the thickness and the depth of the SOFAR channel, in addition to the interactions with the bathymetry, allow to explain at first order the shape, the duration and the spectral content of the signals generated by an impulsive-like source.

1.3 Signal Features of Direct Arrivals: Shape, Duration and Spectral Content

The signals and spectrograms at stations H10N and H04S for the San Juan and the calibration events are shown in Fig. 2. For each station, the spectrograms of the three array elements are time-delayed from calculated back-azimuth and slowness of direct arrivals (see array processing Sect. 1.3), and stacked. Only the 1–40 Hz frequency band is shown, where spectral amplitudes are the largest at both stations, and in which spectrum scalloping is the most visible. The signals recorded at H10S are not shown since they exhibit features very close to those of H10N. Duration is simply a little shorter.

Fig. 2
figure 2

Comparison of signals and spectrograms of the San Juan event (two top panels) and calibration event (two bottom panels), on stations H10N and H04S. The 5 min displayed signals are filtered between 1 and 60 Hz (2 min before onset time of direct H arrival, and 3 min after). The great similarity between the shapes and the durations of the signals observed at each station between the two events confirms the impulsive nature of the San Juan event. The differences in spectral levels (up to 8 dB, or a ratio of 2.5 in amplitudes below 20 Hz) demonstrate that San Juan event is more intense than calibration event. However, due to the slightly different propagation paths to the stations for both events and unknown depth of the San Juan event, the absolute strength of each source cannot be drawn from the simple comparison of the received levels. The presence of absorption lines in the spectrograms (spectrum scalloping), characteristic of interference between wavefields (generally associated to bubble pulses generated by underwater explosions) suggests the presence of echoes in all the signals, whose delays differ between both events

The shapes and durations of the signals from San Juan and calibration events are comparable at the two stations, and are both consistent with typical impulse-like generated events. Only amplitudes are significantly different, with higher values for San Juan.

  • At H10N, 6035 km, signals from both events last about 10 s with high signal-to-noise ratios. They are broadband, with energy above oceanic background noise between 1 and 80 Hz. The signal is well dispersed but a distinct peak is visible about 5 s after the onset, with a maximum peak overpressure level of 139 dB re 1 µPa for San Juan and 130 dB re 1 µPa for calibration event. Their spectral signatures exhibit differences: the dominant frequency of the San Juan signals lies between 7 and 10 Hz, while it lies between 15 and 20 Hz for the calibration event. However, this difference in frequency at peak spectral level can be a propagation effect and does not necessarily indicate differences in source characteristics. Moreover, the spectral absorption lines (peaks and valleys visible as horizontal lines in spectrograms) associated to the scalloping in the spectrogram are different between both events.

  • At H04S, 7760 km, the signals are more dispersed (about 30 s) than at H10N, with lower signal-to-noise ratio for calibration event than for the San Juan event. They are also broadband, but stronger attenuation of the higher frequencies above 40 Hz at H04S (not shown here) has been highlighted by Nielsen (Nielsen et al. 2019). He explains that this lack of high frequencies could be due to the interaction of small wavelengths with the Antarctic ice sheet, which extends until the southern latitudes of the propagation path at this season. Maximum peak overpressure level is 134 dB re 1 µPa for San Juan and 122 dB re 1 µPa for calibration event. Spectrum scalloping is also very pronounced, with comparable peaks and valleys in spectrum levels to those of H10N. Unlike H10N, the H04S signals exhibit a clear linear upsweep dispersion curve between 3 and 15 Hz for the San Juan event (see annotation in Fig. 2). For calibration event, this effect is less pronounced (even if we can guess it on the spectrograms), since the dominant frequency is higher. This feature corresponds to mode-1 of the surface duct formed by the cold polar waters, it is well restored by broadband full-wave modelling. It is also interesting to pinpoint that for San Juan, a precursor is possibly observed twenty minutes before the direct arrival (at 14:59 TU, not shown here). This observation could be related to propagation through Antarctic ice (Nielsen et al. 20182019).

Spectrum scalloping is a very common observation in underwater acoustics, and is a characteristic of interfering wavefields with the same frequency content (doublet signature). They are mostly associated with bubble pulses generated by underwater explosions, and are largely documented in the literature (Gitterman 2009; Hanson et al. 2007).

In the lower frequencies (below 20 Hz), the signals from the San Juan have a power up to 8 dB (ratio 2.5 in amplitude) higher than the signals from the calibration event. However, it is interesting to point out that for San Juan, the measured amplitude ratios at the two stations are not the same between the two events (5 dB for San Juan, 8 dB for calibration event). The differences in received levels are affected by the source strength, source depth and propagation conditions. As a consequence, the observed differences between San Juan and calibration events are most likely explained by the interaction of the acoustic wavefields with bathymetry along propagation to the stations. Within fifteen days, propagation conditions in SOFAR channel do not have significantly changed, while propagation paths are not exactly the same for the two events. In particular, Rio Grande Rise and the south of Sandwich Islands partially block more or less the propagation, depending of source position with great sensitivity. As a conclusion, the absolute strength of each source cannot be drawn from the simple comparison of the received levels. However, a comparison of the shapes and durations of the signals confirms the impulsive nature of the San Juan event.

1.4 Array Processing: DTK-PMCC Analysis

In this section, we present array processing results for the San Juan event direct arrivals at H10N and H04S, obtained with DTK-PMCC analysis (Dase TooKit Progressive Multi-Channel Correlation method (Cansi 1995)). For T- and H-waves (T-waves are seismic waves which convert to hydroacoustic waves at ground-ocean interfaces while H-waves are in-water generated hydroacoustic waves) which propagate along topographically unblocked geodesic paths, the three triplet channels generally show good signal correlation over sliding windows and various frequency bands. Even if the PMCC detector was originally developed for processing data from seismic arrays (Okal and Cansi 1998), it was later also successfully applied to data from infrasound arrays (Blanc et al. 1103) and hydroacoustic triplets of hydrophones. For example, Guilbert (Guilbert et al. 2005) wrote a paper in which he proposed a rupture extent of the great 2004 Sumatra-Andaman earthquake based on PMCC array processing at H08S Diego Garcia station. Currently, DTK-PMCC is the operational detector used to process infrasound data at the International Data Center (Mialle et al. 2019). DTK-PMCC (command line detector) and DTK-GPMCC software (Graphical User Interface which allows to configure, run the detector and interactively browse the results) is developed by the French National Data Center (CEA/DASE) and is made available by CTBTO to State Members through the NIAB (NDC-In-A-Box) project (Becker et al. 2015). Hydroacoustic support has been added in 2018 with automatic configuration adapted to IMS ~2 km-aperture hydrophone arrays. All results presented in this section can be reproduced by any CTBTO principal user (Member State user which can access IMS raw waveform data, IDC data bulletins, and the NIAB virtual machine).

For the San Juan event, ten frequency bands between 1 and 40 Hz are processed. The screenshots of DTK-GPMCC software are shown in Fig. 3. The detections are represented in time-frequency space and in polar diagrams. They exhibit the following features:

  • The directions of arrival (back-azimuth) are very stable in time and frequency at the two stations (217.1 ± 0.05 at H10N, 223.6 ± 0.1 at H04S). Such a precision in the measurement of wavefront parameters is really specific to hydroacoustic detection when the signal-to-noise ratio is high. Indeed, in infrasound and seismic technologies, the distribution of PMCC pixels is systematically much more scattered (a pixel is a unitary time-frequency cell in which detection test is performed and wavefront parameters are estimated).

  • At both stations, the apparent velocity (or trace velocity) decreases over time (from 1.5–1.475 km/s at H10N, and from 1.485–1.460 km/s at H04S). This classical observation at hydroacoustic arrays can be interpreted as a propagation effect over great distances: the waves which propagate in the SOFAR channel by upward- and downward- turning refractions with higher incidence angles (and which therefore have greater trace velocities than the value of the sound speed at the station) travel over greater distances than the waves with low incidence angles. However, the waves that have higher propagation angles with respect to the horizontal travel through faster layers of the ocean. In this case, these faster layers compensate for the additional traveled distance so that higher incidence angles arrive before lower incidence angles. Speaking normal modes formalism, the earlier arrivals are higher order modes which in deep-water propagation have faster group velocities. This observation is typically the opposite of what is expected in infrasound technology for ground-to-ground propagation: the trace velocity increases over time because the sound speeds of the fast atmospheric layers (generally the stratopause, around 50 km altitude) are not fast enough to compensate for the additional traveled distance (Ceranna et al. 2009).

  • For the two stations, the minimum trace velocity reached at the end of direct arrivals (1.475 km/s at H10N, 1.460 km/s at H04S) match with the sound speeds at hydrophone depths expected at this time of the year (these values can be extracted from Fig. 2). This confirms that the slowest arrivals arrive with incidence angle close to horizontal.

  • The back-azimuth of the signals recorded at H04S and H10N appear to change over time, especially at the beginning of the signal, when amplitudes are low (blue points on polar diagrams of Fig. 3). This observation could be due to local effects of reflection/refraction close to the station. In addition, it is interesting to note that unlike H10N on which the detected back-azimuth is very stable (within 0.05°) when signal amplitudes are high, it slightly changes at H04S over the duration of the highest signal amplitudes (within 0.2° over 25 s). This original observation could be associated to frequency-dependent horizontal refraction of waves along the colder polar water columns.

  • The dispersion curve observed on H04S spectrograms (Fig. 2) is also visible at the detection level (see Fig. 3), especially on trace velocity and correlation coefficient parameters.

Fig. 3
figure 3

Interactive PMCC analysis carried out with DTK-GPMCC software. The configuration settings have been adapted to the frequencies of hydroacoustic waves (here ten frequency bands between 1 and 40 Hz) and to the ~2 km aperture of the IMS hydrophone stations. At the top, the results are those obtained at H10N (10 s of coherent signal), below at H04S (30 s of coherent signal). Above the signals, from top to bottom, the back-azimuth, trace velocity and correlation coefficient parameters are represented in color, in time-frequency space. On the right, the same PMCC pixels are plotted in a polar diagram, in which the radius represents the trace velocity and the color is time. We clearly see that the sensitivity of this analysis allows to resolve with great precision the coherent hydroacoustic wavefield in time-frequency space, by efficiently constraining the wavefront parameters and highlighting propagation effects at detection level (decrease of trace velocity over time, dispersion effect at H04S)

The precision of the obtained results demonstrate the efficiency of the PMCC method to characterize the hydroacoustic wavefield in time-frequency space with a good resolution.

From the arrival times and the back-azimuths estimated from array processing, a two stations location can be calculated (see Sect. 1.5).

1.5 Sources of Errors: Limits Associated to The Use of Direct Arrivals

For the estimation of the uncertainties associated to the location process (generally represented as 95% confidence ellipse), measurement and modelling errors must be correctly estimated. For San Juan and calibration events, we identify the main sources of these errors as the followings:

  • Errors associated to the model

    1. o

      For each arrival time which is picked, a velocity model is associated to it. In French National Data Center operational hydroacoustic automatic pipeline (recently put in place), a spatial- and time-independent averaged velocity model of 1.482 km/s is used, with a standard deviation of 10 m/s. This sound speed provides generally satisfactory results for mid-latitudes propagations paths and for the precision expected from automatic processing. However, range- and depth-dependent oceanic sound speed may have a significant impact on averaged sound speed along propagation paths to each station, and so on location results. For San Juan and calibration events, this velocity model is particularly not suitable for the polar path to H04S. Propagation modelling is thus necessary to calculate more realistic velocity models. As the low-frequency propagation is only effected by slowly varying temperature structure, tabulated seasonal ocean atlases are planned to be used in a near future.

    2. o

      On triplets of hydrophones, each arrival also has a back-azimuth estimated from array processing (see previous section). This measurement must be corrected by an eventual azimuth deviation model. This deviation may be caused by horizontal refraction of waves along warmer or colder water columns: when propagation occurs in a cooling medium, waves refract in the direction of lower sound speed waters (they are like “attracted” to colder water). The propagation towards H04S is typically in this configuration with a path which passes tangentially near Antarctica and a SOFAR channel which cools along the first half of the propagation path (see panel (c) of Fig. 1). It is also interesting to point out that the azimuth deviations of acoustic waves in hydroacoustic and infrasound are not caused by the same physical mechanism. In infrasound, the deviations are not induced by a refraction process, but by advection: transverse atmospheric winds translate horizontally the wavefront. Especially during stratospheric winter, winds at 50 km altitude often exceed 100 m/s, causing azimuth deviations which can reach 15°. In hydroacoustics, they are induced by refraction caused by horizontal temperature gradients which rotate the wavefront. The maximum expected azimuth deviation from horizontal refraction is estimated to 0.5° (Li and Gavrilov 2009). In our study, azimuth deviations have not been modelled; standard deviation of errors is set to 0.5° consequently.

  • Errors associated to the measure

    1. o

      When impulse sources of short duration generate emerging signals of several tens of seconds because of propagation effects (dispersion in our case), time picking can be tricky. For example, it can be done at the beginning of the emerging signal (but this onset time is sensitive to the background noise level), at maximum amplitude, in the middle of the signal. Actually, the rule concerning the way to pick is that it has to be “feature based”. This approach is also described in Dall Osto’s paper (D. R. Dall’Osto 2104). In other words, common features between recorded and modelled signals have to be identified, so that picks and extraction of averaged sound speeds have to be done accordingly. This rule is very different from the way the onset times of seismic arrivals are systematically picked (but it actually corresponds to the way the tabulated travel time tables were built). Synthetics waveforms are calculated from 2-D TDPE (time‐domain parabolic equation) full-wave modelling, using a band limited impulse ricker source time function (with a dominant frequency of 7 Hz), at ARA San Juan wreckage position. The good match between the observations and the simulations allows to identify the onset times as the common extraction features. As a consequence, picks of the H-waves are done at onset times, and associated velocity models are calculated from the travel times of the onset times picked on the modelled waveforms. In this method, it is important to assume that the shape of the modelled waveforms do not change with source characteristics (controlled here by its dominant frequency, between 5 and 15 Hz). Details about full-wave modelling and comparisons between observed and synthetics are presented in next section. If location process is based on group velocities of specific modes, error depends on the precision in “feature picking” of the mode arrival time within the recorded signals (like the distinct arrival at H10N or dispersion at H04S). A remaining 5 s standard deviation of the error made on this time measurement is used.

    2. o

      A standard deviation of the error made on back-azimuth measurement also has to be taken into account. Part of this error (common to all waveform technologies) is caused by waveform sampling which discretizes the estimation of the wavefront parameters. Indeed, time delays for all couple of sensors are computed from time positions of the maximum of the intercorrelation functions; they are so multiples of the sampling period. In hydroacoustics, the sampling frequency of hydrophone data channels is high (250 Hz), this error is therefore negligible (less than 0.05°). The second part of this error is more difficult to estimate and comes from the fact that the relative positions of the array elements can slightly move. Indeed, since the lower parts of the hydrophones are connected to the seabed by cables, and the upper parts to a buoy, they can move slightly under the effect of the underwater currents and the internal tides. The azimuth measurement error is related to the uncertainty on the relative position of the three array elements. Assuming that they move 50 m maximum, the standard deviation of the error made on back-azimuth related to the position of the array elements is estimated to be 0.3° (Li and Gavrilov 2009), which remains a small error.

1.6 Preliminary Location

The cumulative errors relative to pick time and back-azimuth parameters, related to both model and measurement, have been identified and are fairly small (10 m/s for velocity models and 0.3 + 0.5 = 0.8° in back-azimuth). However, the only two detecting stations added to the very large propagation distances (6000 and 8000 km) make the problem poorly constrained. Consequently, the results remain very sensitive to these errors. To illustrate that point applied to the geometry of our problem, an error of 0.5° in back-azimuth results in an uncertainty of 60 km on event spatial location. An error of 3 m/s on velocity models results (or 3 s in feature picking) results in an uncertainty of 5 km on event spatial location. That is why the sum of these errors leads to rather large location uncertainty ellipses using only the measurements made on direct waves at the two stations. Indeed, the large uncertainty ellipse elongated in the east-west direction is a consequence of the distant stations and the relative arrangement between HA10 and HA04.

A preliminary location has been calculated from the maximum number available measures: we have used eight arrival times (three on H10N, two on H10S, three on H04S) and two back-azimuths measurements (one on H10N, one on H04S, none on H10S since one array element was down). This classic location procedure consists in minimizing the weighted residuals in time and in back-azimuths, as done for seismic and infrasound technologies, but with parameters adapted to hydroacoustics. 2-D probabilistic density functions are estimated on a 2-D grid centered on the last know position of ARA San Juan, where minimum provides the best location (Vergoz et al. 2019).

The results are presented in Fig. 4. The 95% confidence ellipse covers an area of almost 13,000 km2 (east-west oriented major axis is 160 km, north-south oriented minor axis is 100 km). At the western side of the ellipse, the seafloor is less than 100 m deep while at the eastern side, it is slightly more than 3000 m. Considering the poorly constrained problem, such results can be considered quite satisfactory. Even if the use of back-azimuths partially decouples the location and origin time of the event, the uncertainties remain large and the seafloors inside confidence ellipse cover shallow to relatively large depths, which is problematic especially if such data had to be used for a search mission.

Fig. 4
figure 4

Preliminary location of ARA San Juan obtained from ten measurements made on direct waves (eight arrival times, two back-azimuths) at stations H10N, H10S and H04S. Despite fairly small and well-identified sources of errors attached to the measurements and the models, the poorly constrained problem leads to a relatively large location ellipse (13,000 km2, 160 km along the major axis and 100 km along the small axis). The calculated location is in orange and annotated “NDC”. The REB location calculated by the IDC from the same set of stations is in yellow (it has been extracted from November 28, 2017 Reviewed Event Bulletin). For information, the location of the calibration event is indicated by the green arrow. The map on the left is a top view, and the map on the right is a three-dimensional view in which 95% confidence ellipse has been projected on the seabed

In order to improve this location, another approach using a subset of secondary arrivals was carried out. It is presented in the following section.

2 Refined Characterization of ARA San Juan Source

In order to refine the location and better understand the recorded signals from San Juan and calibration events, basic propagation modelling was first performed (Sect. 2.1). Then, a detailed analysis of the PMCC detections was extended to longer time periods than those studied in Sect. 1.3. Thanks to new revealed late detections, thirteen secondary arrivals were identified. They are associated to reflections and refractions of waves from well-identified bathymetric structures (Sect. 2.2). The use of only a well-chosen small subset of these late arrivals leads to a much better constrained location (Sect. 2.3). Finally, a detailed cepstral analysis of direct and secondary arrivals highlights different behavior for both events. Based on the interpretation of the position of the first cepstral peak of each arrival, we propose a double source mechanism for the San Juan event (Sect. 2.4).

2.1 Propagation Modelling

Basic two dimensional Parabolic Equation (2-D PE) RAM (Collins 1993) propagation simulations have been performed for the H10N and H04S stations, at different frequencies. This code, widely used for long range propagation, takes into account full range- and depth-dependent environment extracted from the Generalized Digital Environmental Model (GDEM3.0) climatology oceanographic databases (Rosmond et al. 2002) and SRTM30_PLUS bathymetry (Becker et al. 2009). In the RAM implementation used in this study, absorption by sea-water is not included, even if its contribution to the total loss could be significant for the considered propagation distances. For example Ainslie (Ainslie 2010) explained that for frequencies below 100 Hz, measurements of absorption in sea-water provide an attenuation in the range 0.0002–0.004 dB/km. Over a propagation range of 8000 km, this translates to a loss of 1.6–32 dB. In Fig. 5, the obtained vertical transmission loss maps and horizontal cross sections at 800 m for H10N and 500 m H04S (corresponding to station depths) are plotted for the San Juan event, at 7 Hz (dominant frequency of San Juan signals). Coordinates of wreckage (see introduction) and an arbitrary source depth of 500 m have been used. Even if source depth is unknown, it is important to note that coupling of acoustic energy into the SOFAR channel is not sensitive to the source depths larger than 300 m in this area.

Fig. 5
figure 5

Transmission losses at 7 Hz along paths to H10N (top) and H04S (bottom). In top panels of each station, range- and depth-dependent vertical transmission losses are represented in color (colorbars are adjusted for each path) for sound speed profiles shown in Fig. 1. Bottom panels of each station are horizontal cross sections at a depth of 800 m for H10N and 500 m for H04S (corresponding to station depths). In hydroacoustic, this type of representation is very useful to visualize the energy losses from bathymetry interaction, at a given frequency. On path to H10N, acoustic energy is scattered at the Rio Grande Rise (signal amplitudes are divided by 10, corresponding to a transmission loss drop of 20 dB), couples back into the SOFAR channel, and finally propagates with low loss until the station. Propagation effects associated with the range- and depth-dependent variations of the SOFAR channel can be clearly seen on H10N with changing upward- and downward-turning refractions depths. On path to H04S, series of upward-refractions and sea surface reflections are typical of low latitude polar propagation

On path to H10N, we clearly observe the scattering of acoustic energy generated by the Rio Grande Rise, causing an attenuation of 20 dB at 7 Hz (signal amplitude divided by 10). This value of attenuation is very sensitive to the precise shape and depth of this large bathymetric structure. Estimated transmission losses are therefore very sensitive to the precision of the bathymetric profiles and the exact wave paths through it. 3-D effects (Kevin and Campbell 2016) deserves to be taken into account, but are not considered in this paper. A sensitivity study would be necessary to link the absolute amplitudes recorded to H10N and the acoustic energy of the San Juan event. With the models used, the predicted attenuation at H10N is 138 dB rel. amplitude at 1 m.

On path to H04S, only the South American continental shelf slightly scatters acoustic energy after 500 km of propagation. Then, acoustic waves remain efficiently ducted inside the SOFAR channel until the station, without being hindered. Transmission losses are then dominated by cylindrical geometric expansion (~1/r). With the models used, the predicted attenuation is 132 dB rel. amplitude at 1 m, at 7 Hz. Even if H04S is 2000 km further away than H10N, propagation modelling predicts lower amplitude at 7 Hz at H10N. The sound speed profiles and hydrophone depths (~500 m for H04S and ~800 m for H10N) are different at the two stations, which may impact the received levels. Moreover, scattering caused by Rio Grande Rise along H10N path and Sandwich Islands on H04S path, and bathymetry close to the stations can focus the acoustic field. All these reasons can explain why the received levels are higher at H04S than at H10N.

SOFAR channel warming is clearly visible along H10N path: at the end of the propagation, waves are ducted deeper than at the beginning. Towards H04S, the waves are ducted close to the sea surface via a series of upward-refractions and sea surface reflections (as expected for a polar path for which minimum temperature is close to the sea surface). The interaction with the sea surface and the seasonal Antarctic ice-sheet, which partially intersects the geodesic propagation path (http://www.natice.noaa.gov/products/daily_products.htm), would deserved to be taken into account as highlighted by Nielsen (Nielsen et al. 2019).

In a second step, time-domain full-waveform modelling was performed with 2-D TDPE (time‐domain parabolic equation, which is the time response synthesized by the inverse Fourier transform of the frequency-domain parabolic equation PE (Collins 1993)). Like with the RAM model, sea-water absorption in not taken into account. The source time function chosen for the calculation is a band limited impulse ricker function (with a dominant frequency of 7 Hz), at ARA San Juan wreckage position, 500 m deep and an origin time at 13:51:05 UTC (see Sect. 2.3). No bubble pulse is modelled. The oceanographic databases (sound speed, bathymetry, seabed properties) used are the same as for the previous RAM runs. Such long range simulations are very time consuming at high frequencies, that is why for H10N the synthetics signals were limited to [0–30 Hz] frequency band, 30 s time window, and for H04S to [0–20 Hz] frequency band, 50 s time window. The results of these simulations are shown in Fig. 6.

Fig. 6
figure 6

In top panel, comparison of signals recorded at H10N1 and H04S1 (in purple), and synthetics (in yellow). Normalized waveforms are filtered between 1 and 20 Hz, and a common time scale of one minute is used for both stations. Corresponding spectrograms are plotted in bottom panel in the same order. The signals and spectrograms are time-aligned on the onset times (chosen as “pick feature”) of H-waves for easier comparison. The shapes and durations of the signals are well explained by propagation modelling at the two stations. The spectral contents between 1 and 20 Hz are also well restored, especially the upsweep dispersion curve at H04S, between 3 and 15 Hz. The velocity model corresponding to modelled H onset time at H10N (1.476 km/s) is close to the measured one (1.477 km/s). However, at H04S, the modelled H onset time is significantly too fast (1.471 km/s modelled from GSDEM2.5 oceanographic databases, versus 1.463 km/s measured)

The obtained results are satisfactory: the shapes, the durations and the spectral contents of the signals between 1 and 20 Hz are well explained by propagation modelling. In particular, the dispersion curve at H04S is well restored, although a less linear upsweep than in the data. On the other hand, the simulated velocity model is significantly too fast (1.471 km/s at onset time) for this station, and does not correctly explain the observations (1.463 km/s at onset time). The shapes and the durations of the modelled signals are not sensitive to absolute values of the velocity models, but to the shape and variation of the SOFAR channel along the propagation paths. However, arrival times are very sensitive to these values. We observe here the limits of GDEM3.0 climatological oceanographic databases which apparently overestimate the temperature of the SOFAR channel in the southern latitudes, at this time of the year. More realistic operational oceanic models which integrate general circulation models and data assimilation, like HYbrid Coordinate Ocean Model (Chassignet et al. 2007) (https://www.hycom.org/ocean-prediction) or ECCO2 model (Menemenlis et al. 2008), should be used instead. However, these aspects were not addressed in the present study.

The simulated average sound speeds along propagations paths, calculated from the onset times (pick feature) of the synthetic signals, cannot be used to refine the location, because of its overestimation at H04S from used oceanic model.

2.2 Analysis of Secondary Arrivals

At H10N, when we extend array processing calculations to longer time windows than those used to characterize the direct arrivals of the San Juan event, we notice several low amplitudes detections within the fifteen minutes following the most energetic signal (see top panel in Fig. 7). We know that when large underwater earthquakes occur, P and S seismic body waves can convert to hydroacoustic T-waves at ground-ocean interfaces, even relatively far from the hypocenter. Coupling efficiently highly depends on earthquake size and depth, as well as seafloor depth and shape in the vicinity of the epicenter. Moderate and large underwater earthquakes occurring in subduction zones (typically as reverse and normal faulting) or ocean ridges (as normal and transform strike-slip faulting) can typically generate this kind of detection patterns. Using PMCC detector, Graeber (Graeber and Piserchia 2004) illustrated this point by mapping zones of T-wave excitation in the NE Indian Ocean, from moderate subduction earthquakes (M4.4 to M6.8) occurring along the Sumatra trench. In our case, the San Juan event not being an earthquake, those detections did not first catch our attention, especially since the detected back-azimuths were very fluctuating (alternating between south-west and west directions). But even if the source mechanism is different and ground-to-water coupling is not of interest in this study, T-waves reflection process on bathymetric structures remain the same.

Fig. 7
figure 7

Interpretation of secondary arrivals of the San Juan event. On H10N, the thirteen detections arriving within the fifteen minutes following the direct arrival are all reflections (or equivalent reflections in case of refraction) on more or less large bathymetric structures. The red lines are the direct geodesic paths to the stations. The yellow and purple lines are the thirteen reflection paths. Yellow points are the raw back-projected PMCC pixels used to identify all the reflectors. They are calculated from each pixel arrival time and back-azimuth, assuming a constant velocity model of 1.47 km/s and REB location. The challenge of the location refinement is to remove the offsets between the positions of the yellow dots and their associated reflectors. These reflectors are islands (3, 13, 14), seamounts (2, 4, 7, 12) and the South American continental slope (5, 6, 8, 9, 10, 11). Each reflector is indexed on the map and refers to a detection in PMCC diagram, in the top panel. On H04S, only a possible reflection on the southernmost part of the South American continental slope is found one minute after the direct arrival (reflector 2b). Actually, among the fourteen reflectors which have been identified, only two really constrain the location (reflectors 2 and 3, yellow paths). A refined location using these reflectors is calculated in Sect. 2.3

Similarly, we have extended array processing calculations to longer time periods for the calibration event. We noticed that the high amplitude direct arrival was followed by a comparable sequence of low amplitude detections to that from the San Juan (the detection sequence of calibration event is not shown in this paper). As the energy of the calibration event is lower than the San Juan event (see conclusions of Sect. 1.2) and the oceanic background noise higher, the secondary arrivals with the lowest signal-to-noise ratios for San Juan are missing in calibration detection sequence. Late arrivals are therefore less numerous. However, it is important to pinpoint that for the common late arrivals between both events, time delays and detected back-azimuths of matching detections are very close. This observation clearly confirms that, at H10N, all the detections within the fifteen minutes following the direct arrivals are related to the same events.

In order to identify their origin, all PMCC pixels belonging to late detections were back-projected onto a map, under the assumption that they are associated to specular reflections on bathymetric structures. Horizontal refraction process from bathymetry (Munk and Zachariasen 1991) is not considered here for simplicity, even if it also plays an important role to redirect wavefronts seaward through multiple bottom interactions as they typically propagate up a continental slope. The projection method is very basic and is not sensitive to the distinction between reflection and refraction processes. In case of refraction, an equivalent reflection point is mapped. From the approximate position and origin time found for the San Juan event (see preliminary location results in Sect. 1.5), a coupling point is searched for each PMCC pixel along the great circle path starting from the station and in the direction of the detected back-azimuth. If it exists, the chosen mapped point is that for which the propagation velocities associated to the source-reflector and reflector-station paths are compatible with hydroacoustic sound speeds (typically 1.48 km/s here). The results of the PMCC pixel back-projection at 1.48 km/s are shown in Fig. 7. It is important to understand that this technique does not need any bathymetric models as inputs: from a source known in space and time, a unitary detection (pixel) characterized by its arrival time and back-azimuth, and two velocity models (source-reflector, reflector-station), a point is found on the map. The obtained point is used to manually identify potential reflectors, given the uncertainties associated to measured back-azimuth and used sound speeds.

This pixel back-projection at H10N has allowed to identify unambiguously thirteen reflectors (or equivalent reflectors) from the low amplitude detections within the fifteen minutes following the direct arrival. In Fig. 7, these reflectors are indexed on the map and refer to the detection numbers above PMCC diagram (at the top of the figure). Found reflectors are also summarized in Table 1 with their names and geographical coordinates.

  • Path 1, in red, is the direct geodesic path.

  • Reflector 2, along first yellow path, is a small seamount located east of the Rio Grande Rise. Its name is not referenced in seamount databases. However, although small, this seamount is the only bathymetric structure in the back-azimuth of the corresponding detection which reaches shallow enough depths (500 m) to effectively reflect hydroacoustic waves. The small size of this reflector can explain the associated low frequency narrow band and short duration detection compared to other reflectors (see top panel of Fig. 7).

  • Reflector 3, along second yellow path, is the easternmost island of the Trindade and Martim Vaz islands, offshore Brazil.

  • Reflectors 4 and 7 are seamounts. Reflector 4 is Columbia Seamount, it reaches 60 m depth. Reflector 7 is Davis Seamount, it reaches 50 m deep. Although bigger and shallower than reflector 2, associated detections are very short and narrow band. Even if their identification is unquestionable, this observation remains to be explained.

  • Reflectors 5, 6, 8, 9, 10 and 11 correspond to the South American continental slope. As shown by Dall'Osto (D. R. Dall’Osto 2104), “refractor” would be a more adequate term than “reflector”, but found locations listed in Table 1 are considered as equivalent reflection points. On PMCC diagrams, only the detections associated to these reflectors exhibit the same propagation features than the one observed on the direct arrival (trace velocity decrease over time, see Fig. 3).

  • Reflector 12 is a shallow seabed called Black Rock, west of South Georgia Island.

  • Reflectors 13 and 14 correspond to two separate reflection zones, west of South Georgia.

On H04S, only a small narrow band detection arriving one minute after the main arrival could be associated to a reflection on the southernmost part of the South American continental slope (see reflector 2b, in magenta, Fig. 7). Since the signal-to-noise ratio is very low, an ambiguity remains, but the detected back-azimuth, arrival time and reflection angles are compatible with this hypothesis.

Table 1 List of identified reflectors (or equivalent reflector in case of refraction) for H10N and H04S stations

To conclude this part, it is important to emphasize the quality of the obtained detections: from one single execution of the PMCC detector between 1 and 40Hz, we first highlight detailed propagation effects of each arrival in time frequency space (decrease of trace velocity over time, dispersion effects on H04S at detection level). But we also reveal small narrow band detections with low signal-to-noise ratios (see in particular detections 4, 10, 11, 12 and 2b). The back-projection of the corresponding pixels confirms that all these detections made at the oceanic background noise level are real, since some reflectors have been identified for all of them. Moreover, it also shows that the estimated back-azimuths are very accurate: for each identified small size reflector, the corresponding back-azimuth is always less than 0.3° from its theoretical value. We clearly illustrate here the capability of the PMCC detector to efficiently characterize wavefront parameters from the triplets of hydrophones of the hydroacoustic component of the IMS network, even from low amplitude arrivals.

2.3 Refinement of the ARA San Juan Event Location Using Secondary Arrivals

The positions of the identified reflectors and the arrival times of the detections to which they are associated contain additional information about the San Juan event location. If one single arrival time at each hydrophone station is kept for direct and reflected arrivals, and one back-azimuth for direct arrivals only, the number of measures increases from four (one arrival time and one back-azimuth for each direct arrival at H10N and H04S) to eighteen (fourteen arrival times and one back-azimuth at H10N, two arrival times and one back-azimuth on H04S). Each used late arrival is equivalent to add a new virtual station. The detected back-azimuths of secondary arrivals are not counted but they are used to identify and locate accurately the different reflectors.

On the other hand, without sophisticated 3-D modelling from realistic bathymetric and ocean data, all the reflectors do not have the same power to constrain the location of the San Juan event with the same precision: even if the number of measurements is significantly greater than when taking into account direct arrivals only, it is important to understand that location results are very sensitive to the exact positions of the reflection points (geographical location and depth).

  • Reflector locations depend on the detected back-azimuths, and therefore on the associated measurement uncertainties, and velocity models. For example, for reflections/refractions which occur on the South American continental slope (5, 6, 8, 9, 10 and 11 in Fig. 7), a maximum measurement error of 0.8 ° on the back-azimuth (see why in Sect. 1.5) results in the offset of the position of the equivalent reflection points by several tens of kilometers to the north or to the south (these values depend on reflector-station distance and the reflection angle. The hydroacoustic propagation being relatively slow (close to 1.48 km/s), the corresponding travel times are modified by a few seconds until several tens of seconds only because of the uncertainty about reflector location.

  • Reflection points depend on the reflection depth, which is not precisely known (it requires accurate bathymetric and velocity models in the vicinity of each reflector). Considering once again the reflections on the South American continental slope, even if the slope is quite steep (see 3D view of Fig. 4), a reflection (or refraction) occurring at 500 m or at 1000 m deep also shifts the reflector locations by a few tens of kilometers to the east or to the west. As previously, the corresponding travel times would be changed by a few seconds, up to more than ten seconds.

  • Two values of sound speed per secondary arrival shall be taken into account: one between the source and the reflector, and another between the reflector and the station. For example, for reflections on South Georgia (12, 13 and 14 in Fig. 7), the sound speeds between the source and the reflectors are slower (cold ocean) than between the reflectors and the station (warming ocean). Adding a second velocity model per secondary arrival, with their associated uncertainties, also results in a change of the travel times by a few seconds.

Finally, the use of the fourteen secondary arrivals to constrain the San Juan location adds new uncertainties associated to the position of the reflection points and to the taking into account of one second velocity model per arrival. Without precise propagation modelling, the time residuals induced by these uncertainties can be significant, and are therefore not acceptable given the fineness of the desired location accuracy.

However, among these reflectors, two have relatively unambiguous well-known positions: reflector 2 (nameless seamount east of the Rio Grande Rise, see its coordinates in Table 1) and reflector 3 (small easternmost Trindade and Martin Vaz Island, offshore Brazil). The corresponding paths are plotted in yellow in Fig. 7. Unlike the other reflectors, only these two ones have the following features:

  • They are very small and their slopes are very steep (almost vertical cliffs). As a consequence, the uncertainty on the reflection point is low (of the order of one kilometer) and the difference between reflection and refraction process is insignificant.

  • The associated propagation paths are very close to the direct paths (in red in Fig. 7). We can thus consider that the velocity models for the direct arrival are the same than those of secondary arrivals.

If we take into account these only two first secondary arrivals at H10N, the issues mentioned above about the additional uncertainties associated to late arrivals disappear. That is why the relocation proposed in this section finally uses only the two direct arrivals at H10N and H04S, and those associated with the two first reflections at H10N. This way, even if the number of measurements remains limited, the associated uncertainties are better controlled so that the location is better constrained.

The last remaining uncertainties are the velocity models to use for the two stations. Given the large propagation distances, the location accuracy is very sensitive to these models, as already mentioned in Sect. 1.5. We have also experienced that the velocity models extracted from climatological oceanographic databases are not accurate enough in our case (see full waveform modelling at H04S in Sect. 2.1). That is why we prefer to search the velocity models in addition to the location parameters.

The five searched parameters are thus the following:

  • Latitude, longitude and origin time of the San Juan event

  • Velocity model for H10N

  • Velocity model for H04S

Event depth is not searched, since it is too poorly constrained.

Four measures are selected:

  • Arrival time of direct arrival at H10N

  • Arrival time associated to reflectors 2 and 3 at H10N

  • Arrival time of direct arrival at H04S

Back-azimuths of direct arrivals are not used, because of their too high associated uncertainties (0.8°, see Sects. 1.4, 1.5) which would degrade the location accuracy.

The problem is thus under-determined: we have more parameters to estimate than measures. In that configuration, the San Juan event location cannot be constrained from the subset of those four arrival times only. As a consequence, the original idea to overcome that limitation is to jointly use the measures made for San Juan and calibration events. The six parameters to estimate become the following:

  • Latitude, longitude and origin time of the San Juan event

  • Velocity model for H10N (assumed to be the same for both San Juan and calibration events)

  • Velocity model for H04S (assumed to be the same for both San Juan and calibration events)

  • Origin time of calibration event (the spatial location is well-known, but the provided time is approximate, see the introduction).

The seven measures available are the followings:

  • Arrival time of direct arrival at H10N for San Juan

  • Arrival time of direct arrival at H10N for the calibration event

  • Arrival time of direct arrival at H04S for San Juan

  • Arrival time of direct arrival at H04S for the calibration event

  • Arrival times associated to paths 2 and 3 at H10N for San Juan

  • Arrival time corresponding to path 3 at H10N for calibration event only. Indeed, the low amplitude detection associated to reflector 2 is not observed for the calibration event.

In this configuration, there are seven measures for six parameters to be determined: the problem becomes over-determined and the location can thus be calculated.

The problem consists in the calculation of simple time residuals. A heuristic approach (brute force grid search) is naturally preferred: all searched parameters are meshed uniformly between minimum and maximum values, with very fine steps (1 s for origin times, 100 m for geographical parameters, 1 m/s for velocity models). Fifty-eight millions of calculations of time residuals have been carried out for each grid node, and the selected solutions are those for which these residuals are less than four seconds. Obtained results are shown in Fig. 8.

Fig. 8
figure 8

Relocation (green ellipse) of San Juan using jointly the direct and secondary arrivals of both San Juan and calibration events. Only a well-chosen subset of secondary arrivals has been selected. The orange ellipse corresponds to the preliminary location obtained from direct arrivals only (see Sect. 1.5). Thanks to the method used for the relocation process, the location uncertainties are reduced and the accuracy is significantly improved: the center of the green ellipse is located 3.5 km from the found wreckage of the ARA San Juan (black star). In the 3D view, the 95% confidence ellipses are projected onto the seabed. In the right panel, the a posteriori distributions of the four non-geographical searched parameters are plotted. Such statistics are calculated from the 40351 values of each parameter which provide RMS time residuals less than 4 s. It is interesting to pinpoint that the geometry and the data of the problem make the results not sensitive to the absolute values of velocity model at H10N, unlike H04S. However, it remains very sensitive to the time delays between direct and used secondary arrivals

Compared to the preliminary location obtained using direct arrivals only (see Sect. 1.5), results are significantly improved. The area of the 95% confidence ellipse is reduced by a factor of 100 (decreasing from 13,000–1400 km2). Its major axis is 57 km and is roughly oriented east-west, minor axis is 30 km and is roughly oriented north-south). On the westernmost side of the ellipse, the depth of the seabed is less than 100 m while at the easternmost part, it is around 1000 m. The center of the green ellipse (45.97°S, 59.81°W) is located 3.5 km from the found ARA San Juan wreckage. Such a result is remarkable given the only two detecting stations and their remote positions (about 6000 and 8000 km). Even if the number of measures selected for the relocation process remains limited, this result reflects that with a well-adapted method and a selection of relevant arrivals, location can be efficiently constrained using hydroacoustic data only.

In Fig. 8, the a posteriori distributions of the four non-geographical grid searched parameters are plotted. They correspond to the distributions of the parameters which provide RMS time residuals less than four seconds (40351 values). The a posteriori Gaussian-like distributions of latitude and longitude of San Juan are represented as the 95% (2σ) confidence green ellipse. Interestingly, the shapes of these distributions are different from one searched parameter to another, and illustrate well how they are constrained with this approach. The best (optimum) values of the parameters are extracted from the a posteriori distribution. For Gaussian-like distributions (all searched parameters except H10N velocity model), uncertainties are given at 95% (2σ):

  • Location of ARA San Juan: 45.97 °S, 59.81 °W ± 29 km (direction WNW-ESE) and ± 15 km (direction SSW-NNE);

  • Origin time of ARA San Jan: 2017/11/15 at 13:51:05 ± 9 s;

  • Origin time of calibration event: 2017/12/01 at 20:04:30 ± 14 s. This estimation is compatible with the helicopter drop time of the calibration grenade, provided by the Argentine Navy (2017/12/01 at 20:04:00);

  • Velocity model at H04S: 1.460 km/s ± 4 m/s. This value is compatible with the cold SOFAR channel along propagation path at this time of the year;

  • Velocity model at H10N is not constrained, interestingly: the geometry and the data of the problem make the results insensitive to the absolute value of the velocity model at H10N, at least within the acceptable searched bounds (i.e., between 1.475 and 1.485  km/s). However, location results are extremely sensitive to the time delays between the direct and the secondary arrivals.

To conclude this section, it is noteworthy to mention that using a completely different approach but also hydroacoustic data only, Dall'Osto (D. R. Dall’Osto 2104) succeeded in locating San Juan wreckage with an impressive precision from direct arrivals and arrival 5 only (highest amplitude secondary arrival that we rejected in our study). Using a sophisticated multi-level 3-D approach from precise ocean temperature models and bathymetry, Dall’Osto was able to simulate the bathymetric refraction from the South American continental slope associated to arrival 5, and efficiently triangulate source location. For the time being, French NDC does not have the capacity to conduct such in depths calculations. But our alternative approach based on detailed data processing, simple propagation considerations and the use of only a small subset of well-chosen secondary arrivals provide also nice results, even if final uncertainties of location parameters are a little larger than the one calculated by Dall’Osto. Nielsen et al. (Nielsen et al. 2020) and Heyburn et al. (Heyburn et al. 2020) propose a refined location using complementary data from the Argentine seismic station TRQA. The obtained locations are between 10 and 20 km from localized wreckage.

2.4 Interpretation of First Cepstral Peak: ARA San Juan Double Source Scenario

Ultimately, a cepstral analysis (Childers et al. 1977) was performed to try to characterize San Juan source in comparison to well-known calibration explosive source. This signal processing technique is widely used in speech processing to detection and remove echoes. In hydroacoustics, it is used in routine to detect and discriminate underwater explosions. In simple terms, one can think of cepstrum as a technique to detect echoes in time series.

An explosion taking place in an underwater environment has characteristics linked to the interactions between the energy and the released gases during detonation in water. In a study analyzing the explosion of the Kursk submarine, Sebe (Sèbe et al. 2005) plotted the temporal evolution of the pressure field during an underwater explosion (see Fig. 9). The effect of the explosion is to generate a pressure shock wave. This pressure peak is immediately followed by the formation of a gas bubble subjected to an expansion/contraction movement, which periods vary over time under the effect of the strong ambient static pressure. This pulsation period (or bubble period) can be expressed by the empirical formula by Willis (Chapman 1985) which relates the period, the explosion yield and the depth of the explosion:

Fig. 9
figure 9

Temporal evolution of the pressure field and the radius of the gas bubble from an underwater explosion (Sèbe et al. 2005)

$${\rm T} = 2.1{\frac{{w^{1/3}}}{(z + 10)^{{5/6}}}}$$

where T is the bubble period (in seconds), w the yield (in kg eq. TNT) and z the depth (in meters).

In theory, it is possible to observe several bubble pulses; but due to the large observation distances at which SNR of the echoes can be too low, it is only the first oscillation which is best observed on the signals most of the time. Moreover, if the source is too shallow, the bubble “vents” through the sea surface and does not have time to pulse. For the 108 kg eq. TNT calibration grenade event, three bubble pulses are clearly observed (see Fig. 10) versus only one pulse-like feature for the San Juan.

Fig. 10
figure 10

Cepstral analysis applied to the signals of the calibration event on H10N. In the bottom panel, the black curve is the cepstrum calculated on the direct arrival (shown as yellow rectangle in the top panel). The middle panel is a cepstrogram which describes the evolution of the signal cepstrum as a function of time, which amplitude is represented in color. For a given time, each vertical section is a cepstrum calculated on a three seconds time window. Times of cepstrum peaks correspond to the time delays between the direct arrival and its echoes. These delays can be read on the X-axis in the bottom panel and on the Y-axis in the middle panel. Three echoes associated with the three first bubble pulses are visible at t0 (454 ms), t1 (861 ms) and t2 (1.243 s). These values summarized in the table on the right are in agreement with the theory for an explosion of 108 kg at 33 m deep (and not planned 38 m). The peaks at 2t0, 3t0 and t0 + t1 are well-know artefacts of the used cepstral technique

On the signal of a pressure wave generated by an underwater explosion (here a hydroacoustic H-wave), an interference is created between the initial pressure peak and the oscillation movements of the bubble which goes up towards the sea surface. This interference takes the form of a modulation of the spectrum of the signal, which corresponds to the series of harmonics:

$$f_{{{\text{nr}}}} = nf_{{1r}} ,\ n = 2,3,...$$

where f1r is the frequency of the fundamental mode.

For the calibration grenade of December 1, 2017, the harmonics linked to the bubble pulsations are observed on the recorded signals up to a very high order (at least order 16), due to broadband and high frequency content. These harmonics are clearly visible in Fig. 2 and annotated as “spectrum scalloping”.

These spectral modulations can be characterized by signal processing techniques based on the cepstrum (Childers et al. 1977), and allow picking of delayed signals (or “echoes”). The same way as a spectrogram describes the evolution of a signal spectrum as a function time on sliding overlapping windows, a cepstrogram describes the evolution of a cepstrum of a signal as a function of time.

In order to validate the method, these cepstral techniques were first applied to the calibration event signals (see Fig. 10). Second, they were applied to the San Juan signals (see Fig. 11).

Fig. 11
figure 11

Cepstral analysis applied to San Juan signals. See Fig. 10 for the description of the different panels. Despite the higher amplitudes and the better signal-to-noise ratio compared to calibration event signals, only one single cepstral peak is clearly visible at t0 = 345 ms (versus three for the calibration grenade, see Fig. 10). The peak at t1 = 701 ms remains questionable. This observation suggests a different source mechanism between the two events

The cepstrum calculated on the direct arrival has a clearly distinguishable first peak at t0 = 0.454 s. This value corresponds to the time delay between the signal generated by the first pulsation of the bubble and the shock wave generated by the explosion. This value is in agreement with the theoretical values given by the Willis formula, for an explosion occurring at 33 m deep and a yield of 108 kg TNT. Assuming that the provided yield and the conversion to TNT equivalent are exact, the planned 38 m detonation depth has been raised to 33 m. This peak is also clearly visible on the cepstrogram of the signals associated with reflectors 2, 3, 5 and 8 (see Fig. 7). The following positive peaks at t1 and t2 correspond respectively to the second and third pulsation of the bubble. Even if these peaks are attenuated compared to the first one, they are clearly visible and in agreement with the theory, to within a few milliseconds (the theoretical and measured values are summarized in the table on the right of Fig. 11). The other peaks in the cepstrum at 2t0, 3t0 and t0+t1 can also be explained and are well-known artifacts of the used cepstral technique (see Appendix of Kemerait et al.’s paper (Kemerait and Childers 1972)).

This cepstral analysis carried out on the signals from the calibration event allows to validate the method since each significant peak is understood. As a consequence, the same analysis has been performed on the San Juan signals. The results are shown in Fig. 11.

Despite the higher amplitudes and the better signal-to-noise ratio than for the calibration event, the different cepstral peaks are less pronounced and less numerous. A first peak at t0 = 345 ms is visible on direct arrival and arrivals from reflectors 2, 3 and 5. The negative peak at 2t0 is weak and the peak at t1 = 701 ms is questionable. This observation is very interesting since three bubble pulsations are visible for the calibration event, although less intense.

In order to better understand the differences between San Juan and calibration events, the cepstrograms were extended to longer time periods. The cepstral peaks of direct and secondary arrivals are compared for the two stations H10N and H04S. The results are shown in Fig. 12.

Fig. 12
figure 12

Comparison of the cepstrograms of H04S (left) and H10N (right), for calibration event (top) and the San Juan event (bottom). First cepstral peaks are the most pronounced horizontal red lines. They are clearly visible for the direct arrivals (1 and 1b), and for the reflectors 5, 8, 12 and 14 at H10N. Corresponding delays are annotated on the cepstrograms. Compared to Figs. 10, 11, a zoom on the delay Y-axis is carried out between 300 and 500 ms. For the calibration event, all the times of the first cepstral peaks have an identical value of 454 ms within 1 ms, whatever station and whatever arrival. For San Juan, the delays vary from 314 ms for the reflectors 12 and 14, to 354 ms for the reflector 8 (northernmost reflection/refraction on South American continental slope). This unique observation is the evidence of the occurrence of two events, slightly separated in space and time

For the calibration event, all the times of the first cepstral peaks have a strict identical value of 454 ± 1 ms, whatever the station and whatever the arrival. This is an expected result for an isotropic explosive source. In the case of the San Juan event, we observe an anisotropic behavior of measured delays (see Fig. 13): the delays vary regularly from 314 ms for the southernmost reflector 14 (South Georgia) to 354 ms for the northernmost reflector 8. Unlike the calibration event, this observation is not consistent with the isotropy that is typical of an explosive source.

Fig. 13
figure 13

Regular variation of the delays measured from first cepstral peaks with emission angle, for the San Juan event. This anisotropic behavior is not consistent with an explosive source

We interpret these differences as a source effect: such delay variations could be generated by two events, slightly separated in time in space. An analogy can made with seismology when large earthquakes occur, for which source duration and rupture spatial extension are big enough for rupture kinematics to be observed at remote stations (Ammon et al. 2004). The seismic waves recorded at stations located in the direction of the rupture propagation have a shorter apparent duration than the seismic waves recorded at stations located in opposite direction. Such a directivity effect is better observed on SH body waves and surface waves, which propagate at velocities closer to the rupture velocity than P waves. In the case of the San Juan event, source effect would not be generated by the propagation of a rupture but by the occurrence of two non-collocated events.

The presence of only one single cepstral peak let think that there were exactly two non-pulsating events. Under this assumption, observed delays are actually the sum of two delays: (1) the delay separating in time the two impulsive events and (2) the delay associated to the difference of propagation time that hydroacoustic waves generated by each event need to reach the stations. This second station-dependent delay depends on the spatial distance separating the two events and the submarine’s route.

To test this hypothesis, we propose a simple double impulsive source model. The submarine has been schematized as an ellipsoid with a length of 66 m [(major axis) (https://en.wikipedia.org/wiki/ARA_San_Juan_(S-42)]. If we consider that the submarine is separated in two compartments by its middle, and we assume the center of each compartment to be the location of one source, distance between the two events is 33 m (see Fig. 14). The theoretical delays are expressed as follows:

Fig. 14
figure 14

Double source model (schema to the left) and delays from best obtained solution (polar plot to the right). In that model, two sources separated by δt ms in time and R meters in space occur at the center of assumed two compartments (red stars) of the submarine cruising with route θ0. For each arrival, delay is the sum of δt and the difference of propagation time corresponding to δR distance. On the polar plot, the radius is this delay and the angle relative to north is the emission angle. Measured delays at H10N are the red points (8 measures) and measured delay at H04S on direct arrival is the green point (see Table 2). The blue flatten circle curve is the best solution found for δt = 332 ms and θ0 = 356°, which minimizes the deviation between the modelled and the measured delays. In other words, we estimate that when the event occurred, the ARA San Juan was cruising towards the north and that the source associated with the northern compartment of the submarine emitted an impulse 332 ms before the source associated with the southern compartment

$${\delta t}_{arr,th} = {\delta t} +{\frac{R}{C}} \cos {(\theta - \theta _{0} )}$$

where δt is the delay separating the two events in time, θ is the emission angle relative to the north, θ0 is the route of the submarine relative to the north, R is the spatial distance separating the two impulsive sources and C the sound speed at event depth. δtarr,th is the theoretical delay associated to arrival arr associated to emission angle θ. These different parameters are represented Fig. 14.

Searched parameters are δt and θ0. In order to constrain them as well as possible, additional measures of delays is necessary with a better sample of the emission angles. Fine cepstral analysis on the thirteen reflected arrivals of H10N was performed. Unitary cepstrum calculated on 3 s time windows with an overlap of 50% at each array element were stacked for each arrival. Nine measures of delays were finally obtained from first cepstral peaks, and are summarized in Table 2.

Table 2 Table of delays measured from first cepstral peak of each arrival, as a function of emission angle

The maximum delay of 354 ms is observed for arrival 8 towards NNE, and the minimum delay of 314 ms is observed for arrival 14 towards ESE. Given these measured values, a simple grid search is performed on δt from 310–360 ms every 1 ms) and θ0 parameters (from 0°–360° every 1°). To limit the number to parameters to determine, the distance between the two events is fixed at 33 m and the sound speed at event depth is fixed at 1.470 km/s. Note that the problem is not sensitive to the value of the sound speed in the 1.460–1.475 km/s range, corresponding to the surface 1000 m range of depth. The submarine is also assumed to be horizontal when source events occurred, the eventual dip is not taken into account.

The final chosen solution is the one which minimizes the deviation between the modelled and the measured delays, in the least squares sense. The best misfit value is obtained for δt = 332 ms and θ0 = 356°. For this best solution, modelled delays look like a flatten circle and are plotted as a blue curve in Fig. 14.

This simple double source model allows to explain the observed delays with a good agreement. If we assume that the distance between the two impulsive sources is 33 m, we estimate that the ARA San Juan was cruising with a route of 356° ± 10° and that the north source emitted an impulse 332 ± 5 ms before the south source. Uncertainties are estimated from measured delays uncertainties, a distance between the sources varying between 25 and 40 m, and a maximum submarine dip of 20°. This result is obtained from two hydroacoustic stations only, located at 6035 and 7760 km from the event. These conclusions come from the observation that the first cepstral peaks do not match with the first oscillation of a bubble created by an underwater explosion, but by the occurrence of two distinct events slightly separated in space and time.

3 Conclusion

On November 15, 2017, the hydroacoustic event related to the disappearance of the Argentine military submarine ARA San Juan was detected by two hydroacoustic stations of the IMS network: HA10 at the Ascension Islands (mid-Atlantic Ocean, 6000 km from wreckage) and H04S at Crozet Island (Indian Ocean, 8000 km from wreckage). In an attempt to interpret the recorded signals, a calibration shot was conducted by the Argentine Navy two weeks later, on December 1, 2017, near the last known position of the submarine. The joint use of the signals generated by both San Juan and calibration events were decisive in this study.

The common signatures of the signals (in terms of shapes, durations and spectral contents) associated to both events confirm the impulsive nature of the San Juan event. 2-D Full waveform modelling allows to explain the dispersion of this impulse signal at both stations, which duration and spectral features are directly controlled by depth- and range-dependent sound speed features along propagation paths. Concerning signal amplitudes, the absolute strength of each source cannot be drawn from the simple comparison of the received levels.

Thanks to one single broadband execution of the PMCC detector between 1 and 40 Hz, fine propagation details of direct arrivals are very well resolved in time-frequency space (decrease of trace velocity with time, duration of coherent arrivals, dispersion at H04S visible at detection level). In addition, thirteen late lower amplitude detections were revealed at H10N, often at background noise level. All reflectors to which they are associated have been identified: three are islands, four are seamounts and six are the South American continental slope. At H04S, only one possible reflection has been identified. Even if refraction process probably dominates on reflection process for arrivals associated to interaction from South American continental slope, results presented in this study are not affected by the nature of the reflection or refraction mechanism. In all cases, equivalent reflection points are considered.

The detection of this set of secondary arrivals is determinant for constraining source location and understanding San Juan source mechanism. First, the unambiguous identification of all the reflectors, coupled with the relevant use of only a well-chosen small subset of corresponding arrivals, has allowed to significantly improve the source location accuracy: 95% confidence ellipse area has been reduced by a factor of 100 compared to location obtained from direct arrivals only. From hydroacoustic data only, the final obtained location is 45.97°S, 59.81°W ± 29 km (direction WNW-ESE) and 15 km (direction SSW-NNE), 2017/11/15 13:51:05 ± 9 s. This location is 3.5 km from the found wreckage and its calculation did not require either the need of advanced oceanographic specifications, or sophisticated methods requiring heavy computational means.

In the last part of the paper, a detailed cepstral analysis of direct and secondary arrivals has allowed to highlight the anisotropic distribution of measured delays of a second arrival hidden in San Juan signals, and so to reject the hypothesis of an isotropic impulsive source. If we assume that the submarine was composed of two compartments, we interpret the first cepstral peak as a second event, and not as a subsequent bubble collapse following a detonation. From a double impulsive source model, we were able to infer that when the ARA San Juan acoustic event occurred, the submarine was cruising with a route of 356 ± 10° and that the source associated with the northern compartment of the submarine emitted an impulse 332 ± 5 ms before the source associated with the southern compartment. The two impulsive events could result from the successive implosions of the two compartments of the submarine. However, the source mechanism cannot be inferred from the hydroacoustic signals. The accuracy of these results is obtained from two very distant stations at 6000 and 8000 km, which deserves to be emphasized.

Further studies remain to be conducted on equivalent acoustic energy released by the two closely spaced impulsive sources, and the depths at which this acoustic event occurred. Using estimates of the local seismic magnitudes at TRQA station, the peak pressures recorded on the hydrophones, and the known charge weight for calibration event, Heyburn et al. (Heyburn et al. 2020) estimate that the San Juan event had an acoustic energy release equivalent to around 428  kg of trinitrotoluene.

In this paper, we show that even if the number of hydroacoustic stations of the IMS network is limited (6 total, worldwide), additional secondary arrivals from reflections on bathymetry help to artificially add virtual stations, especially for powerful impulsive sources. This clearly demonstrates the capability of the hydroacoustic component of the IMS network to accomplish its mission of Comprehensive Nuclear-Test-Ban Treaty monitoring.