Inferring CO2 saturation from synthetic surface seismic and downhole monitoring data using machine learning for leakage detection at CO2 sequestration sites

https://doi.org/10.1016/j.ijggc.2020.103115

Highlights

  • A machine learning workflow is developed for inferring CO2 saturation from surface seismic and downhole monitoring data.

  • The performance of multiple machine learning algorithms is assessed using Kappa statistics.

  • Surface seismic monitoring, coupled with downhole measurements, achieves higher accuracy of the CO2 saturation inversion.

  • The impact of seismic noise on the performance of the trained machine learning models is investigated.

Abstract

Inferring CO2 saturation from seismic data is important when seismic methods are applied at CO2 sequestration sites for verification and accounting purposes, such as verifying the total injected CO2 volume, comparing with model predictions for concordance evaluation, tracking the migration of the CO2 plume, and detecting possible leakage from the storage reservoir. In this work, we infer CO2 saturation levels at three depths from simulated surface seismic, downhole pressure, and total dissolved solids (TDS) data using machine learning (ML) methods. The simulated monitoring data are based on 6000 numerical multi-phase flow simulations of hypothetical wellbore CO2 and brine leakage from a legacy well into shallow aquifers at a model CO2 storage site. We conduct rock physics modeling to estimate changes in seismic velocity due to the simulated CO2 and brine leakage at each time step in the flow simulation outputs, resulting in 120,000 forward seismic velocity models. 2D finite-difference acoustic wave modeling is performed for each velocity model to generate synthetic shot gathers along a sparse 2D seismic line with only 5 shots and 40 receivers. We extract 6 time-lapse seismic attribute anomalies from each trace in the time window relevant to each geologic layer, and use these seismic features, together with downhole pore pressure and TDS features, to train the machine learning algorithms. The impact of seismic noise on the performance of the trained machine learning models is also investigated. Inferred CO2 saturations from the trained classifiers are in good agreement with observations. Direct pressure and TDS measurements from downhole monitoring can increase the accuracy of the CO2 saturation class inferred from the forward-modeled 2D surface seismic data. Our ML workflow represents a promising way to combine measurements from multiple monitoring techniques with seismic monitoring to achieve more accurate quantitative seismic interpretation.
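As a rough illustration of the attribute-extraction step, the sketch below computes two time-lapse anomalies for a single trace within one layer's time window. The abstract does not name the six attributes, so the RMS-amplitude and peak-amplitude changes used here, together with the function names and the sample values, are assumptions chosen only for illustration.

```python
import math


def window(trace, dt, t_start, t_end):
    """Return the samples of `trace` between t_start and t_end (seconds, inclusive)."""
    i0 = int(round(t_start / dt))
    i1 = int(round(t_end / dt))
    return trace[i0:i1 + 1]


def rms(samples):
    """Root-mean-square amplitude of a non-empty list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def timelapse_anomalies(baseline, monitor, dt, t_start, t_end):
    """Hypothetical time-lapse attributes: monitor minus baseline in one window."""
    b = window(baseline, dt, t_start, t_end)
    m = window(monitor, dt, t_start, t_end)
    return {
        "rms_change": rms(m) - rms(b),
        "peak_change": max(abs(s) for s in m) - max(abs(s) for s in b),
    }


# Illustrative traces sampled at 1 ms; the window covers samples 4 and 5.
baseline = [0.0, 1.0, -1.0, 0.0, 2.0, -2.0, 0.0, 0.0]
monitor = [0.0, 1.0, -1.0, 0.0, 3.0, -3.0, 0.0, 0.0]
features = timelapse_anomalies(baseline, monitor, dt=0.001, t_start=0.004, t_end=0.005)
```

In a workflow of the kind described above, one such feature vector per trace and per layer window would be concatenated with the downhole pressure and TDS features before training.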

Introduction

Carbon dioxide (CO2) capture, utilization and sequestration, in which CO2 is captured from emissions sources, compressed to a dense phase, and injected into deep geologic formations, is a promising approach to help mitigate atmospheric CO2. Monitoring and verification techniques are needed to ensure safe and long-term storage of CO2 in the deep subsurface. Operators will build 3D numerical geological models to describe the movement of CO2 in the underground storage complex, and to demonstrate the conformance of the reservoir storage model prediction with the monitoring data throughout the injection and post-injection periods (Lumley, 2010; USEPA, 2016). Both direct and indirect monitoring techniques have been proposed to track CO2 plume migration to ensure that CO2 is contained within the storage complex and does not interfere with other subsurface intervals of concern, such as groundwater aquifers (USEPA, 2013) and faults or fracture systems (Zhang et al., 2015). Direct monitoring techniques, such as pressure monitoring and fluid sampling, provide point measurements of primary diagnostic parameters (such as pressure, solution chemistry and CO2 partial saturation) in the subsurface. Indirect monitoring techniques, usually geophysical approaches such as seismic, electromagnetic and gravity monitoring, generally sample over a relatively large area and provide volumetrically averaged representations of attributes of the sampled subsurface volume that correlate with the primary diagnostic parameters of interest (Daley and Harbert, 2019; USEPA, 2016). Geophysical monitoring methods measure observable changes in the target framework and pore-filling fluid parameters as imaged by the particular method used, rather than directly measuring the target parameters themselves. Therefore, inversion schemes are needed to infer the target parameters from the geophysical monitoring data.
Time-lapse surface seismic monitoring is a widely used geophysical monitoring technique, which has been applied at CO2 sequestration sites for site characterization, imaging the injected CO2 plume and conformance assessment (Chadwick and Noy, 2015; Chadwick et al., 2009; Ivandic et al., 2015; Lüth et al., 2015).

The conventional approach for CO2 saturation inversion from seismic data involves forward seismic modeling to find the relationship between CO2 saturation and seismic attributes (such as traveltime, reflector spectral composition or amplitude changes). The CO2 saturation values in the model domain are then inverted to achieve the best fit between the observed and modeled changes in seismic attributes (Meadows, 2008). Chadwick et al. (2004) estimated the total injected volume of CO2 at the Sleipner Field by inverting the recorded seismic data using the relationships of CO2 layer thickness to seismic reflection amplitude and to traveltime differences, separately. Meadows (2008) improved the inversion work at Sleipner by simultaneous inversion of amplitude and traveltime differences. These inversion schemes rely on rock physics modeling and seismic modeling (often 1D approximations) to establish the relationships between the seismic attribute changes and the underlying pressure and fluid saturation changes. Use of simple 1D approximations requires the assumption that the relationships between the seismic attributes and the CO2 plume thickness established from the 1D modeling are applicable over the entire 3D geological model, not just the location for which the 1D model is designed. Furthermore, such models do not consider seismic noise, and the inversion results are inherently nonunique. Meadows and Cole (2013) conducted pressure and CO2 saturation inversion using P- and S-impedance changes calculated from processed 4D seismic data at the Weyburn Field, Saskatchewan. They found that it was more difficult to invert for CO2 saturation changes than for effective pressure changes, and that the inversion results for CO2 saturation changes were noisier than those for pressure changes.
Gunning and Glinsky (2004) developed a trace-based inversion routine for conducting Bayesian seismic inversion using Markov Chain Monte Carlo methods. The inverted stochastic model parameters include the fluid type and saturation in each layer. However, the assumed prior distribution of the input model parameters and the layer times initialized in the prior layer stack model can substantially affect the inversion results.
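One simple 1D relationship of the kind discussed above is the two-way traveltime delay ("push-down") beneath a uniform layer whose P-wave velocity drops when CO2 replaces brine. The sketch below is a minimal example of that relation and its inversion for layer thickness. It assumes vertical incidence and a homogeneous layer, the velocities are illustrative values rather than numbers from the cited studies, and it is not presented as the specific method used at Sleipner.

```python
def two_way_time_shift(thickness_m, v_brine, v_co2):
    """Two-way traveltime delay (s) for a layer whose velocity changes from v_brine to v_co2 (m/s)."""
    return 2.0 * thickness_m * (1.0 / v_co2 - 1.0 / v_brine)


def thickness_from_time_shift(dt_s, v_brine, v_co2):
    """Invert the relation above for layer thickness (m)."""
    slowness_change = 1.0 / v_co2 - 1.0 / v_brine
    if slowness_change == 0.0:
        raise ValueError("velocities are equal, so the time shift carries no thickness information")
    return dt_s / (2.0 * slowness_change)


# Illustrative values (assumed): a 10 m layer, 2000 m/s brine-saturated, 1600 m/s with CO2.
delay = two_way_time_shift(10.0, v_brine=2000.0, v_co2=1600.0)   # 0.0025 s, i.e. 2.5 ms
recovered = thickness_from_time_shift(delay, v_brine=2000.0, v_co2=1600.0)
```

The inversion is only as good as the assumed velocities, which is one way to see why results built on 1D relationships of this kind can be nonunique: different combinations of thickness and velocity change produce the same delay.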

Machine learning techniques, such as Artificial Neural Network (ANN) and Support Vector Machine (SVM), which find patterns in data to make predictions based on statistics, have been used in various fields of environmental science (Bertin et al., 2013; Reid et al., 2015; Saghafi and Arabloo, 2017). Reading et al. (2015) used the outputs from supervised machine learning classification prior to geophysical inversion, which improved detail in the inverted 3D geological models. Liu (2017) proposed a machine learning workflow for seismic quantitative interpretation using the Bayesian-based Support Vector Regression, unsupervised Self-Organizing Map and supervised Support Vector Classification algorithms.

Machine learning techniques have also been used to invert for seismic velocity models directly from raw seismic data (Araya-Polo et al., 2017; Lin and Wu, 2018). In the work by Araya-Polo et al., velocity analysis semblance panels (location-specific seismic data processing products designed to increase the resolution of interpretable signal in the presence of background noise) at multiple common midpoint locations were used as input features in the Deep Neural Network (DNN) algorithm to produce seismic velocity models. Those previous studies used machine learning methods to invert for the seismic velocity model, which can be considered a proxy or representation of the resulting seismic image of the CO2 plume when applied to carbon sequestration sites. To our knowledge, no study has considered inferring CO2 saturation directly from seismic data using machine learning techniques.

Our objective in this study is to use a large collection of labeled forward models to infer leakage and supercritical and gaseous CO2 saturation levels at deeper and shallower depths along a leaky well from sparse 2D synthetic surface seismic, downhole pressure, and aqueous total dissolved solids (TDS) data using machine learning algorithms. In this analysis, we compare the performance of multiple machine learning algorithms and identify important features in the machine learning models using statistical methods. The impact of seismic noise on the performance of the machine learning classifier is also investigated.
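The Highlights state that algorithm performance is assessed using Kappa statistics, and the reference list includes Cohen (1960). As a minimal sketch of that metric, the function below computes Cohen's kappa, which corrects the observed agreement between true and predicted classes for the agreement expected by chance. The saturation class labels and the sample predictions are hypothetical and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length sequences of class labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of samples where the labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one class ever occurs
    return (observed - expected) / (1.0 - expected)


# Hypothetical saturation classes for six test samples.
true_classes = ["none", "low", "high", "low", "none", "high"]
pred_classes = ["none", "low", "low", "low", "none", "high"]
kappa = cohen_kappa(true_classes, pred_classes)  # observed 5/6, chance 1/3, kappa 0.75
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which makes it more informative than raw accuracy when the saturation classes are imbalanced.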

Section snippets

Setting

To explore the problem of detecting leakage and classifying leak size from monitoring observations, we consider hypothetical well leakage scenarios for a candidate geologic carbon storage site near Kimberlina, CA, USA. We begin by using detailed sets of 3D numerical multi-phase flow simulations of wellbore CO2 and brine leakage from a legacy well into shallow aquifers developed previously by researchers at the Lawrence Livermore National Laboratory (LLNL) based on the Kimberlina

Results

The Kappa statistics (inter-rater agreement of categorical data) of the machine learning algorithms trained and evaluated using all the features and using only the seismic features are compared in Fig. 6a. For ML models trained using only the seismic features (represented by filled circle symbols in Fig. 6a), the best-performing models are linearSVM (at shallow depth), RNN (at intermediate depth) and SVMr (at deep depth). For ML models trained using all the features (represented by filled plus

Discussion

Our study uses a large training and testing dataset generated by 120,000 seismic wave propagation simulations and demonstrates the usefulness of machine learning methods for inferring CO2 saturation levels from surface seismic monitoring and downhole pressure and TDS measurements. We evaluate the performance of the machine learning models on the testing set, in which the underlying assumption for the permeability distributions of the three geologic layers is different than that in the

Conclusions

In this work, we have developed a machine learning workflow to infer CO2 saturation levels from simulated monitoring data, including surface seismic monitoring and downhole pressure and TDS measurements. We have compared the performance of multiple machine learning algorithms and identified important features in the machine learning models for CO2 saturation inversion using statistical criteria. We have also compared the relative importance of seismic features vs. direct downhole pressure and TDS

Disclaimer

This paper was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference therein to any specific

CRediT authorship contribution statement

Zan Wang: Conceptualization, Methodology, Software, Formal analysis, Writing - original draft. Robert M. Dilmore: Conceptualization, Resources, Writing - review & editing, Supervision, Project administration. William Harbert: Conceptualization, Resources, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This technical effort was supported in part by an appointment to the National Energy Technology Laboratory (NETL) Research Participation Program, sponsored by the U.S. Department of Energy and administered by the Oak Ridge Institute for Science and Education. We would like to thank Kayyum Mansoor and Tom Buscheck at Lawrence Livermore National Laboratory for sharing Kimberlina v1.2 flow simulation results. We would like to thank Dave Rampton at NETL and Carnegie Mellon University for his

References (42)

  • M. Araya-Polo et al. Deep-learning tomography. Proc. Int. Conf. Lead. Edge Manuf. (2017)

  • M.J. Bertin et al. Using machine learning tools to model complex toxic interactions with limited sampling regimes. Environ. Sci. Technol. (2013)

  • M.A. Biot. General theory of three-dimensional consolidation. J. Appl. Phys. (1941)

  • J.S. Bridle. Training stochastic model recognition algorithms as networks can lead to maximum mutual information estimation of parameters. Adv. Neural Inf. Process. Syst. (1990)

  • T.A. Buscheck et al. Simulated Data for Testing Monitoring Techniques to Detect Leakage in Groundwater Resources: Kimberlina Model With Wellbore Leakage, Rev. 1.1; LLNL-TR-731055. Livermore, CA (2017)

  • R.A. Chadwick et al. Underground CO2 storage: demonstrating regulatory conformance by convergence of history-matched modeled and observed CO2 plume behavior using Sleipner time-lapse seismics. Greenh. Gases Sci. Technol. (2015)

  • R.A. Chadwick et al. 4D seismic imaging of an injected CO2 plume at the Sleipner Field, Central North Sea. Geol. Soc. London, Mem. (2004)

  • Y. Chen et al. Ground-roll noise attenuation using a simple and effective approach based on local bandlimited orthogonalization. IEEE Geosci. Remote Sens. Lett. (2015)

  • K.R. Chowdhury. Seismic data acquisition and processing

  • J. Cohen. A coefficient of agreement for nominal scales. Educ. Psychol. Meas. (1960)

  • C. Cortes et al. Support-vector networks. Mach. Learn. (1995)