Improving the pulsed neutron-gamma density method with machine learning regression algorithms

https://doi.org/10.1016/j.petrol.2022.110962

Highlights

  • Machine learning improves the accuracy and generalization of density prediction.

  • Machine learning regressors greatly simplify the borehole correction process.

  • Machine learning regressors can complete borehole correction without the standoff.

Abstract

Formation bulk density is one of the critical parameters for formation evaluation. As a safe and environmentally friendly method, the pulsed neutron-gamma density (NGD) measurement is emerging as an alternative to the traditional chemical source-based gamma-gamma density (GGD) measurement. However, in the NGD measurement, the initial energy spectrum, source intensity, and spatial distribution of the secondary inelastic gamma-ray source vary with formation components. Moreover, the energy of inelastic gamma rays is at the MeV level, leading to a non-negligible pair production effect. All these factors affect the accuracy of the bulk-density measurement. In addition, borehole configuration has an evident impact on NGD, so correction for borehole effects is essential. To improve density accuracy, we take two steps. First, we forgo exploring an explicit theoretical formula for density calculation, treat NGD mathematically as a regression problem, and introduce machine learning regressors, a powerful and popular tool for solving regression problems, into NGD for the first time. Second, we select features less affected by changes in formation chemical composition as input features. Our results show that the final three tuned machine learning regressors, selected from 39 candidate regressors, outperform the optimized polynomial model (an optimization of conventional density calculation models) in both accuracy and generalization ability. They complete density prediction and borehole correction in one step, which avoids the complex exploration of borehole effects required by the optimized polynomial model and dramatically simplifies the borehole correction process. Moreover, the GaussianProcessRegressor performs well even when it completes borehole correction without standoff information, which is impossible for the optimized polynomial model; this broadens the application scenarios.

Introduction

Two nuclear measurements are used to obtain formation bulk density: the traditional chemical source-based gamma-gamma density (GGD) measurement and the source-free neutron-gamma density (NGD) measurement. The GGD measurement employs the radioisotope 137Cs as the gamma-ray source (Ellis and Singer, 2007), which poses a significant HSE (Health, Safety, and Environment) risk (Aitken et al., 2002; Badruzzaman et al., 2009; Kurkoski et al., 1991). In contrast, the NGD measurement uses an electrically controllable pulsed neutron generator (PNG) (Navarro and Guo, 2016), which greatly reduces the HSE risk and enables deeper detection (Archer et al., 1999; Badruzzaman, 2014; Luycx and Torres-Verdín, 2017, 2019). NGD is emerging as a likely replacement for the chemical source-based GGD measurement, especially in logging-while-drilling (LWD) applications.

Although both GGD and NGD measurements obtain the bulk density from the attenuation of gamma rays, the factors affecting gamma-ray attenuation differ between the two. The gamma-ray source of GGD is a point radioisotope source, Cesium-137, emitting monoenergetic gamma rays of 0.662 MeV. Compton scattering dominates the attenuation of these gamma rays, and the logarithm of the detected gamma-ray count rate is linearly related to the bulk density (Ellis and Singer, 2007). In contrast, the gamma-ray source in NGD consists of secondary gamma rays generated by inelastic scattering of fast neutrons with formation material. The initial energy spectrum, intensity, and spatial distribution (shape) of these secondary inelastic gamma rays are affected by the fast neutrons and the chemical composition of the formation (Inanc, 2014), so the source is no longer a constant-flux source. Some scholars compensate for the effect of neutron flux variation on the distributed secondary inelastic gamma-ray source with additional fast neutron or epithermal neutron measurements (Evans et al., 2012; Gjerdingen et al., 2012; Jacobson et al., 2004; Luycx and Torres-Verdín, 2017, 2019; Odom et al., 1999, 2001; Reichel et al., 2011, 2013). In addition, since the characteristic energy of inelastic gamma rays from most formation materials is at the MeV level, pair production cannot be ignored in the attenuation of inelastic gamma rays. Therefore, the relationship between density and inelastic gamma-ray attenuation is much more complex in the NGD measurement than in the GGD measurement.

Wang et al. (2020) reduce the pair production effect by using the difference between inelastic gamma rays in low and high energy windows (0.7–4 and 2–8 MeV). It is well known that gamma rays with the same initial energy spectrum have different pair production mass attenuation coefficients (μm-pa) in formations of different chemical composition. This creates the intuitive impression that changes in formation chemical composition affect gamma-ray attenuation, and thus density prediction, through μm-pa. However, this impression is far from the truth. In this paper, we analyze the effect of changes in formation chemical composition on the pair production mass attenuation coefficient and the total mass attenuation coefficient, and find that the relationship between composition changes and mass attenuation coefficients is so complex that it is challenging to analyze thoroughly. Since our goal is to predict density from inelastic gamma-ray features and fast neutron features, we instead directly analyze how changes in formation chemical composition affect these features when the formation density is held constant, using the Monte Carlo N-Particle code (MCNP), the industry standard for nuclear modeling (Goorley, 2014), and we select the less affected features as input features. This shift avoids the complex analysis of the effect of formation composition changes on mass attenuation coefficients, and it provides a basis for feature selection that is instructive for practical applications. Luycx's results (Luycx and Torres-Verdín, 2017, 2019) also demonstrate that the polynomial model built using these features is less affected by formation chemical composition changes.

The foundation of NGD lies in the attenuation law of the secondary inelastic gamma-ray source. Zhang (Zhang et al., 2017) derived the inelastic gamma-ray flux based on the fast neutron-gamma coupled field theory and proposed a density calculation model. Luycx (Luycx and Torres-Verdín, 2019) derived another density calculation model by directly analyzing the fundamental law of gamma-ray attenuation. Wang (Wang et al., 2019, 2020) also derived the inelastic gamma-ray flux based on the neutron multigroup diffusion equation and gamma-ray diffusion theory and proposed a nonlinear model for density calculation. These models are termed polynomial models because their inelastic gamma-ray terms and fast neutron terms are in polynomial forms (including linear forms). In this paper, natural extensions of these three models have been tested, and the model that performed best on both the training and test sets is selected as the optimized polynomial model. Based on the flux equations for the inelastic gamma rays and fast neutrons, we derive the equation satisfied by the bulk density, inelastic gamma-ray features, and fast neutron features. However, it is challenging to derive an explicit equation for bulk density from this set of equations. We therefore give up the search for an analytical equation for density calculation, treat NGD mathematically as a regression problem, and, for the first time, introduce machine learning regressors, a powerful and popular tool for solving regression problems, to predict bulk density.

Previous studies (Luycx and Torres-Verdín, 2017, 2019; Mickael et al., 2002; Wang et al., 2020) have shown that borehole parameters such as diameter (BHD), standoff (STD), and mud weight all affect the NGD measurement. In this paper, we explore the borehole correction of the optimized polynomial model in a more systematic manner and obtain a good correction. In contrast, the machine learning regressor performs density prediction and borehole correction in a single step, which avoids the complex exploration of borehole effects required by the optimized polynomial model and greatly simplifies the process of building a density prediction model that includes borehole correction. Moreover, it outperforms the borehole-corrected optimized polynomial model in accuracy and generalization capability.
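The idea of folding borehole correction into the regression itself can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic (not the paper's MCNP simulations), the two "tool response" features and their linear coefficients are invented, and the kernel choice is arbitrary. The sketch only shows that borehole parameters can be supplied as ordinary input columns, and that the standoff column can be left out to see how the error changes.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical ranges, NOT the paper's data set).
rng = np.random.default_rng(1)
n = 240
density = rng.uniform(2.0, 2.9, n)    # bulk density, g/cm^3
bhd = rng.uniform(8.0, 10.0, n)       # borehole diameter, in
standoff = rng.uniform(0.0, 0.5, n)   # standoff, in
mud = rng.uniform(1.0, 1.4, n)        # mud weight, g/cm^3

# Invented toy responses: each feature mixes density with borehole effects.
gamma_feat = (-1.2 * density + 0.05 * bhd + 0.4 * standoff - 0.3 * mud
              + rng.normal(0.0, 0.005, n))
neutron_feat = (-0.4 * density + 0.02 * bhd + 0.8 * standoff + 0.1 * mud
                + rng.normal(0.0, 0.005, n))


def fit_and_score(columns):
    """Fit a Gaussian process on the first 180 points; return the test RMSE."""
    X = np.column_stack(columns)
    X_train, X_test = X[:180], X[180:]
    y_train, y_test = density[:180], density[180:]
    kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
    model = make_pipeline(
        StandardScaler(),
        GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0),
    )
    model.fit(X_train, y_train)
    residual = model.predict(X_test) - y_test
    return float(np.sqrt(np.mean(residual ** 2)))


# Density prediction and borehole correction happen in the same fit.
rmse_with_standoff = fit_and_score([gamma_feat, neutron_feat, bhd, mud, standoff])
rmse_without_standoff = fit_and_score([gamma_feat, neutron_feat, bhd, mud])
print(f"RMSE_test with standoff:    {rmse_with_standoff:.4f}")
print(f"RMSE_test without standoff: {rmse_without_standoff:.4f}")
```

In this toy setup the two tool features carry enough information to separate density from standoff, which is why dropping the standoff column remains workable here; whether the same holds for real tool responses is a question the paper answers with its own data, not this sketch.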

In this paper, we use the root mean squared error (RMSE) of the training set (RMSE_train) and test set (RMSE_test) to evaluate the accuracy and generalization ability of the model, respectively. We have tested 32 single regressors and 7 ensemble regressors with default hyperparameters using the Scikit-learn API. We selected the best performers from these 39 regressors and then tuned their hyperparameters to improve performance further. The results show that the (RMSE_train, RMSE_test) of the tuned GaussianProcessRegressor, ExtraTreesRegressor, and NuSVR are (0, 0.0328), (0, 0.0439), and (0.0367, 0.0340), respectively, all better than the (0.0493, 0.0447) of the borehole-corrected optimized polynomial model. Moreover, even without the standoff information, the GaussianProcessRegressor still achieves good performance, with (RMSE_train, RMSE_test) of (0, 0.0453). This is impossible for the optimized polynomial model, and it expands the applicability of the density prediction model.
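The screening step described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's workflow: the data are synthetic, only the three regressors named in the text are screened (the other 36 candidates are not listed in this excerpt), and the GaussianProcessRegressor is given a non-default `alpha` and `normalize_y=True` purely for numerical stability on the toy data.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVR

# Synthetic stand-in data (an assumption, NOT the paper's MCNP data set):
# four generic features and a smooth nonlinear "density" target.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 4))
y = 2.0 + 0.6 * X[:, 0] - 0.3 * X[:, 1] ** 2 + 0.1 * np.sin(3.0 * X[:, 2])
y += rng.normal(0.0, 0.01, size=y.shape)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(mean_squared_error(y_true, y_pred)))


candidates = {
    # alpha and normalize_y are non-default here (stability on toy data).
    "GaussianProcessRegressor": GaussianProcessRegressor(alpha=1e-3, normalize_y=True),
    "ExtraTreesRegressor": ExtraTreesRegressor(random_state=0),
    "NuSVR": NuSVR(),
}

scores = {}
for name, regressor in candidates.items():
    model = make_pipeline(StandardScaler(), regressor)
    model.fit(X_train, y_train)
    scores[name] = (
        rmse(y_train, model.predict(X_train)),  # RMSE_train: accuracy
        rmse(y_test, model.predict(X_test)),    # RMSE_test: generalization
    )

# Rank candidates by test error, lowest first.
for name, (rmse_train, rmse_test) in sorted(scores.items(), key=lambda kv: kv[1][1]):
    print(f"{name}: RMSE_train={rmse_train:.4f}, RMSE_test={rmse_test:.4f}")
```

A training RMSE of essentially zero, as reported in the text for the tree ensemble and the Gaussian process, reflects models that can reproduce their training points exactly; the test RMSE is therefore the more informative of the two numbers when comparing candidates.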

Section snippets

Physics and feature selection

In a homogeneous spherical formation model containing a point source of fast neutrons, the flux of fast neutrons $\varphi_{fn}$ (Tittle, 1961) and the flux of inelastic gamma rays $\varphi_{in\gamma}$ can be described as follows (Wang et al., 2020):

$$\varphi_{fn} = \frac{S_{fn}}{4\pi D_{fn} r}\, e^{-r/L_{fn}}$$

$$\varphi_{in\gamma} = \frac{i_{in} S_{fn}}{4\pi D_{\gamma} D_{fn} R} \cdot \frac{(1/L_{fn})^{2}}{(1/L_{fn})^{2} - (1/L_{\gamma})^{2}} \left( e^{-R/L_{\gamma}} - e^{-R/L_{fn}} \right)$$

where $S_{fn}$ is the fast neutron source strength, $L_{fn}$ and $D_{fn}$ are the diffusion length and diffusion coefficient of the fast neutron, respectively, $L_{\gamma}$ and $D_{\gamma}$ are the diffusion length and diffusion
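As a small numerical illustration of the fast-neutron flux expression, read here as S/(4πDr)·exp(−r/L), the sketch below evaluates the flux at two source-to-detector spacings and shows that the ratio of the r-weighted fluxes depends only on the diffusion length. The function names and all numerical values are hypothetical, and the two-spacing inversion is a generic property of this expression, not a processing step taken from the paper.

```python
import math


def fast_neutron_flux(r_cm, source_strength, diffusion_coeff_cm, diffusion_length_cm):
    """Point-source diffusion flux: S / (4*pi*D*r) * exp(-r / L)."""
    return (
        source_strength
        / (4.0 * math.pi * diffusion_coeff_cm * r_cm)
        * math.exp(-r_cm / diffusion_length_cm)
    )


def diffusion_length_from_two_spacings(r_near, flux_near, r_far, flux_far):
    """Invert ln[(r_near*flux_near) / (r_far*flux_far)] = (r_far - r_near) / L."""
    return (r_far - r_near) / math.log((r_near * flux_near) / (r_far * flux_far))


# Hypothetical values chosen only to exercise the formulas.
S, D, L = 1.0e8, 1.5, 12.0          # source strength (n/s), D (cm), L (cm)
r_near, r_far = 30.0, 60.0          # detector spacings (cm)

flux_near = fast_neutron_flux(r_near, S, D, L)
flux_far = fast_neutron_flux(r_far, S, D, L)
recovered_length = diffusion_length_from_two_spacings(r_near, flux_near, r_far, flux_far)

print(f"flux at {r_near} cm: {flux_near:.3e}")
print(f"flux at {r_far} cm: {flux_far:.3e}")
print(f"recovered diffusion length: {recovered_length:.6f} cm")
```

The source strength and diffusion coefficient cancel in the ratio, which is one way to see why two-detector quantities are less sensitive to source output than single-detector count rates.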

Pair production and effects of formation chemical composition changes

Before proceeding to the following analysis, it is necessary to state that two properties change when a formation changes from one lithology to another: the bulk density and the formation's chemical composition. Changes in a formation's chemical composition can be due to (1) changes in elemental composition, which may be caused by changes in matrix type or pore fluid type, or (2) changes in the abundance of each element, which are caused by changes in the relative volume of the matrix and pore

NGD tool and data design

Fig. 8 shows the schematic diagram of the MCNP model of the NGD instrument, which includes two cerium-doped lanthanum bromide [LaBr3(Ce)] gamma-ray detectors and two fast neutron detectors. The pulsed neutron source produces 14.1 MeV fast neutrons within a burst interval of 20 μs. All raw data in this paper are simulated with MCNP; no instrument-measured data are used. However, this does not undermine the feasibility of the method in this paper. We have generated three data sets (1944 points in total),

Analysis of machine learning regressors in the NGD

From the analysis in Part 2, the neutron-gamma density (NGD) estimation can be viewed as a regression problem: predicting bulk density from inelastic gamma-ray features and fast neutron features. This paper introduces machine learning regressors, a powerful and popular tool for solving regression problems, into NGD (Fig. 14 shows the workflow of applying a machine learning regressor to NGD). First, the performance of 32 single regressors and seven ensemble regressors is tested using the

Conclusion

This paper treats the Neutron-Gamma Density (NGD) estimation as a regression problem using inelastic gamma-ray features and fast neutron features to predict bulk density. Machine learning-based regression algorithms are introduced to solve this problem. Our results show that the machine learning regressor completes density prediction and borehole correction in one step, which significantly simplifies the process of building a density prediction model which takes borehole correction into account

CRediT authorship contribution statement

Duo Dong: Conceptualization, Methodology, Writing – original draft, Visualization, participated in the data calculation and analysis, and experimental program writing. Wensheng Wu: Conceptualization, Methodology. Wenzheng Yue: Writing – review & editing. Yunlong Ge: Data calculation, Writing – review & editing. Shitao Xiong: Data calculation, Writing – review & editing. Wenqi Zhao: Writing – review & editing. Ruifeng Wang: Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References (53)

  • A. Badruzzaman

    An assessment of fundamentals of nuclear-based alternatives to conventional chemical source bulk density measurement

    Petrophysics - The SPWLA Journal of Formation Evaluation and Reservoir Description

    (2014)
  • A. Badruzzaman et al.

    Radioactive Sources in Petroleum Industry: Applications, Concerns and Alternatives, Asia Pacific Health, Safety, Security and Environment Conference

    (2009)
  • L. Breiman

    Random forests

    Mach. Learn.

    (2001)
  • L. Buitinck

    API Design for Machine Learning Software: Experiences from the Scikit-Learn Project

    (2013)
  • C.-C. Chang et al.

    Training ν-support vector regression: theory and algorithms

    Neural Comput.

    (2002)
  • C.-C. Chang et al.

    LIBSVM: a library for support vector machines

    ACM Trans. Intelligent Syst Technol. (TIST)

    (2011)
  • T. Chen et al.

    Xgboost: a scalable tree boosting system

  • V. Cherkassky et al.

    Selection of meta-parameters for support vector regression

  • C.E. Rasmussen et al.

    Gaussian processes for machine learning

    Int. J. Neural Syst.

    (2006)
  • S.S. Desai et al.

    Estimation of regression parameters using SVM with new methods for meta parameter

    Int. J. Data Min. Model. Manag.

    (2015)
  • D.V. Ellis et al.

    Well Logging for Earth Scientists

    (2007)
  • M. Evans

    Sourceless Neutron-Gamma Density (SNGD): A Radioisotope-free Bulk Density Measurement: Physics Principles, Environmental Effects, and Applications

    (2012)
  • P. Geurts et al.

    Extremely randomized trees

    Mach. Learn.

    (2006)
  • T. Gjerdingen

    Sourceless Neutron-Density Porosity Determination: Fit-For-Purpose Formation Evaluation with Significant HS&E Benefits

    (2012)
  • T. Goorley

    MCNP6.1.1-Beta Release Notes

    (2014)
  • P.-Y. Hao

    Pair-ν-SVR: a novel and efficient pairing nu-support vector regression algorithm

    IEEE Transact. Neural Networks Learn. Syst.

    (2016)