Improving the pulsed neutron-gamma density method with machine learning regression algorithms

doi:10.1016/j.petrol.2022.110962

Journal of Petroleum Science and Engineering

Volume 218, November 2022, 110962

Highlights

• Machine learning improves the accuracy and generalization of density prediction.
• Machine learning regressors greatly simplify the borehole correction process.
• Machine learning regressors can complete borehole correction without the standoff.

Abstract

Formation bulk density is one of the critical parameters for formation evaluation. As a safe and environmentally friendly method, the pulsed neutron-gamma density (NGD) measurement is emerging as an alternative to the traditional chemical source-based gamma-gamma density (GGD) measurement. However, in the NGD measurement, the initial energy spectrum, source intensity, and spatial distribution of the secondary inelastic gamma-ray source vary with formation components. Moreover, the energy of inelastic gamma rays is at the MeV level, leading to a non-negligible pair production effect. All these factors affect the accuracy of the bulk-density measurement. In addition, the impact of borehole configuration on NGD is evident, and correction for borehole effects is essential. To improve density accuracy: first, we forgo exploring an explicit theoretical formula for density calculation and instead treat the NGD mathematically as a regression problem and introduce the machine learning regressor, a powerful and popular tool for solving regression problems, into the NGD for the first time; second, we select features less affected by changes in formation chemical composition as input features. Our results show that the final three tuned machine learning regressors selected from the 39 candidate regressors outperform the optimized polynomial model (an optimization of conventional density calculation models) in both accuracy and generalization ability. They complete density prediction and borehole correction in one step, avoiding complex exploration of borehole effects as in the optimized polynomial model and dramatically simplifying the borehole correction process. Moreover, the GaussianProcessRegressor can complete borehole correction without the standoff information and perform well, which is impossible for the optimized polynomial model, broadening application scenarios.

Introduction

There are two nuclear measurements to obtain the formation bulk density: the traditional chemical source-based gamma-gamma density (GGD) measurement and the source-free neutron-gamma density (NGD) measurement. The GGD measurement employs the radioisotope ¹³⁷Cs as the gamma-ray source (Ellis and Singer, 2007), which poses a significant HSE (Health, Safety, and Environment) risk (Aitken et al., 2002; Badruzzaman et al., 2009; Kurkoski et al., 1991). In contrast, the NGD measurement uses an electrically controllable pulsed neutron generator (PNG) (Navarro and Guo, 2016), which reduces the HSE risk greatly and enables deeper detection (Archer et al., 1999; Badruzzaman, 2014; Luycx and Torres-Verdín, 2017, 2019). It is emerging as the future trend to replace the chemical source-based GGD measurement, especially in logging-while-drilling (LWD) applications.

Although both GGD and NGD measurements rely on the attenuation of gamma rays to obtain the bulk density, the factors affecting that attenuation are not the same. The gamma-ray source of GGD is a point radioisotope source, Cesium-137, emitting monoenergetic gamma rays of 0.662 MeV. Compton scattering dominates the attenuation of these gamma rays, and the logarithm of the detected gamma-ray count rate is linearly related to the bulk density (Ellis and Singer, 2007). In contrast, the gamma-ray source in NGD consists of secondary gamma rays generated by inelastic scattering of fast neutrons with formation material. The initial energy spectrum, intensity, and spatial distribution (shape) of these secondary inelastic gamma rays are affected by the fast neutrons and the chemical composition of the formation (Inanc, 2014), so the source is no longer a constant-flux source. Some scholars compensate for the effect of variation in neutron flux on the distributed secondary inelastic gamma-ray source by additional fast neutron or epithermal neutron measurements (Evans et al., 2012; Gjerdingen et al., 2012; Jacobson et al., 2004; Luycx and Torres-Verdín, 2017, 2019; Odom et al., 1999, 2001; Reichel et al., 2011, 2013). In addition, since the characteristic energy of inelastic gamma rays for most formation materials is at the MeV level, pair production cannot be ignored in the attenuation of inelastic gamma rays. Therefore, the relationship between density and inelastic gamma-ray attenuation is much more complex in the NGD measurement than in the GGD measurement.
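For reference, the log-linear GGD response described in this paragraph is commonly written in a generic single-detector form such as the one below. This is a textbook-style sketch rather than an equation taken from this paper: the symbols $N$ (detected count rate), $N_0$ (count rate at zero density), $\mu_m$ (Compton mass attenuation coefficient), $d$ (effective source-detector path length), and the calibration constants $a$ and $b$ are illustrative assumptions.

```latex
N = N_0 \, e^{-\mu_m \rho_b d}
\quad\Longrightarrow\quad
\ln N = \ln N_0 - \mu_m d \, \rho_b \equiv a - b \, \rho_b
```

No comparably simple closed form holds for NGD, because the source term itself varies with the formation and pair production adds a composition-dependent attenuation channel.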

Wang et al. (2020) reduce the pair production effect by exploiting the difference between inelastic gamma rays in low and high energy windows (0.7–4 and 2–8 MeV). It is well known that gamma rays with the same initial energy spectrum have different pair production mass attenuation coefficients (μ_m-pa) in formations of different chemical composition, which gives the intuitive impression that changes in formation chemical composition affect gamma-ray attenuation, and thus density prediction, through μ_m-pa. However, this is far from the truth. In this paper, we analyze the effect of changes in formation chemical composition on the pair production mass attenuation coefficient and the total mass attenuation coefficient, and find that the relationship is too complex to analyze thoroughly. Since our goal is to predict density from inelastic gamma-ray features and fast neutron features, we instead directly analyze how changes in formation chemical composition affect these features at constant formation density, using the Monte Carlo N-Particle code (MCNP), the industry standard for nuclear modeling (Goorley, 2014), and select the less affected features as input features. This shift avoids the complex analysis of the effect of formation composition changes on mass attenuation coefficients and provides a basis for feature selection that is instructive for practical applications. The results of Luycx and Torres-Verdín (2017, 2019) also demonstrate that the polynomial model built using these features is less affected by formation chemical composition changes.

NGD is founded on the attenuation law of the secondary inelastic gamma-ray source. Zhang et al. (2017) derived the inelastic gamma-ray flux and a density calculation model based on the fast neutron-gamma coupled field theory. Luycx and Torres-Verdín (2019) derived another density calculation model by directly analyzing the fundamental law of gamma-ray attenuation. Wang et al. (2019, 2020) also derived the inelastic gamma-ray flux based on the neutron multigroup diffusion equation and gamma-ray diffusion theory and proposed a nonlinear model for density calculation. These models are termed polynomial models because their inelastic gamma-ray terms and fast neutron terms are in polynomial forms (including linear forms). In this paper, natural extensions of these three models are tested, and the model that performs best on both the training and test sets is selected as the optimized polynomial model. Based on the flux equations for the inelastic gamma rays and fast neutrons, we derive the equation satisfied by the bulk density, inelastic gamma-ray features, and fast neutron features. However, it is challenging to obtain an explicit expression for bulk density from this set of equations. We therefore forgo seeking an analytical equation for density calculation and instead treat NGD mathematically as a regression problem, introducing, for the first time, the machine learning regressor, a powerful and popular tool for solving regression problems, to predict bulk density.

Previous studies (Luycx and Torres-Verdín, 2017, 2019; Mickael et al., 2002; Wang et al., 2020) have shown that borehole parameters such as diameter (BHD), standoff (STD), and mud weight all affect the NGD measurement. In this paper, we explore the borehole correction of the optimized polynomial model in a more systematic manner and obtain a good correction. In contrast, the machine learning regressor can perform density prediction and borehole correction in one single step, avoiding the complex exploration of borehole effects as in the optimized polynomial model and greatly simplifying the process of building a density prediction model which includes borehole correction. Moreover, it outperforms the optimized polynomial model after borehole correction in accuracy and generalization capability.

In this paper, we use the root mean squared error (RMSE) on the training set (RMSE_train) and on the test set (RMSE_test) to evaluate the accuracy and generalization ability of the model, respectively. We tested 32 single regressors and seven ensemble regressors with default hyperparameters using the Scikit-learn API. We then selected the best performers from these 39 regressors and tuned their hyperparameters to improve performance further. The results show that the (RMSE_train, RMSE_test) values of the tuned GaussianProcessRegressor, ExtraTreesRegressor, and NuSVR are (0, 0.0328), (0, 0.0439), and (0.0367, 0.0340), respectively, all better than the (0.0493, 0.0447) of the borehole-corrected optimized polynomial model. Moreover, even without the standoff information, the GaussianProcessRegressor still performs well, with (RMSE_train, RMSE_test) of (0, 0.0453). This is impossible for the optimized polynomial model and expands the applicability of the density prediction model.
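The evaluation described above can be sketched with Scikit-learn as follows. This is a minimal illustration, not the authors' code: the data come from a made-up synthetic generator (`make_synthetic_data` is hypothetical and unrelated to the paper's MCNP data sets), the four input features are placeholders, and the hyperparameters shown are untuned choices made only so that the example runs reliably.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVR


def rmse(y_true, y_pred):
    """Root mean squared error between two 1-D arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


def make_synthetic_data(n_samples=300, seed=0):
    """Hypothetical stand-in for simulated tool responses (NOT the paper's data)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_samples, 4))  # four made-up input features
    # Made-up smooth target in a density-like numeric range.
    y = 2.0 + 0.6 * X[:, 0] - 0.3 * X[:, 1] ** 2 + 0.1 * np.sin(3.0 * X[:, 2])
    return X, y


def build_models():
    """The three regressor classes named in the text, with illustrative settings."""
    return {
        # alpha adds a small jitter to the kernel matrix for numerical stability.
        "GaussianProcessRegressor": make_pipeline(
            StandardScaler(), GaussianProcessRegressor(alpha=1e-6, normalize_y=True)
        ),
        "ExtraTreesRegressor": ExtraTreesRegressor(n_estimators=100, random_state=0),
        "NuSVR": make_pipeline(StandardScaler(), NuSVR()),
    }


def evaluate(models, X, y, seed=0):
    """Fit each model on a hold-out split and return (RMSE_train, RMSE_test)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = (
            rmse(y_train, model.predict(X_train)),
            rmse(y_test, model.predict(X_test)),
        )
    return scores


X, y = make_synthetic_data()
scores = evaluate(build_models(), X, y)
for name, (rmse_train, rmse_test) in scores.items():
    print(f"{name}: RMSE_train={rmse_train:.4f}, RMSE_test={rmse_test:.4f}")
```

An RMSE_train of essentially zero is expected for fully grown extra trees and for a near-interpolating Gaussian process, which is consistent with the zero training errors quoted in the text; RMSE_test is therefore the more informative number when comparing such models.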

Section snippets

Physics and feature selection

In a homogeneous spherical formation model containing a point source of fast neutrons, the flux of fast neutrons φ_fn (Tittle, 1961) and the flux of inelastic gamma rays φ_inγ can be described as follows (Wang et al., 2020):

$\phi_{fn} = \frac{S_{fn}}{4 \pi D_{fn} r} e^{- r / L_{fn}}$

$\phi_{in\gamma} = \frac{i_{in} S_{fn}}{4 \pi D_{\gamma} D_{fn} R} \frac{(1 / L_{fn})^{2}}{(1 / L_{fn})^{2} - (1 / L_{\gamma})^{2}} \left( e^{- R / L_{\gamma}} - e^{- R / L_{fn}} \right)$

where S_fn is the fast neutron source strength, L_fn and D_fn are the diffusion length and diffusion coefficient of the fast neutron, respectively, L_γ and D_γ are the diffusion length and diffusion
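The two flux expressions above can be evaluated numerically with a short sketch like the following. The function and argument names are illustrative, the numeric inputs in the usage lines are arbitrary placeholders rather than physical values from the paper, and `i_in` is treated simply as a scalar coefficient because its definition is not visible in this snippet.

```python
import math


def fast_neutron_flux(r, S_fn, D_fn, L_fn):
    """Fast neutron flux at distance r from a point source (first equation)."""
    return S_fn / (4.0 * math.pi * D_fn * r) * math.exp(-r / L_fn)


def inelastic_gamma_flux(R, S_fn, i_in, D_fn, L_fn, D_gamma, L_gamma):
    """Inelastic gamma-ray flux at distance R (second equation)."""
    if math.isclose(L_fn, L_gamma):
        # The expression has a removable singularity when the two lengths coincide.
        raise ValueError("L_fn and L_gamma must differ for this closed form")
    k_fn_sq = (1.0 / L_fn) ** 2
    k_gamma_sq = (1.0 / L_gamma) ** 2
    prefactor = i_in * S_fn / (4.0 * math.pi * D_gamma * D_fn * R)
    weight = k_fn_sq / (k_fn_sq - k_gamma_sq)
    return prefactor * weight * (math.exp(-R / L_gamma) - math.exp(-R / L_fn))


# Placeholder inputs, chosen only to exercise the functions.
phi_fn = fast_neutron_flux(r=30.0, S_fn=1.0e8, D_fn=2.0, L_fn=12.0)
phi_in_gamma = inelastic_gamma_flux(
    R=30.0, S_fn=1.0e8, i_in=0.1, D_fn=2.0, L_fn=12.0, D_gamma=3.0, L_gamma=15.0
)
print(phi_fn, phi_in_gamma)
```

The weight factor and the difference of exponentials change sign together, so the computed gamma-ray flux stays positive whichever of the two diffusion lengths is larger.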

Pair production and effects of formation chemical composition changes

Before proceeding to the following analysis, it is necessary to state that two properties change when a formation changes from one lithology to another: the bulk density and the formation's chemical composition. Changes in a formation's chemical composition can be due to (1) changes in elemental composition, which may be caused by changes in matrix type or pore fluid type, or (2) changes in the abundance of each element, which are caused by changes in the relative volume of the matrix and pore

NGD tool and data design

Fig. 8 shows the schematic diagram of the MCNP model of the NGD instrument, which includes two cerium-doped lanthanum bromide [LaBr3(Ce)] gamma-ray detectors and two fast neutron detectors. The pulsed neutron source produces 14.1 MeV fast neutrons within a burst interval of 20 μs. All raw data in this paper are simulated with MCNP; no instrument-measured data are used. However, this does not hinder the feasibility of the method in this paper. We have generated three data sets (1944 points in total),

Analysis of machine learning regressors in the NGD

From the analysis in Part 2, the neutron-gamma density (NGD) estimation can be viewed as a regression problem: predicting bulk density from inelastic gamma-ray features and fast neutron features. This paper introduces machine learning regressors, a powerful and popular tool for solving regression problems, into NGD (Fig. 14 shows the workflow of applying a machine learning regressor to NGD). First, the performance of 32 single regressors and seven ensemble regressors is tested using the
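A screen-then-tune workflow of this kind might look like the sketch below. It is an illustration under several assumptions: only a small hand-picked subset of Scikit-learn regressors is screened (not the 39 tested in the paper), candidates are ranked by 5-fold cross-validated RMSE (the paper's exact screening criterion is not visible in this snippet), the data are synthetic placeholders, and the NuSVR grid is arbitrary.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVR

# Synthetic placeholder data (NOT the paper's MCNP data sets).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, size=(200, 4))
y = 2.0 + 0.6 * X[:, 0] - 0.3 * X[:, 1] ** 2 + rng.normal(0.0, 0.01, size=200)

# Step 1: screen candidates with default hyperparameters.
candidates = {
    "LinearRegression": LinearRegression(),
    "KNeighborsRegressor": make_pipeline(StandardScaler(), KNeighborsRegressor()),
    "RandomForestRegressor": RandomForestRegressor(random_state=0),
    "ExtraTreesRegressor": ExtraTreesRegressor(random_state=0),
    "NuSVR": make_pipeline(StandardScaler(), NuSVR()),
}


def screen(estimators, X, y, cv=5):
    """Return {name: mean cross-validated RMSE}, sorted from best to worst."""
    results = {}
    for name, estimator in estimators.items():
        neg_rmse = cross_val_score(
            estimator, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        results[name] = float(-neg_rmse.mean())
    return dict(sorted(results.items(), key=lambda item: item[1]))


ranking = screen(candidates, X, y)
for name, score in ranking.items():
    print(f"{name}: CV RMSE = {score:.4f}")

# Step 2: tune one shortlisted model (NuSVR here, chosen for illustration).
search = GridSearchCV(
    make_pipeline(StandardScaler(), NuSVR()),
    param_grid={"nusvr__nu": [0.25, 0.5, 0.75], "nusvr__C": [1.0, 10.0, 100.0]},
    scoring="neg_root_mean_squared_error",
    cv=5,
)
search.fit(X, y)
tuned_rmse = float(-search.best_score_)
print("best NuSVR params:", search.best_params_, "CV RMSE:", round(tuned_rmse, 4))
```

Because the default NuSVR settings (`nu=0.5`, `C=1.0`) are included in the grid and the same folds are used, the tuned score can only match or improve on the screened NuSVR score, which makes the before-and-after comparison straightforward.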

Conclusion

This paper treats the Neutron-Gamma Density (NGD) estimation as a regression problem using inelastic gamma-ray features and fast neutron features to predict bulk density. Machine learning-based regression algorithms are introduced to solve this problem. Our results show that the machine learning regressor completes density prediction and borehole correction in one step, which significantly simplifies the process of building a density prediction model which takes borehole correction into account

CRediT authorship contribution statement

Duo Dong: Conceptualization, Methodology, Writing – original draft, Visualization, participated in the data calculation and analysis, and experimental program writing. Wensheng Wu: Conceptualization, Methodology. Wenzheng Yue: Writing – review & editing. Yunlong Ge: Data calculation, Writing – review & editing. Shitao Xiong: Data calculation, Writing – review & editing. Wenqi Zhao: Writing – review & editing. Ruifeng Wang: Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

References (53)

A. Chalimourda et al. Experimentally optimal ν in support vector regression for different noise models and parameter settings. Neural Network. (2004)
V. Cherkassky et al. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Network. (2004)
J. Guo. An XGBoost-based physical fitness evaluation model using advanced feature selection and Bayesian hyper-parameter optimization for wearable running monitoring. Comput. Network. (2019)
G.A. Lujan-Moreno et al. Design of experiments and response surface methodology to tune machine learning hyperparameters, with a random forest case-study. Expert Syst. Appl. (2018)
R. Shi et al. Prediction and analysis of train arrival delay based on XGBoost and Bayesian optimization. Appl. Soft Comput. (2021)
H. Wang et al. Neutron transport correction and density calculation in the neutron-gamma density logging. Appl. Radiat. Isot. (2019)
Y. Xia et al. A boosted decision tree approach using Bayesian hyper-parameter optimization for credit scoring. Expert Syst. Appl. (2017)
J.D. Aitken. Radiation sources in drilling tools: comprehensive risk analysis in the design, development and operation of LWD tools
M.P. Archer et al. Pulsed Neutron Density Measurements: Modeling the Depth of Investigation and Cased-Hole Wellbore Uncertainties (1999)
M. Awad et al. Support Vector Regression, Efficient Learning Machines (2015)
A. Badruzzaman. An assessment of fundamentals of nuclear-based alternatives to conventional chemical source bulk density measurement. Petrophysics - The SPWLA Journal of Formation Evaluation and Reservoir Description (2014)
A. Badruzzaman et al. Radioactive Sources in Petroleum Industry: Applications, Concerns and Alternatives, Asia Pacific Health, Safety, Security and Environment Conference (2009)
L. Breiman. Random forests. Mach. Learn. (2001)
L. Buitinck. API Design for Machine Learning Software: Experiences from the Scikit-Learn Project (2013)
C.-C. Chang et al. Training v-support vector regression: theory and algorithms. Neural Comput. (2002)
C.-C. Chang et al. LIBSVM: a library for support vector machines. ACM Trans. Intelligent Syst Technol. (TIST) (2011)
T. Chen et al. Xgboost: a scalable tree boosting system
V. Cherkassky et al. Selection of meta-parameters for support vector regression
C.E. Rasmussen and C.K.I. Williams. Gaussian processes for machine learning. Int. J. Neural Syst. (2006)
S.S. Desai et al. Estimation of regression parameters using SVM with new methods for meta parameter. Int. J. Data Min. Model. Manag. (2015)
D.V. Ellis et al. Well Logging for Earth Scientists (2007)
M. Evans. Sourceless Neutron-Gamma Density (SNGD): A Radioisotope-free Bulk Density Measurement: Physics Principles, Environmental Effects, and Applications (2012)
P. Geurts et al. Extremely randomized trees. Mach. Learn. (2006)
T. Gjerdingen. Sourceless Neutron-Density Porosity Determination: Fit-For-Purpose Formation Evaluation with Significant HS&E Benefits (2012)
T. Goorley. MCNP6.1.1-Beta Release Notes (2014)
P.-Y. Hao. Pair-v-SVR: a novel and efficient pairing nu-support vector regression algorithm. IEEE Transact. Neural Networks Learn. Syst. (2016)

