Abstract
Identifying lithology from well logs is an important step in deep prospecting and resource estimation. Various machine learning algorithms have been adopted to identify lithology in oil and gas fields, but such algorithms are rarely applied to mineral deposits because of their complex geological conditions. In this paper, we propose an application framework that uses the gradient boosting decision tree (GBDT) algorithm to identify lithology from well logs in a mineral deposit. The hyperparameters of the GBDT classifier were optimized by grid search with cross-validation. For the Zhaoxian gold deposit, the study area, an optimized GBDT classifier was built to fit the association between a set of well logs and ten lithological classes. The results demonstrate that the GBDT classifier performs well in lithology identification, with a precision of 93.55%, a recall of 93.49% and an F1-score of 93.27%. Model interpretation based on feature importance and partial dependence plots indicates that resistivity contributes most to the lithology classification, followed by spontaneous potential and natural gamma. The study demonstrates that the GBDT model can enhance our understanding of lithology identification from well logs in mineral deposits, with significant implications for further exploration targeting the deep-seated parts of mineral deposits.
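The reported F1-score (93.27%) is lower than both the reported precision and recall, which cannot happen for a single harmonic mean but is expected when the scores are averaged over classes. The abstract does not state the averaging scheme, so the following is only a minimal sketch that assumes support-weighted per-class averaging. The two class labels "A" and "B" and the toy predictions are hypothetical and are not taken from the study.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1, support)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[t] += 1  # t was the truth but was missed
    scores = {}
    for c in labels:
        n_pred = tp[c] + fp[c]
        n_true = tp[c] + fn[c]
        precision = tp[c] / n_pred if n_pred else 0.0
        recall = tp[c] / n_true if n_true else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[c] = (precision, recall, f1, n_true)
    return scores


def weighted_average(scores):
    """Average precision, recall and F1 over classes, weighted by support."""
    total = sum(s[3] for s in scores.values())
    return tuple(
        sum(s[i] * s[3] for s in scores.values()) / total for i in range(3)
    )


# Hypothetical toy labels: four samples of class "A", four of class "B".
y_true = ["A"] * 4 + ["B"] * 4
y_pred = ["A", "A", "B", "B"] + ["B"] * 4

precision, recall, f1 = weighted_average(per_class_scores(y_true, y_pred))
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

In this toy case the averaged F1 falls below both the averaged precision and the averaged recall, because F1 is computed per class before averaging rather than from the averaged precision and recall.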
Code Availability
The code is written in Python. The source code and the associated data are available in the GitHub repository (https://github.com/orrange/Lithology-classification).
Acknowledgements
This work was supported by the National Key Research and Development Program of China (Grant No. 2019YFC1805905), the National Natural Science Foundation of China (41872249, 41772349, 41972309) and the Fundamental Research Funds for the Central Universities of Central South University (2020zzts653).
Cite this article
Zou, Y., Chen, Y. & Deng, H. Gradient Boosting Decision Tree for Lithology Identification with Well Logs: A Case Study of Zhaoxian Gold Deposit, Shandong Peninsula, China. Nat Resour Res 30, 3197–3217 (2021). https://doi.org/10.1007/s11053-021-09894-6