A multiple model framework based on time series clustering for shale gas well pressure prediction

https://doi.org/10.1016/j.jngse.2021.104135

Highlights

  • A specialized multiple model prediction framework is proposed for time-series data of shale gas adjustable wells.

  • A weighted warped K-means clustering (WWKM) algorithm is proposed to construct a series of data subsets for training multiple prediction models.

  • A multiple model prediction method based on sequence information (MMP-SI) is designed considering both the current gas demand and the overall pressure trend.

Abstract

Production performance analysis of shale gas based on dynamic parameters plays an increasingly important role as a scientific basis for gas well development plans: it describes the structural features of production and their evolution, and it predicts future production trends. Owing to the multi-gradient nature of gas demand and the high fluctuation of the yield adjustment period, it is difficult to identify whether a change in the production parameters of a gas well, especially the pressure, is caused by natural resource consumption or by manual production adjustment. Therefore, time-series prediction for adjustable yield wells is an important and challenging task. In this paper, a multiple model prediction framework is proposed. Specifically, the weighted warped K-means clustering (WWKM) algorithm is first presented to partition the dataset into a series of clusters while accounting for the significantly different influence of each variable. Thereafter, a multiple model prediction method based on sequence information (MMP-SI) is designed to improve prediction precision by integrating the overall decreasing trend and the local fluctuation features of the dataset. Subsequently, the proposed framework is applied to pressure prediction on real time-series data from three shale gas adjustable yield wells in the Fuling region of China. The experimental results show that the proposed framework achieves higher prediction precision than other state-of-the-art models under several evaluation criteria. The main benefit of this study is to help engineers and academic researchers better simulate shale gas wells in the future.

Introduction

Time-series data are observations arranged in chronological order, which reflect the state change and development tendency of industrial systems or behaviors over time in various fields worldwide (Gao et al., 2020). Nevertheless, it is difficult to construct a suitable mathematical model to describe time-series data because of their tendency, seasonality and periodicity. Therefore, time-series data modeling and prediction have become popular topics in the fields of data science and knowledge engineering (Khani and Farag, 2019, Zhou et al., 2019). Currently, most existing time-series prediction technologies can be roughly divided into two categories: statistical approaches and learning-based approaches. A statistical approach builds a mechanism prediction model based on statistical analysis, such as the moving average (MA) model (Gershenfeld and Weigend, 1993), the autoregressive (AR) model (Choong et al., 2009), the autoregressive integrated moving average (ARIMA) model (Khani and Farag, 2019), and the state space model (Min et al., 2018). However, these models have shown limitations such as overreliance on assumptions about data distribution and stability and a lack of generality when applied to industrial processes. With the rapid growth of data in volume and variability, learning-based modeling methods are showing promising abilities, such as strong data-fitting ability and parallelism, without relying on high-level expert knowledge. Satisfactory results have been achieved by using models such as support vector machines (SVMs) (Huang et al., 2019), fuzzy similarity-based methods (Han et al., 2019b, Vanhoenshoven et al., 2020), artificial neural networks (ANNs) (Sriram et al., 2019), and deep neural networks (DNNs) (Xing and Lv, 2019, Lei et al., 2020).

Industrial time-series data exhibit different properties depending on production demand and the specific physical environment, which poses a great challenge to the analysis and prediction of time-series data for specific industries. A typical challenging example is the production pressure, usually the casing pressure, of shale gas wells. Shale gas, which is natural gas extracted from shale formations, has become an important world energy source. Note that shale gas is released by injecting a mixed fluid under pressure to fracture the rock. Production performance analysis of shale gas wells runs through the whole life cycle of a well and is a central concern in reservoir engineering, especially when gas reservoir development is in its middle and late stages. The gas well parameters and predicted reserves can be evaluated by using the actual production data combined with appropriate treatments, which provides a reliable and objective basis for the development plan of the gas field. However, the complex multi-scale, multi-flow-space seepage characteristics of shale gas wells differ from those of conventional gas reservoirs. The established mechanism models cannot consider the unique seepage, adsorption and diffusion phenomena of shale gas as a whole, so it is difficult for them to describe the characteristics of shale reservoirs accurately. On the one hand, gas well pressure shows a downward trend as shale gas development continues. On the other hand, in practical production, the yield of a gas well needs to be adjusted frequently with changing gas demands and production levels; such wells are called shale gas adjustable yield wells (SGAYWs). These adjustments lead to uncertain fluctuations of gas well pressure. Limited by existing sensing technologies, only a few kinds of daily production data, i.e., gas yield, water content and casing pressure, can be obtained. Generally, casing pressure is the main factor used to guide yield scheduling, while an increase of water content in the casing often leads to sharp fluctuations of the casing pressure (Zhang et al., 2019b, Geng et al., 2018).

Fig. 1 shows the 471 sets of daily time-series data collected from SGAYW No.24-1HF in Fuling Prefecture in China from July 1, 2017, to April 20, 2019. Fig. 1 shows that the general trend of the pressures, namely the casing pressure (red line) and the tubing pressure (discarded because of incomplete records), is a decline as production time increases. In more detail, the pressures fluctuate under the combined influence of manually planned gas demand (green line) and uncertain water content (blue line).

Compared with other industrial time-series data, the production data of shale gas adjustable wells have the following characteristics:

  • Because of the multi-gradient nature of shale gas yield and the difficulty of correlation analysis, it is hard to determine whether a change in pressure comes from natural resource consumption or from artificial production adjustment.

  • The yield adjustment period fluctuates strongly: the length of the period changes whenever the gas production target is manually adjusted.

  • The current production pressure not only conforms to the overall trend of pressure change of the gas well but also correlates with the historical pressure change at the same production yield.

For such highly variable and insufficient industrial time-series data, hybrid prediction modeling has become the latest trend. It integrates multiple prediction techniques based on artificial intelligence, computational intelligence and data science to take full advantage of each single model and improve the prediction accuracy (Molin Ribeiro and Coelho, 2020). A hybrid regularized echo state network used sparse regression to compute the output weights (Xu et al., 2019). Strategies of approximate linear dependency, dynamic adjustment, and coherence criterion were combined with kernel recursive least squares for multivariate chaotic time-series online prediction to reduce computational complexity (Han et al., 2019a). A neural network incorporating bilinear projection and an attention mechanism was used for financial time-series forecasting (Tran et al., 2019). W. Huang et al. proposed a novel neural network based on polynomial neural networks and fuzzy wavelet neurons to describe high-order nonlinear relations between input and output variables (Huang et al., 2017). Although various predictive approaches can offer relatively accurate results for different applications (Mehana et al., 2021, Tang et al., 2021), they invariably ignore the internal relationship between the general decreasing trend and the local fluctuation information of industrial time-series data (Nguyen-Le et al., 2021). Therefore, it is beneficial to explore a new approach that reaches the required accuracy and achieves steady performance in pressure prediction for SGAYWs, which is an emerging field.

In this paper, we develop a novel multiple model framework to deal with the problem of time-series prediction of SGAYWs. The contributions of this paper can be summarized as follows.

  • A specialized multiple model prediction framework is proposed for time-series data of shale gas adjustable wells.

  • A weighted warped K-means clustering (WWKM) algorithm is proposed to construct a series of data subsets for training multiple prediction models, in which the correlation coefficients between variables are introduced as weighting factors to redefine the variation in the sum of quadratic error.

  • To obtain different time granularity prediction results, a multiple model prediction method based on sequence information (MMP-SI) is designed considering both the current gas demand and the overall pressure trend.

  • As a part of our work, a real-world time-series dataset from Fuling Prefecture in China is collected and used to demonstrate the effectiveness of the proposed modeling method.
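The idea behind the WWKM contribution, weighting each variable's share of the sum of quadratic error by a correlation coefficient, can be illustrated with a minimal sketch. This is not the published WWKM algorithm: the use of absolute Pearson correlation, the normalization of the weights, and the greedy one-sample boundary moves are all assumptions made for illustration, and the function names are hypothetical. Like warped K-means, the sketch keeps every cluster as a contiguous run of the sequence.

```python
import numpy as np

def correlation_weights(X, y):
    """Assumed weighting: |Pearson r| of each column of X with y, normalized to sum to 1.
    Columns must not be constant, otherwise the correlation is undefined."""
    r = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
    return r / r.sum()

def weighted_sqe(segment, w):
    """Weighted sum of quadratic error of one contiguous segment around its mean."""
    diff = segment - segment.mean(axis=0)
    return float((w * diff ** 2).sum())

def segment_sequence(X, w, k, max_iter=100):
    """Split X into k contiguous segments; returns k + 1 boundary indices.
    A boundary moves by one sample whenever that lowers the total weighted error."""
    n = len(X)
    bounds = [round(i * n / k) for i in range(k + 1)]  # equal-length start

    def total(b):
        return sum(weighted_sqe(X[b[i]:b[i + 1]], w) for i in range(k))

    for _ in range(max_iter):
        improved = False
        for i in range(1, k):
            for step in (-1, 1):
                cand = bounds.copy()
                cand[i] += step
                # Skip moves that would leave a segment empty.
                if cand[i] <= cand[i - 1] or cand[i] >= cand[i + 1]:
                    continue
                if total(cand) < total(bounds) - 1e-12:
                    bounds = cand
                    improved = True
        if not improved:
            break
    return bounds
```

With weights computed against the pressure series, a variable that barely correlates with pressure contributes little to the error and therefore has little influence on where the segment boundaries settle.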

The remainder of this paper is structured as follows. Section 2 gives a brief introduction to warped K-means and the multiple model approach. Section 3 describes the multiple model prediction framework in detail; in particular, the WWKM algorithm and the MMP-SI method are our main contributions. Experimental results on a real case study of shale gas pressure prediction are given and discussed in Section 4. Finally, conclusions are drawn in Section 5.

Section snippets

Preliminaries

A brief introduction to warped K-means (WKM) (Leiva and Vidal, 2013, Xie et al., 2019) and the multiple model (MM) approach is given in this section.

Framework overviews

As illustrated in Fig. 2, the proposed framework is divided into the following steps.

Step 1: The historical data stored in the database are selected as the training set, and abnormal data are identified and eliminated using data preprocessing methods (Salvador et al., 2016), such as the well-known Pauta criterion, combined with the actual work log. It should be pointed out that in our work the pressure at different time gradients is the input of the prediction model, the
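The Pauta criterion named in Step 1 is commonly understood as the three-sigma rule: a sample is treated as abnormal when it deviates from the mean by more than three standard deviations. The sketch below shows that rule only. The function name and the choice of the sample standard deviation are assumptions, and the snippet does not say whether the authors apply the rule to raw values or to detrended residuals.

```python
import numpy as np

def pauta_mask(values, k=3.0):
    """Return True for samples within k sample standard deviations of the mean.
    k = 3 gives the usual three-sigma (Pauta) rule."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    std = values.std(ddof=1)  # sample standard deviation (assumed choice)
    return np.abs(values - mean) <= k * std

# Usage: keep only the samples that pass the check.
pressure = np.array([10.0] * 20 + [100.0])  # made-up values with one spike
cleaned = pressure[pauta_mask(pressure)]
```

Because a declining pressure series has a moving mean, one plausible variant is to apply the same mask to the residuals of a fitted trend rather than to the raw values.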

Experiments and discussion

In practical applications, it is difficult for any single evaluation criterion to achieve a comprehensive judgment of the calculation results. In our work, several evaluation criteria are used, including mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), mean square error (MSE), etc. Generally, the smaller the values of the four evaluation criteria are, the closer the predicted value of the model is to the actual value, which means that the
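The four criteria named above have standard definitions, which the following sketch computes with NumPy. The function name is hypothetical, and MAPE is reported in percent, which is one common convention rather than something the snippet confirms.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and MAPE (in percent) for two equal-length series.
    MAPE is undefined when y_true contains zeros."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mse = float(np.mean(err ** 2))
    return {
        "MAE": float(np.mean(np.abs(err))),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100.0),
    }

# Usage with made-up numbers: errors of +2 and -2 on true values 10 and 20.
scores = regression_metrics([10.0, 20.0], [12.0, 18.0])
```

MAE and RMSE are in the units of the pressure itself, MSE is in squared units, and MAPE is scale-free, which is why the criteria are usually read together.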

Conclusions

Considering the influence of time and gas demand on the change in production pressure, a novel prediction framework is proposed in this paper. First, the WWKM algorithm is presented to divide time-series data into different clusters for modeling. Furthermore, the Elman neural network is used to build the multiple prediction model based on all of the time-series data and a series of clustering data. In the online prediction stage, the samples collected are first classified into a specific
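The overall structure described here, one model trained on all data plus one model per cluster, with new samples routed to a cluster at prediction time, can be sketched as follows. Several parts are assumptions: a least-squares linear model stands in for the Elman neural network, nearest-centroid assignment stands in for the classification step, and the fixed blend factor `alpha` stands in for the combination rule. All names are hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares linear model with intercept (stand-in for the Elman network)."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    return np.column_stack([X, np.ones(len(X))]) @ coef

def fit_multiple_models(X, y, labels):
    """Train a global model on all samples and a local model for each cluster label."""
    models = {"global": fit_linear(X, y), "local": {}, "centroids": {}}
    for c in np.unique(labels):
        idx = labels == c
        models["local"][int(c)] = fit_linear(X[idx], y[idx])
        models["centroids"][int(c)] = X[idx].mean(axis=0)
    return models

def predict_online(models, x, alpha=0.5):
    """Assign x to the nearest centroid, then blend global and local predictions.
    alpha = 1 uses only the global model, alpha = 0 only the local one."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    cents = models["centroids"]
    c = min(cents, key=lambda k: np.linalg.norm(x[0] - cents[k]))
    g = predict_linear(models["global"], x)[0]
    loc = predict_linear(models["local"][c], x)[0]
    return c, float(alpha * g + (1 - alpha) * loc)
```

The global model carries the overall trend and the local model carries the behavior of the matched cluster. How the two are actually combined in the paper is not recoverable from this snippet.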

CRediT authorship contribution statement

Jun Yi: Methodology, Writing – original draft. Xuemei Chen: Data curation, Software. Wei Zhou: Conceptualization, Supervision. Yufei Tang: Writing – review & editing. Chaoxu Mu: Validation, Reviewing.

References (34)

  • Choong M.K. et al. Autoregressive-model-based missing value estimation for DNA microarray time series data. IEEE Trans. Inf. Technol. Biomed. (2009)

  • Dodge Y. Spearman rank correlation coefficient. (2014)

  • Duan F. et al. Recognizing the gradual changes in sEMG characteristics based on incremental learning of wavelet neural network ensemble. IEEE Trans. Ind. Electron. (2017)

  • Ebadollahi S. et al. Wind turbine torque oscillation reduction using soft switching multiple model predictive control based on the gap metric and Kalman filter estimator. IEEE Trans. Ind. Electron. (2018)

  • Gao S. et al. Experiences and lessons learned from China’s shale gas development: 2005-2019. J. Natl. Gas Sci. Eng. (2020)

  • Gershenfeld N.A. et al. The Future of Time Series: Learning and Understanding. (1993)

  • Han M. et al. Multivariate chaotic time series online prediction based on improved kernel recursive least squares algorithm. IEEE Trans. Cybern. (2019)
    This work was supported by the National Natural Science Foundation of China under Grant 51805059, by the Natural Science Foundation of Chongqing under Grant cstc2019jcyj-msxmX0080, and in part by the Science and Technology Research Program of Chongqing Municipal Education Commission under Grants KJQN201801506, KJZD-K202001501 and KJCX2020051.
