Abstract

Stock price data have the characteristics of a time series. Long short-term memory (LSTM), a machine learning model, has the advantage of analyzing relationships among time series data through its memory function; building on it, we propose a stock price forecasting method based on CNN-LSTM. For comparison, we also use MLP, CNN, RNN, LSTM, and CNN-RNN forecasting models to predict the stock price one by one, and the forecasting results of these models are analyzed and compared. The data used in this research are the daily stock prices from July 1, 1991, to August 31, 2020, covering 7127 trading days. From the historical data, we choose eight features: opening price, highest price, lowest price, closing price, volume, turnover, ups and downs, and change. First, we adopt CNN to efficiently extract features from the data of the previous 10 days. Then, we adopt LSTM to predict the stock price from the extracted feature data. According to the experimental results, CNN-LSTM provides reliable stock price forecasting with the highest prediction accuracy among the compared models. This forecasting method not only provides a new research idea for stock price forecasting but also provides practical experience for scholars studying financial time series data.

1. Introduction

The trend of stock prices has long been regarded as a very important problem in the economic field [1]. Stock prices are affected by various internal and external factors, such as the domestic and foreign economic environment, the international situation, industry prospects, the financial data of listed companies, and stock market operation. Thus, forecasting methods also differ in emphasis [2, 3].

The traditional analysis method is based on economics and finance and mainly uses the fundamental analysis method and the technical analysis method. On the one hand, the fundamental analysis method pays more attention to the intrinsic value of stocks and qualitatively analyzes the external factors that affect the stock, such as interest rate, exchange rate, inflation, industrial policy, finance of listed companies, international relations, and other economic and political factors. On the other hand, the technical analysis method mainly focuses on the direction of the stock price, trading volume, and investors’ psychological expectations, and it primarily analyzes the stock index trajectory of individual stocks or the whole market by using the K-line chart and other tools. At present, traditional fundamental analysis and technical analysis are still the methods most commonly employed by many organizations and individual investors [4, 5].

The accuracy of the traditional fundamental analysis method is hardly convincing. The reason is not only that the influencing factors act over a long-term cycle but also that the forecasting results depend heavily on the professional quality of analysts. As a financial time series, stock data have the characteristics of a random walk [6]. Based on statistics and probability theory, some scholars use linear time series forecasting models to predict the short-term stock price from a large amount of long-term data, such as vector autoregression (VAR) [7], the Bayesian vector autoregression (BVAR) model [8], the autoregressive integrated moving average model (ARIMA) [9], and the generalized autoregressive conditional heteroskedasticity model (GARCH) [10]. However, the accuracy of using a time series model alone is questioned because financial time series are uncertain and noisy and because the relationship between independent variables and dependent variables is prone to change dynamically over time, which limits the further application and expansion of such models [11].

Predicting the stock price trend with a single linear time series forecasting model or a single neural network model has certain limitations. At present, combining the advantages of various methods and building hybrid methods from the best-performing algorithms is the development trend of deep learning for financial time series [12]. Therefore, in order to make the best of the time series characteristics of the data, deeply mine the data features, and improve the accuracy of stock price forecasting, this paper proposes a stock price forecasting method based on CNN-LSTM for forecasting the stock closing price of the next day. The method combines the advantages of convolutional neural networks (CNN), which can extract effective features from the data, and long short-term memory (LSTM), which can not only find the interdependence of data in time series but also automatically detect the best mode suitable for the relevant data; in this way, it can effectively improve the accuracy of stock price forecasting. The CNN-LSTM model uses CNN to extract the features of the input time data and uses LSTM to predict the stock closing price on the next day. In order to verify the effectiveness of the model, this paper uses the daily transaction data of 7127 trading days from July 1, 1991, to August 31, 2020, in which the data of the first 6627 trading days are the training set and the data of the last 500 trading days are the test set.

At present, the financial market is a noisy, nonparametric dynamic system, and there are two main kinds of forecasting methods for stock prices: traditional analysis methods and machine learning methods [13]. Traditional econometric methods or equations with parameters are not suitable for analyzing complex, high-dimensional, and noisy financial series data. In recent years, neural networks have become a hot research direction in the field of stock forecasting because they can extract data features from a large amount of high-frequency raw data without relying on prior knowledge. In 1988, White used a neural network to predict IBM stock, but the experimental results were not good [14]. In 2003, Zhang used a neural network and the autoregressive integrated moving average model (ARIMA) to forecast stocks, respectively. The experimental results showed that the neural network has obvious advantages in nonlinear data forecasting, but the accuracy still needed to be improved [15]. In 2005, Sun et al. proposed a time series forecasting method based on neural networks, which combines the optimal partition algorithm (OPA) and the radial basis function (RBF) neural network [16]. In 2014, Adhikari et al. proposed a method combining random walk (RW) and artificial neural network (ANN) to predict four financial time series, and the results showed a certain improvement in forecasting accuracy [17]. In 2018, Zhang et al. proposed a network structure for stock price forecasting based on the LM-BP neural network, which addressed the slow training speed and low precision of the traditional BP neural network training algorithm [18]. In 2018, the experimental results of Hu et al. showed that a convolutional neural network can predict time series and that deep learning is more suitable for solving time series problems. However, because CNN is more commonly used for image recognition and feature extraction, the forecasting accuracy of CNN alone is relatively low [19]. In 2020, Kamalov used MLP, CNN, and LSTM to forecast the stock price of four major US public companies. Experimental results showed that these three methods performed better than similar studies that forecast the direction of price change [20]. In 2020, Xue et al. established a high-precision short-term forecasting model of financial market time series based on the LSTM deep neural network and compared the BP neural network, the traditional RNN, and the improved LSTM deep neural network. The results showed that the LSTM deep neural network has high forecasting accuracy and can effectively predict the time series of the stock market [21].

The main contributions of this paper are as follows:

(1) By analyzing the correlation and time series characteristics of stock price data, a new deep learning method (CNN-LSTM) is proposed to predict the stock price. In this method, CNN is used to extract the time features of the data, and LSTM is used for data forecasting. It can make full use of the time sequence of stock price data to obtain more reliable forecasts.

(2) By comparing the evaluation indexes of CNN-LSTM with those of multilayer perceptron (MLP), CNN, RNN, LSTM, and CNN-RNN, it is shown that CNN-LSTM has high forecasting accuracy and is more suitable for stock price forecasting.

3. CNN-LSTM

3.1. CNN-LSTM Model

CNN has the characteristic of paying attention to the most obvious features in the line of sight, so it is widely used in feature engineering. LSTM has the characteristic of unfolding along the sequence of time, and it is widely used for time series. According to these characteristics of CNN and LSTM, a stock forecasting model based on CNN-LSTM is established. The model structure diagram is shown in Figure 1. The main structure consists of CNN and LSTM and includes an input layer, a one-dimensional convolution layer, a pooling layer, an LSTM hidden layer, and a fully connected layer.

3.2. CNN

CNN is a network model proposed by LeCun et al. in 1998 [22]. CNN is a kind of feedforward neural network that performs well in image processing and natural language processing [23], and it can also be effectively applied to the forecasting of time series. The local perception and weight sharing of CNN can greatly reduce the number of parameters, thus improving the efficiency of model learning [24]. CNN is mainly composed of two parts: the convolution layer and the pooling layer. Each convolution layer contains a plurality of convolution kernels, and its calculation formula is shown in formula (1). After the convolution operation of the convolution layer, the features of the data are extracted, but the extracted feature dimensions are very high; in order to solve this problem and reduce the cost of training the network, a pooling layer is added after the convolution layer to reduce the feature dimension:

$$l_t = \tanh\left(x_t * k_t + b_t\right), \tag{1}$$

where $l_t$ represents the output value after convolution, tanh is the activation function, $x_t$ is the input vector, $k_t$ is the weight of the convolution kernel, and $b_t$ is the bias of the convolution kernel.
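As an illustration of the convolution and pooling operations described above, the following is a minimal sketch in plain Python, not the implementation used in the experiments. It assumes a single input feature, a single kernel, no padding, and non-overlapping max pooling; the function names, the toy sequence, and the kernel values are hypothetical.

```python
import math


def conv1d_tanh(x, kernel, bias):
    """Slide one kernel over the sequence x (no padding) and apply tanh."""
    k = len(kernel)
    return [
        math.tanh(sum(x[t + j] * kernel[j] for j in range(k)) + bias)
        for t in range(len(x) - k + 1)
    ]


def max_pool1d(values, pool_size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [
        max(values[i:i + pool_size])
        for i in range(0, len(values) - pool_size + 1, pool_size)
    ]


# Toy single-feature sequence (illustrative values, not stock data).
sequence = [1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
features = conv1d_tanh(sequence, kernel=[0.5, -0.5], bias=0.0)
pooled = max_pool1d(features, pool_size=2)
print(features)
print(pooled)
```

The pooled list is shorter than the convolution output, which is the dimension reduction that the pooling layer is meant to provide. The model in this paper applies the same idea to eight input features and several kernels at once.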

3.3. LSTM

LSTM is a network model proposed by Hochreiter and Schmidhuber in 1997 [25]. It was designed to solve the longstanding problems of gradient explosion and gradient disappearance in RNN [26, 27]. It has been widely used in speech recognition, sentiment analysis, and text analysis, as it has its own memory and can make relatively accurate forecasts [28, 29]. In recent years, it has also been adopted in the field of stock market forecasting [30–32]. A standard RNN has only one repeating module, whose internal structure is simple: it is usually a single tanh layer. The repeating module of LSTM, in contrast, contains four layers that are similar to the layer of the standard RNN module and that interact in a special manner [33, 34]. The LSTM memory cell consists of three parts: the forget gate, the input gate, and the output gate, as shown in Figure 2.

The LSTM calculation process is as follows:

(1) The output value of the last moment and the input value of the current time are input into the forget gate, and the output value of the forget gate is obtained after calculation, as shown in the following formula:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right), \tag{2}$$

where $\sigma$ is the sigmoid function, the value range of $f_t$ is (0,1), $W_f$ is the weight of the forget gate, $b_f$ is the bias of the forget gate, $x_t$ is the input value of the current time, and $h_{t-1}$ is the output value of the last moment.

(2) The output value of the last moment and the input value of the current time are input into the input gate, and the output value and candidate cell state of the input gate are obtained after calculation, as shown in the following formulas:

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right), \tag{3}$$

$$\tilde{C}_t = \tanh\left(W_c \cdot [h_{t-1}, x_t] + b_c\right), \tag{4}$$

where the value range of $i_t$ is (0,1), $W_i$ is the weight of the input gate, $b_i$ is the bias of the input gate, $W_c$ is the weight of the candidate input gate, and $b_c$ is the bias of the candidate input gate.

(3) Update the current cell state as follows:

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t, \tag{5}$$

where $C_{t-1}$ is the cell state of the last moment and the gate values $f_t$ and $i_t$ lie in (0,1).

(4) The output $h_{t-1}$ and input $x_t$ are received as input values of the output gate at time $t$, and the output of the output gate is obtained as follows:

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right), \tag{6}$$

where the value range of $o_t$ is (0,1), $W_o$ is the weight of the output gate, and $b_o$ is the bias of the output gate.

(5) The output value of LSTM is obtained by calculating the output of the output gate and the state of the cell, as shown in the following formula:

$$h_t = o_t * \tanh\left(C_t\right). \tag{7}$$

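The five calculation steps above can be sketched as a single LSTM time step. This is a minimal illustration in plain Python that assumes a scalar input and a scalar hidden state, so each gate has one weight for the previous output, one weight for the current input, and one bias. The `params` layout, the gate keys, and all numeric values are hypothetical and are not taken from the trained model.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step for scalar input x_t and scalar states h_prev, c_prev.

    params maps a gate key ("f", "i", "c", "o") to a tuple (w_h, w_x, b):
    the weight on the previous output, the weight on the current input,
    and the bias of that gate.
    """
    def pre_activation(key):
        w_h, w_x, b = params[key]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre_activation("f"))        # forget gate, in (0, 1)
    i_t = sigmoid(pre_activation("i"))        # input gate, in (0, 1)
    c_tilde = math.tanh(pre_activation("c"))  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde        # updated cell state
    o_t = sigmoid(pre_activation("o"))        # output gate, in (0, 1)
    h_t = o_t * math.tanh(c_t)                # output value of the LSTM
    return h_t, c_t


# Illustrative weights only; a trained network would learn these values.
example_params = {
    "f": (0.1, 0.2, 0.0),
    "i": (0.3, -0.1, 0.0),
    "c": (0.0, 1.0, 0.0),
    "o": (0.2, 0.2, 0.0),
}
h, c = 0.0, 0.0
for x in [0.5, -0.2, 0.8]:
    h, c = lstm_step(x, h, c, example_params)
print(h, c)
```

Because the output gate lies in (0,1) and tanh lies in (-1,1), the output value always stays strictly between -1 and 1, whatever the weights are.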
3.4. CNN-LSTM Training and Prediction Process

The CNN-LSTM process of training and prediction is shown in Figure 3.

The main steps are as follows:

(1) Input data: input the data required for CNN-LSTM training.

(2) Data standardization: as there is a large gap in the input data, in order to train the model better, the z-score standardization method is adopted to standardize the input data, as shown in the following formula:

$$y_i = \frac{x_i - \bar{x}}{s}, \tag{8}$$

where $y_i$ is the standardized value, $x_i$ is the input data, $\bar{x}$ is the average of the input data, and $s$ is the standard deviation of the input data.

(3) Initialize network: initialize the weights and biases of each layer of the CNN-LSTM.

(4) CNN layer calculation: the input data are successively passed through the convolution layer and pooling layer in the CNN layer, the feature extraction of the input data is carried out, and the output value is obtained.

(5) LSTM layer calculation: the output data of the CNN layer are calculated through the LSTM layer, and the output value is obtained.

(6) Output layer calculation: the output value of the LSTM layer is input into the fully connected layer to get the output value.

(7) Calculation error: the output value calculated by the output layer is compared with the real value of this group of data, and the corresponding error is obtained.

(8) Judge whether the end condition is satisfied: the end conditions are that a predetermined number of cycles is completed, the weight is lower than a certain threshold, and the error rate of the forecasting is lower than a certain threshold. If one of the end conditions is met, the training is completed, the entire CNN-LSTM network is updated, and the process goes to step 10; otherwise, it goes to step 9.

(9) Error backpropagation: propagate the calculated error in the opposite direction, update the weight and bias of each layer, and go to step 4 to continue to train the network.

(10) Save the model: save the trained model for forecasting.

(11) Input data: input the data required for the forecasting.

(12) Data standardization: the input data are standardized according to formula (8).

(13) Forecasting: input the standardized data into the trained CNN-LSTM model, and then get the corresponding output value.

(14) Data standardized restore: the output value obtained through the CNN-LSTM model is a standardized value, and it is restored to the original scale as shown in formula (9):

$$x_i = y_i \cdot s + \bar{x}, \tag{9}$$

where $x_i$ is the standardized restored value, $y_i$ is the output value of the CNN-LSTM, $s$ is the standard deviation of the input data, and $\bar{x}$ is the average value of the input data.

(15) Output result: output the restored results to complete the forecasting process.
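The standardization in step 2 and the restore in step 14 are inverse operations, which a short sketch can make concrete. The code below is a minimal illustration that uses only the Python standard library. It assumes the population standard deviation, since the text does not say whether the sample or the population form is used. The function names and the price values are hypothetical.

```python
import statistics


def fit_standardizer(values):
    """Return the mean and standard deviation used for z-score scaling."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    return mean, std


def standardize(values, mean, std):
    """z-score standardization: subtract the mean, divide by the std."""
    return [(v - mean) / std for v in values]


def restore(values, mean, std):
    """Inverse of standardize: multiply by the std, add the mean."""
    return [v * std + mean for v in values]


# Illustrative closing prices only, not taken from the data set.
closing = [3300.0, 3350.0, 3400.0, 3380.0, 3420.0]
mean, std = fit_standardizer(closing)
scaled = standardize(closing, mean, std)
restored = restore(scaled, mean, std)
print(scaled)
print(restored)
```

In practice the mean and standard deviation would be computed once and reused in step 12, so that forecasting inputs are scaled in the same way as the training inputs.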

4. Experiments

In order to verify the effectiveness of CNN-LSTM, we compared this method with MLP, CNN, RNN, LSTM, and CNN-RNN using the same training set and test set data under the same operating environment. All the experiments were carried out on an Intel i7-4700H 2.6 GHz machine with 12 GB of RAM, a 500 GB hard disk, and Windows 10. The next day’s closing price is predicted from eight influence factors: opening price, highest price, lowest price, closing price, volume, turnover, ups and downs, and change.

4.1. Data

In this experiment, the Shanghai Composite Index (000001) is selected as the experimental data. The daily trading data of 7127 trading days from July 1, 1991, to August 31, 2020, are obtained from the Wind database. Each piece of data contains eight items, namely, opening price, highest price, lowest price, closing price, volume, turnover, ups and downs, and change. Some of the data are shown in Table 1. The data of the first 6627 trading days are taken as the training set, and the data of the last 500 trading days are taken as the test set.
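The split described above, together with the 10-day input window used by the model, can be sketched as follows. This is a minimal illustration on synthetic rows and does not reproduce the authors' preprocessing. It assumes that the closing price is at column index 3 and that windows are built inside each subset without crossing the split boundary. The function names are hypothetical.

```python
def split_by_day(rows, test_days=500):
    """Keep the last `test_days` rows for testing, the rest for training."""
    return rows[:-test_days], rows[-test_days:]


def make_windows(rows, target_index, time_step=10):
    """Pair each run of `time_step` consecutive days with the next day's target."""
    samples, targets = [], []
    for end in range(time_step, len(rows)):
        samples.append(rows[end - time_step:end])
        targets.append(rows[end][target_index])
    return samples, targets


# Synthetic stand-in for the data set: 7127 days with 8 features per day.
rows = [[float(day + feature) for feature in range(8)] for day in range(7127)]
train_rows, test_rows = split_by_day(rows)
x_train, y_train = make_windows(train_rows, target_index=3)
print(len(train_rows), len(test_rows), len(x_train))
```

Each training sample is then a 10 by 8 block of consecutive days, and its target is the closing price of the day that follows the block.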

4.2. Model Implementation

In order to evaluate the forecasting effect of CNN-LSTM, the mean absolute error (MAE), root mean square error (RMSE), and R-square (R2) are used as the evaluation criteria of the methods.

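The MAE calculation formula is as follows:

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|, \tag{10}$$

where $\hat{y}_i$ is the predictive value, $y_i$ is the true value, and $n$ is the number of samples. The smaller the value of MAE, the better the forecasting.

The RMSE calculation formula is as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}, \tag{11}$$

where $\hat{y}_i$ is the predictive value and $y_i$ is the true value. The smaller the value of RMSE, the better the forecasting.

The R2 calculation formula is as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{n}\left(\bar{y} - y_i\right)^2}, \tag{12}$$

where $\hat{y}_i$ is the predictive value, $y_i$ is the true value, and $\bar{y}$ is the average value. The value range of R2 is (0,1).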

The closer the value of MAE and RMSE to 0, the smaller the error between the predicted value and the real value, the higher the forecasting accuracy. The closer R2 is to 1, the better the fitting degree of the model is.
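The three evaluation criteria can be computed directly from the true and predicted values. The following is a minimal sketch using only the Python standard library; the function names and the small example lists are hypothetical and are not experimental results.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r_square(y_true, y_pred):
    """1 minus the ratio of residual sum of squares to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((mean_true - t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Small made-up example to show the calls.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r_square(y_true, y_pred))
```

For this example the absolute errors are 0.5, 0, 0.5, and 0, so MAE is 0.25, RMSE is the square root of 0.125, and R2 is 1 - 0.5/5 = 0.9.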

4.3. Implementation of CNN-LSTM

The parameter setting of the CNN-LSTM for this experiment is shown in Table 2.

According to the parameter setting of the CNN-LSTM network, the specific model is constructed as follows: the input training set data are a three-dimensional data vector (None, 10, 8), in which 10 is the size of the time_step and 8 is the number of input features. First, the data enter the one-dimensional convolution layer to further extract features and obtain a three-dimensional output vector (None, 10, 32), in which 32 is the number of convolution layer filters. Next, the vector enters the pooling layer, and a three-dimensional output vector (None, 10, 32) is also obtained. Then, the output vector enters the LSTM layer for training, and the output data (None, 64), in which 64 is the number of hidden units in the LSTM layer, enter a fully connected layer to get the output value. The specific CNN-LSTM model structure is shown in Figure 4.
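The layer-by-layer shapes described above can be traced with a few lines of plain Python, which also gives the number of trainable parameters implied by each layer. This is a sketch under stated assumptions rather than the experimental code: the kernel size of 1, the pooling that keeps the sequence length, and the single output unit are assumptions made here for illustration, and all function names are hypothetical. The LSTM count uses four weight sets (forget gate, input gate, candidate, and output gate), each acting on the previous output joined with the current input.

```python
def conv1d_same(shape, filters, kernel_size):
    """Convolution that keeps the time length; channels become `filters`."""
    steps, channels = shape
    params = kernel_size * channels * filters + filters  # weights + biases
    return (steps, filters), params


def pool_keep_length(shape):
    """Pooling that leaves the shape unchanged; it has no parameters."""
    return shape, 0


def lstm_last_state(shape, units):
    """LSTM that returns only its final output value."""
    _, features = shape
    # Four weight sets, each over (previous output + current input), plus biases.
    params = 4 * (units * (units + features) + units)
    return (units,), params


def dense(shape, units):
    """Fully connected layer."""
    (features,) = shape
    return (units,), features * units + units


trace = []
shape = (10, 8)  # (time_step, features); the batch axis "None" is omitted
layers = [
    ("conv1d", lambda s: conv1d_same(s, filters=32, kernel_size=1)),  # kernel size assumed
    ("pooling", pool_keep_length),
    ("lstm", lambda s: lstm_last_state(s, units=64)),
    ("dense", lambda s: dense(s, units=1)),  # one output: next day's closing price
]
for name, layer in layers:
    shape, n_params = layer(shape)
    trace.append((name, shape, n_params))

for name, out_shape, n_params in trace:
    print(f"{name:8s} output={out_shape} params={n_params}")
```

The traced shapes (10, 32), (10, 32), and (64,) match the output vectors given in the text once the batch axis is dropped.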

5. Results

After the processed training set data are used to train MLP, CNN, RNN, LSTM, CNN-RNN, and CNN-LSTM, respectively, the trained models are used to predict the test set data, and the real values are compared with the predicted values as shown in Figures 5–10.

In Figures 5–10, ranked from highest to lowest by the degree to which the predicted-value line fits the real-value line, the six forecasting methods are CNN-LSTM, CNN-RNN, LSTM, CNN, RNN, and MLP. CNN-LSTM has the highest degree of fit, with the two lines almost coinciding, and MLP has the lowest degree of fit.

According to the predicted value and real value of each method, the evaluation index of each method can be calculated, and the comparison results of the six methods are shown in Table 3 and Figures 11–13.

From Table 3 and Figures 11–13, the MAE and RMSE of MLP are the largest and its R2 is the smallest, while the MAE and RMSE of CNN-LSTM are the smallest and its R2 is the largest and the closest to 1.

Comparing LSTM with RNN, MAE decreases by 4.0%, from 29.916 to 28.712; RMSE decreases by 4.5%, from 42.957 to 41.003; and R2 increases by 0.3%, so LSTM is better than RNN. However, the error measurement indexes MAE and RMSE of CNN-LSTM are the smallest of all, and its R2 is the largest and close to 1. Compared with LSTM, which has no CNN layer, the CNN-LSTM proposed in this paper has lower MAE and RMSE and a somewhat higher R2: MAE decreases by 4.0%, from 28.712 to 27.564; RMSE decreases by 3.2%, from 41.003 to 39.688; and R2 increases by 0.2%. This shows that the forecasting performance of LSTM can be effectively improved by extracting data features through CNN.
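The percentage decreases quoted above follow from the reported MAE and RMSE values as a relative change, (before - after) / before. A short check in plain Python, using only the numbers given in the text, is sketched below; the function name and the labels are hypothetical.

```python
def relative_decrease(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# (label, value before, value after), all taken from the text above.
comparisons = [
    ("MAE, RNN to LSTM", 29.916, 28.712),
    ("RMSE, RNN to LSTM", 42.957, 41.003),
    ("MAE, LSTM to CNN-LSTM", 28.712, 27.564),
    ("RMSE, LSTM to CNN-LSTM", 41.003, 39.688),
]
decreases = {
    label: round(relative_decrease(before, after), 1)
    for label, before, after in comparisons
}
for label, percent in decreases.items():
    print(f"{label}: {percent}%")
```

Rounded to one decimal place, the four values are 4.0%, 4.5%, 4.0%, and 3.2%, in agreement with the percentages stated in the text.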

The results show that the performance of CNN-LSTM is the best among the six methods. In terms of forecasting accuracy, its MAE of 27.564 and RMSE of 39.688 are the smallest among the six forecasting models. In terms of fitting degree, the R2 of CNN-LSTM is 0.9646, which is improved by 2.2%, 0.6%, 0.5%, and 0.2%, respectively, compared with the other four methods. Therefore, the CNN-LSTM proposed in this paper is superior to the comparative models in terms of fitting degree and error value. It can predict the closing price of the next day well and provide a reference for investors’ investment decisions.

6. Conclusions

According to the chronological characteristics of stock price data, this paper proposes a CNN-LSTM model to predict the stock closing price of the next day. The method uses the opening price, highest price, lowest price, closing price, volume, turnover, ups and downs, and change of the stock data as the input, making full use of the time sequence characteristics of the stock data. CNN is used to extract the features of the input data. LSTM is used to learn the extracted feature data and predict the closing price of the stock on the next day. This paper takes the relevant data of the Shanghai Composite Index as an example to verify the method. The experimental results show that CNN-LSTM has the highest forecasting accuracy and the best performance compared with MLP, CNN, RNN, LSTM, and CNN-RNN. Its MAE and RMSE are the smallest of all methods, and its R2 is close to 1. CNN-LSTM is suitable for the forecasting of stock prices and can provide a relevant reference for investors seeking to maximize investment returns. CNN-LSTM also provides practical experience for research on financial time series data. However, the model still has some shortcomings. For example, it only considers the impact of stock price data on closing prices and fails to integrate sentiment factors such as news and national policy into the forecast. Our future research work is mainly to add sentiment analysis of stock-related news and national policies, so as to further improve the accuracy of stock forecasts.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was funded by the Soft Science Project of Hebei Province, Grant 205576142D, and Humanities and Social Science Research Project of Hebei Education Department, Grant SD201010.