Abstract

The stock market is affected by economic conditions, policy, and other factors, and its internal dynamics are extremely complex. With the rapid development of the stock market and the growing number of investors, the market produces a large volume of transaction data, which makes it more difficult to extract valuable information. Because deep neural networks are good at prediction problems with large amounts of data and complex nonlinear mappings, this paper proposes an attention-guided deep neural network algorithm for stock prediction. The paper combines a daily emotion index derived from stock-related social media text with stock technical indicators as the data source and applies them to a long short-term memory (LSTM) neural network model to predict the stock market. The stock emotion index is extracted by a social media text emotion classification model built on a bidirectional long short-term memory (Bi-LSTM) network with an attention mechanism and the GloVe word vector representation algorithm. In addition, a dimensionality reduction model based on decision tree (DT) and principal component analysis (PCA) is constructed to reduce the dimensionality of the stock technical indicators and extract the main information. Furthermore, this paper proposes a model based on NASNet for pattern recognition. The recognition results can be used to automatically identify short-term K-line patterns, predict reliable trading signals, and help investors customize efficient short-term investment strategies. The experimental results show that the prediction accuracy of the proposed algorithm can reach 98.6%, which has high application value.

1. Introduction

The stock market is an important part of a country’s economy and strongly affects the formulation of individual and national investment strategies. Program trading is the development trend of future stock exchange markets [1]. The formulation of automated strategies, as an important part of program trading, directly affects long-term and short-term investment income [2]. Data-driven stock market forecasting provides more reliable buying and selling signals for automatic trading strategies, which can help maximize the user’s investment income. The stock market is a complex, nonlinear environment affected by many variable factors, which fall mainly into five groups: (1) economic variables, (2) company-specific variables, (3) factory-specific variables, (4) political variables, and (5) investor psychological variables. Successfully predicting the trend of the stock market and capturing its behavior patterns in such a complex environment is regarded as a highly meaningful and challenging task [3]. As an indispensable part of individual and national economies, the stock market has long been a hot research topic. In the field of data analysis, researchers now recognize, in contrast to earlier studies, that the stock market is a whole composed of a large number of stocks and that stock indexes are highly correlated. At the same time, recent developments in sensor networks and communication technology make it possible to collect massive stock data. How to process massive stock data effectively, predict the trend of the stock market, and capture its behavior patterns has therefore become the focus of research [4].

However, the accuracy of stock forecasting is limited by many factors. First, stock data is a random-walk financial time series, and literature [5] demonstrates that the difficulty of financial time series prediction lies in its high noise. If a statistical model is used to predict a financial time series, the data must be preprocessed before being input into the model, which can destroy the integrity and authenticity of the data. Second, stock data is nonlinear and nonstationary. Literature [6] argues that the nonlinearity and nonstationarity of stock data limit the applicability of traditional multiple regression and linear regression models, and that only higher-level models can accurately describe such nonlinear financial time series. Finally, the stock market is affected by many complex and uncertain factors, such as long-term macro policies, short-term market expectations, and emergencies or international events. Therefore, this paper uses an attention-based deep neural network method to predict stocks.

2.1. Stock Forecasting Based on Regression Method

Scholars at home and abroad have tried many methods to predict the stock market. Literature [7] used the vector autoregressive (VAR) model, error correction model (ECM), and Kalman filter model (KFM) to predict the UK stock market in 1996. Literature [8] uses the Bayesian vector autoregressive (BVAR) model to predict the portfolio return of some large German companies, but the prediction effect is poor. Based on stock data from the New York Stock Exchange and the Nigeria Stock Exchange, literature [9] attempts to use the ARIMA model to predict stock prices. The results show that the model has short-term prediction potential [10]. However, because stock data is a nonlinear and nonstationary financial time series and the factors affecting the stock price are numerous and complex, traditional statistical methods and econometric models need to preprocess the input data and cannot handle too much data, which results in unsatisfactory prediction performance. Literature [11] uses the support vector machine (SVM), which has good generalization ability and fast computation, to predict stock market prices. Literature [12] applies the gray system model to the prediction of China’s stock market and obtains reasonable and reliable results. Literature [13] uses a model that integrates a genetic algorithm (GA) with high convergence and an artificial neural network (ANN) to predict the stock market. In addition to basic machine learning algorithms, researchers have also tried new algorithm frameworks based on machine learning. Among them, literature [14] uses a K-nearest-neighbor neural network (Knn-Bp) to predict the stock market. Literature [15] uses a fuzzy model, which avoids empirical subjectivity and selects objectively, to predict the daily opening price, closing price, maximum price, and minimum price of the stock market. However, early studies considered only some simple factors influencing stocks, resulting in relatively low prediction accuracy [16].

2.2. Stock Forecasting Based on Neural Network

Literature [17] found that the accuracy of neural network models [18–20] in predicting nonlinear time series data is much higher than that of the ARIMA model. Literature [21] compares the prediction effects of Bayesian estimation and neural network models under different criteria. The results show that the neural network model predicts better. Literature [22] established an AR model and RBF and GRNN neural network models to predict the opening price, closing price, highest price, and lowest price of the Shanghai stock index, compared the predictions with the actual prices, and analysed the errors. The study demonstrated the effectiveness of the three models: the AR model is relatively unstable, the RBF and GRNN networks train very quickly, and GRNN shows a better effect [23].

Literature [24] proposed a model integrating optimized bacterial chemotaxis (IBCO) and the back propagation (BP) artificial neural network algorithm to effectively predict various stock indexes. However, early artificial neural networks have a large number of parameters [25] and are prone to problems such as overfitting and vanishing gradients, so their prediction accuracy is not high. With the introduction of dropout and other structures into neural networks, deep neural networks have become a hot spot in stock market prediction [26]. Deep learning models with good performance include the convolutional neural network (CNN), the classical recurrent neural network (RNN), and the recurrent convolutional neural network (RCNN) [27]. Among them, LSTM, a kind of recurrent neural network, can integrate long-term and short-term information well and alleviate the problems of vanishing and exploding gradients. LSTM is widely used in time series prediction [28]. Literature [29] developed a powerful adaptive online gradient learning algorithm based on LSTM to predict time series with outliers. Literature [30] combines an LSTM deep network with a basic statistical algorithm to predict multistep time series.

3. Stock Index Prediction Algorithm Based on Attention-Guided Deep Neural Network

3.1. Structure Design for Stock Index Prediction Algorithm

Many studies have pointed out that investors’ emotional indicators and stock technical indicators are positively correlated with changes in the stock market, so using emotional indicators extracted from social media together with traditional technical indicators to predict the stock market has become a research hot spot [31]. On the technical side, the excellent performance of deep learning in natural language and time series tasks makes successful stock market prediction possible. In this section, a system model is constructed to predict the stock price and its trend from historical emotional indicators and technical indicators over a period of days, so as to provide reliable prediction results for long-term investors and help them formulate efficient investment strategies. The system model is shown in Figure 1.

As can be seen from Figure 1, this paper constructs a Bi-LSTM [32] text classification model based on attention and GloVe. The trained model is used to predict the emotion class of stock-related social media texts in real time, and emotional indicators are extracted from the prediction results. This section also constructs a dimensionality reduction model [33] based on decision tree and principal component analysis to reduce the dimensionality of the stock technical indicators. Finally, the emotional and technical indicators are applied to the LSTM model to predict the price and trend of the stock market.

3.2. Structure Design of Social Media Text Emotion Classification Model

The text emotion classification model is a bidirectional long short-term memory (Bi-LSTM) neural network based on the attention mechanism and GloVe. The model structure is shown in Figure 2. The classification model is composed of one GloVe layer, two Bi-LSTM layers, and one attention layer.

For a stock-related social media text sequence $S$, because the LSTM structure is introduced into the classification model, it is necessary to truncate the sequence to a fixed length $T$, where $T$ corresponds to the concept of “memory length” in the LSTM structure. As can be seen from Figure 2, the text sequence $S$, as the input of the GloVe layer, is output as a matrix of word vectors. The experimental results show that classification with 200-dimensional word vectors is the most accurate. Therefore, the word vector dimension is set to 200 in this paper. As the first layer of the model, the GloVe algorithm combines the advantages of the latent semantic analysis (LSA) algorithm and the continuous bag-of-words (CBOW) algorithm. It trains faster and scales better to large corpora. The loss function of the GloVe algorithm can be defined as

$$J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2,$$

where $V$ is the vocabulary size, $X_{ij}$ represents the number of occurrences of word $j$ in the context of word $i$, $f(X_{ij})$ is a weight function used to weigh the influence between two words, $w_i$ and $\tilde{w}_j$ represent the vectors of word $i$ and word $j$, respectively, and $b_i$ and $\tilde{b}_j$ are the bias terms.
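The GloVe loss can be sketched in a few lines of plain Python. This is a minimal illustration rather than the training code used in this paper: the weighting function with `x_max = 100` and `alpha = 0.75` follows the defaults of the original GloVe publication, and the toy co-occurrence counts and vectors are hypothetical.

```python
import math

def glove_weight(x, x_max=100.0, alpha=0.75):
    """Weighting function f(x); caps the influence of very frequent pairs.
    x_max and alpha are the GloVe defaults, assumed here."""
    return (x / x_max) ** alpha if x < x_max else 1.0

def glove_loss(cooc, w, w_ctx, b, b_ctx):
    """Weighted least-squares GloVe loss over nonzero co-occurrence counts.

    cooc: dict mapping (i, j) -> X_ij
    w, w_ctx: lists of word and context vectors
    b, b_ctx: lists of word and context bias terms
    """
    total = 0.0
    for (i, j), x_ij in cooc.items():
        if x_ij <= 0:
            continue  # log(0) is undefined, so zero counts are skipped
        dot = sum(a * c for a, c in zip(w[i], w_ctx[j]))
        err = dot + b[i] + b_ctx[j] - math.log(x_ij)
        total += glove_weight(x_ij) * err ** 2
    return total

# Toy example (hypothetical data): 2 words with 2-dimensional vectors
cooc = {(0, 1): 4.0, (1, 0): 4.0}
w = [[0.1, 0.2], [0.3, 0.4]]
w_ctx = [[0.5, 0.6], [0.7, 0.8]]
b = [0.0, 0.0]
b_ctx = [0.0, 0.0]
loss = glove_loss(cooc, w, w_ctx, b, b_ctx)
```

In practice the vectors and biases would be optimized by gradient descent on this loss; the sketch only evaluates it.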

The trained stock-related social media emotion classification model predicts emotion categories, from which the stock emotion index is computed. The index is defined in terms of $N_{pos}$ and $N_{neg}$, where $N_{pos}$ and $N_{neg}$ denote the number of texts in which investors have positive and negative attitudes towards a stock, respectively.
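As an illustration of how an index can be built from the positive and negative text counts, the sketch below uses the logarithmic bullishness ratio that is common in sentiment studies. This specific form is an assumption made for the example and is not necessarily the index used in this paper; the function name is hypothetical.

```python
import math

def bullishness_index(n_pos, n_neg):
    """Log ratio of positive to negative text counts (assumed form).

    Adding 1 to both counts keeps the index finite when a count is zero.
    """
    return math.log((1 + n_pos) / (1 + n_neg))

# Hypothetical daily counts for one stock
index_bullish = bullishness_index(320, 180)
index_neutral = bullishness_index(250, 250)
index_bearish = bullishness_index(180, 320)
```

The index is positive when positive texts dominate, zero when the counts are equal, and negative otherwise.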

3.3. Method Design of Bi-LSTM

The output of the GloVe layer is used as the input of the Bi-LSTM layer. LSTM, the core structure of Bi-LSTM, is a kind of recurrent neural network that is superior to the classical recurrent neural network in terms of “memory.”

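As can be seen from Figure 3, LSTM is composed of a group of connected blocks called “memory blocks,” which can be regarded as a differentiable version of the memory in a digital computer. LSTM is mainly composed of a forget gate, an input gate, an output gate, and a memory cell, which are computed as

$$
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right), \\
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right), \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tanh\left(W_C [h_{t-1}, x_t] + b_C\right), \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right), \\
h_t &= o_t \odot \tanh(C_t),
\end{aligned}
$$

where $f_t$ and $i_t$ are the intermediate functions of the forget gate and input gate, respectively, and $C_t$ and $o_t$ are the output functions of the state gate and the output gate, respectively. $h_t$ is the output function of the LSTM neurons. In addition, $W_f$, $W_i$, $W_C$, and $W_o$ represent the weight parameters of the network, and $b_f$, $b_i$, $b_C$, and $b_o$ are the corresponding bias terms; $x_t$ is the input at step $t$, $\sigma$ is the sigmoid function, and $\odot$ denotes element-wise multiplication.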
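A single LSTM time step can be written out directly from the gate definitions. The sketch below is a plain-Python illustration under the standard LSTM formulation; the parameter names (`W_f`, `W_i`, `W_c`, `W_o` and the matching biases) and the all-zero toy weights are assumptions chosen so the result is easy to check by hand.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def affine(W, v, b):
    """Return W @ v + b for a matrix W given as a list of rows."""
    return [sum(w_k * v_k for w_k, v_k in zip(row, v)) + b_i
            for row, b_i in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: returns the new hidden state h_t and cell state c_t."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]    # forget gate
    i_t = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]    # input gate
    c_hat = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]  # candidate state
    o_t = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]    # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy example: hidden size 2, input size 3, all weights and biases zero
hidden, inputs = 2, 3
zero_matrix = [[0.0] * (hidden + inputs) for _ in range(hidden)]
params = {name: zero_matrix for name in ("W_f", "W_i", "W_c", "W_o")}
params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([1.0, 2.0, 3.0], [0.0, 0.0], [1.0, 1.0], params)
```

With zero weights every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved. A Bi-LSTM runs one such recurrence forward and one backward over the sequence and concatenates the two hidden states at each step.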

3.4. Attention Layer Design for Stock Prediction

As the fourth layer of the classification model, the attention mechanism is introduced in this section. Conventional natural language processing models operate by “reading” a complete sentence and compressing its information into a fixed-length vector. Information loss, incomplete transformation, and other problems can be expected when a sentence composed of hundreds of words is compressed into a low-dimensional vector, and the attention mechanism solves these problems to a certain extent. It allows the machine to traverse the information of the whole sentence and then produce reasonable results according to the current word and the whole sentence. The attention layer is designed as in Figure 4.

The attention mechanism first calculates the attention weights from $\mathrm{score}(h_t, \bar{h}_s)$, where $h_t$ and $\bar{h}_s$ represent the target state and source state, respectively. In this classification model, the source state $\bar{h}_s$ is the output of the bidirectional LSTM layer at step $s$, and the set of $\bar{h}_s$ comprises all the states generated by the bidirectional LSTM. The score function is designed to balance the influence of the weights.
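The weighting step of an attention layer can be illustrated with a small plain-Python sketch. The dot-product score used here is an assumption, since the exact score function of the model is not specified in the text; the toy states are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(target, sources):
    """Dot-product attention (assumed score function).

    target: target state vector
    sources: list of source state vectors (e.g. Bi-LSTM outputs per step)
    Returns the attention weights and the weighted context vector.
    """
    scores = [sum(t * s for t, s in zip(target, h_s)) for h_s in sources]
    weights = softmax(scores)
    dim = len(sources[0])
    context = [sum(w * h_s[k] for w, h_s in zip(weights, sources))
               for k in range(dim)]
    return weights, context

# Toy example: three source states of dimension 2
sources = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
target = [1.0, 1.0]
weights, context = attention(target, sources)
```

The source state most aligned with the target receives the largest weight, and the context vector is what would be passed on to the softmax classification layer.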

After the data passes through the attention layer and then a simple softmax layer, the emotion types of social media texts can be predicted. At this point, a complete emotion classification model for stock-related social media texts has been constructed. After being trained on the training set with the cost function and the back propagation algorithm, the emotion classification model can classify social media texts in real time.

4. Short-Term Stock Prediction Model Based on NASNet

4.1. Structure Design of NASNet for Stock Prediction

After years of research, stock researchers have pointed out that short-term minute-level K-line pattern charts can provide useful trading signals and help investors formulate efficient investment strategies. However, investment strategies based on K-line patterns currently rely on capturing the patterns by manual observation or hard coding. The former consumes manpower, and the latter is limited by fixed range settings and cannot capture K-line patterns flexibly. Therefore, a method is needed that can capture K-line patterns automatically and comprehensively and provide an automatic, high-return strategy according to the captured results.

Image processing models based on convolutional neural networks can obtain good results on large-scale training sets, but the amount of computation is very large. In order to train a good model on a small training set and still obtain good results when transferring to a large training set, the authors of the NASNet model propose building an RNN controller to automatically generate two structural units: the normal cell and the reduction cell. These two cells are used as basic units, together with basic convolutional neural network units, to build a complete model. Figure 5 shows the structure of the NASNet model for stock prediction. As can be seen from the figure, the network has only several more layers than the one used on the small CIFAR-10 dataset, and it is still composed of normal cells, reduction cells, and convolutional layer units as basic units. This shows that the NASNet structure can be transferred simply from a small dataset to a large dataset; this design significantly reduces the amount of computation on large datasets and lowers the computational cost.

4.2. Structure of Reduction Cell and Normal Cell

The normal cell and the reduction cell are the most important core cells in NASNet. Therefore, the two cell structures need to be described in detail. The structure of the normal cell and the reduction cell is shown in Figure 6.

In the NASNet structure, each cell is built from five blocks, so it can be seen in Figure 6 that there are five addition operations in both the normal cell and the reduction cell. In addition, the structure does not directly concatenate the unused hidden layers. Instead, all hidden layers created in the convolutional cell are sent to the next layer even if they are currently used.

5. Experiment and Result Analysis

5.1. Experimental Environment and Dataset Source

In order to facilitate the statistics of the experimental results, I selected 10 U.S. stocks in different industries as the experimental objects, including Amazon (AMZN), Apple (AAPL), Facebook (FB), Google (Google), Microsoft (MSFT), Netflix (NFLX), QQQ, S&P (SPY), Twitter (TWTR), and Tesla (TSLA). The social media platform used to extract emotional indicators in this experiment is Twitter. For the 10 stocks, a total of 7.3 million tweets covering 1461 trading days were crawled and used as the social media data source for extracting emotional indicators. In order to obtain reliable emotion indicators, this paper constructs a social media text emotion classifier to improve the accuracy of emotion prediction. Therefore, it is necessary to label some tweets as training and test sets. A total of 12670 tweets are labeled in this experiment and divided into three emotion types. To further guarantee the reliability of the extracted emotion indicators, each stock is required to have at least 500 tweets a day.

This experiment uses the first 6650 data points as the training set and the remaining 121 data points as the test set. The data include seven indicators: opening price, closing price, highest price, lowest price, daily trading volume, rise and fall range, and turnover rate, which correspond to the feature dimensions at each time step in the model. This paper uses all seven indicators of the current day to predict the next day’s closing price.
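The pairing of each day's indicators with the next day's closing price, followed by a chronological split, can be sketched as follows. The helper names, the two-column toy rows, and the split fraction are hypothetical and only illustrate the data layout.

```python
def make_next_day_pairs(rows, close_index):
    """Pair the indicators of day t with the closing price of day t + 1.

    rows: list of daily indicator vectors in chronological order
    close_index: position of the closing price inside each vector
    """
    features = rows[:-1]                                  # days 0 .. n-2
    targets = [row[close_index] for row in rows[1:]]      # closes of days 1 .. n-1
    return features, targets

def chronological_split(features, targets, n_train):
    """Keep time order: the first n_train samples train, the rest test."""
    train = (features[:n_train], targets[:n_train])
    test = (features[n_train:], targets[n_train:])
    return train, test

# Toy example with two indicators per day: [open, close]
rows = [[1.0, 1.5], [1.5, 2.0], [2.0, 2.5], [2.5, 3.0], [3.0, 3.5]]
features, targets = make_next_day_pairs(rows, close_index=1)
train_set, test_set = chronological_split(features, targets, n_train=3)
```

Splitting by position rather than at random avoids training on days that come after the test period.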

TensorFlow is selected as the experimental platform, and the experimental environment is a computer equipped with a GTX 1080 Ti graphics card and 32 GB of memory. The initial learning rate of the model is 0.0007, and training runs for a total of 2000 epochs (one epoch means that all samples in the training set participate in training once). The data are first standardized: for each feature dimension, the mean is subtracted and the result is divided by the standard deviation. The standardized data are then input into the model for training.
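The standardization step can be sketched in plain Python. Computing the statistics on the training rows only and reusing them for the test rows is an assumption made here to avoid information leakage; the text does not state which rows the statistics are computed on. The function names and toy values are hypothetical.

```python
import math

def fit_standardizer(rows):
    """Per-feature mean and (population) standard deviation."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(dims)]
    stds = [math.sqrt(sum((r[k] - means[k]) ** 2 for r in rows) / n)
            for k in range(dims)]
    return means, stds

def standardize(rows, means, stds):
    """Subtract the mean and divide by the standard deviation per feature.
    Constant features (std == 0) are mapped to 0 to avoid division by zero."""
    return [[(r[k] - means[k]) / stds[k] if stds[k] > 0 else 0.0
             for k in range(len(r))]
            for r in rows]

# Toy example: two features on very different scales
train = [[10.0, 100.0], [12.0, 300.0], [14.0, 200.0]]
means, stds = fit_standardizer(train)
scaled = standardize(train, means, stds)
```

After scaling, each feature of the training data has zero mean and unit variance, so price and volume indicators contribute on a comparable scale.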

5.2. Verification of Emotion Index Classification Results

In this section, 12670 tweets are selected for emotion classification and labeling. In order to keep the data balanced and the model parameters adjustable, the number of texts in each category is comparable: 4753 positive texts, 4703 negative texts, and 4215 neutral texts. In addition, the experiment shows that the classification result is more accurate when each text is truncated to a length of 20. Therefore, the truncation length is set to 20 in this part of the experiment. In the experiment, 70% of the labeled texts are used as the training set and 30% as the test set. To evaluate the performance of the social media text emotion classification model, accuracy, recall, precision, and $F_1$ value are selected as evaluation measures. To verify the effectiveness of the model, Bi-LSTM, CNN+Bi-LSTM, and GloVe+Bi-LSTM models are constructed as controls. The experimental results are shown in Table 1.
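The four evaluation measures can be computed for a three-class problem as in the sketch below. Macro averaging over the classes is an assumption, since the averaging scheme is not stated in the text; the labels -1, 0, and 1 for negative, neutral, and positive texts and the toy predictions are hypothetical.

```python
def classification_report(y_true, y_pred, labels):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    k = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
    }

# Toy example: -1 = negative, 0 = neutral, 1 = positive
y_true = [1, 1, 0, 0, -1, -1]
y_pred = [1, 0, 0, 0, -1, 1]
report = classification_report(y_true, y_pred, labels=[-1, 0, 1])
```

Macro averaging weights the three classes equally, which is reasonable here because the class sizes are of similar magnitude.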

As can be seen from Table 1, compared with the other methods, the social media text emotion classification model constructed in this section performs best on the test set (accuracy 0.7659, recall 0.7282, $F_1$ value 0.7663, and accuracy 0.75). The comparison with the Bi-LSTM and GloVe+Bi-LSTM models illustrates the role of the GloVe algorithm and the attention mechanism in the natural language classification task. The accuracy of the CNN+Bi-LSTM model is 7% lower than that of the constructed classification model, mainly because social media text is relatively short and colloquial, whereas the CNN structure is better at handling long text.

In order to display the classification results of the three kinds of emotional text (positive, negative, and neutral) intuitively, I label neutral text as 0, positive text as a positive value, and negative text as a negative value. The resulting classification of emotional text by the attention-GloVe-Bi-LSTM model is shown in Figure 7. Figures 7(a) and 7(b) show the classification results, Figure 7(c) shows the classification heat map, and Figure 7(d) shows the clustering probability. As can be seen from the figure, the algorithm separates the three types of emotional text into clusters with obvious boundaries, which shows that the emotional text classification algorithm proposed in this paper has high classification accuracy. This paper therefore provides a social media text emotion classifier with high emotion prediction accuracy.

5.3. Relationship between CNN Model Performance and NASNet Structure

This section explores the impact of important structural parameters on model performance by adjusting the structural parameters of the NASNet model. First, the convolution kernel in Figure 5 is replaced by the square convolution kernels often used in image tasks, with four kernel sizes ranging from $2 \times 2$ to $5 \times 5$. The prediction curves of the models are shown in Figure 8. As can be seen, the $2 \times 2$ kernel performs the worst, with an average error as high as 1.6357%, which is much higher than that of the other three kernels. This is because a convolution kernel that is too small has difficulty extracting global information from the data, especially from stock data with significant global correlation. The $5 \times 5$ kernel shows the best effect, possibly because a larger convolution kernel can extract richer correlation information between the feature dimensions. However, none of the above models matches the performance of the optimized CNN model, which demonstrates the effectiveness of optimizing the convolution kernel for the form of the data.

As can be seen in Figure 8, model performance shows a downward trend as the number of convolution blocks increases. There are two main reasons. On the one hand, the stock data structure is relatively simple and far less complex than image data. Therefore, a relatively simple network structure can approach the upper limit of the model, and a more complex network only leads to serious overfitting. On the other hand, CNN is not well suited to time series tasks. The results show that the optimized convolution kernel has strong prediction ability and prediction accuracy.

As shown in Figure 8, the short-term K-line pattern recognition model based on NASNet is used to automatically identify four common short-term K-line patterns. The experimental results show that the model identifies the four patterns well, with an accuracy of 98.6%. This result not only frees investors from the heavy work of observing stock market changes and helps them customize automatic, efficient investment strategies but also provides a reliable trading signal for constructing automatic trading strategies in the future.

5.4. Accuracy Verification of Stock Forecast

Stock market price forecasting is very important for investors when making strategies. Therefore, in this part of the experiment, I mainly study the prediction of the closing price of the stock market from emotional indicators and technical indicators, and the prediction of stock time series data by NASNet. A total of 6 emotional indicators and 82 technical indicators are introduced, so 88 indicator features in total are available to predict changes in the stock market. In order to improve the prediction accuracy, the heat map method is used to select the indicators whose correlation coefficient with the closing price is no less than 0.9, which yields 22 indicator features. These 22 indicators are used as inputs to the NASNet structure to predict the closing price. To evaluate the experimental results of this part, the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), mean square error (MSE), correlation coefficient ($R$), and nonlinear regression multiple correlation coefficient ($R^2$) are selected as the evaluation indexes. The formulas of these indexes are shown in Table 2, where $y_i$ represents the real value of the closing price, $\hat{y}_i$ represents the predicted value of the closing price, $\bar{y}$ represents the average value of the real price in the test set, and $n$ represents the number of samples in the test set. The model based on NASNet is used to identify the above K-line patterns. The loss value of the training process on the test set is shown in Figure 9.
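The correlation-based indicator selection described above can be sketched as a simple filter on the Pearson correlation with the closing price. The use of the Pearson coefficient is an assumption, and the indicator names and values are hypothetical toy data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / math.sqrt(vx * vy)

def select_indicators(indicators, close, threshold=0.9):
    """Keep the indicators whose correlation with the close is >= threshold."""
    return [name for name, values in indicators.items()
            if pearson(values, close) >= threshold]

# Toy example with three hypothetical indicators
close = [1.0, 2.0, 3.0, 4.0, 5.0]
indicators = {
    "moving_average": [1.1, 2.0, 3.2, 3.9, 5.1],
    "volume": [5.0, 1.0, 4.0, 2.0, 3.0],
    "inverse": [5.0, 4.0, 3.0, 2.0, 1.0],
}
selected = select_indicators(indicators, close)
```

Only the indicator that moves almost linearly with the closing price passes the 0.9 threshold; the weakly and negatively correlated ones are dropped.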

It can be clearly seen from Figure 9 that training the NASNet-based image recognition model from pretrained model parameters converges quickly, and the loss values on the training set and the test set are very low. In order to evaluate the K-line pattern recognition ability of the NASNet model on small-scale datasets, accuracy is selected as the evaluation standard. At the same time, to verify the advantages of NASNet in small-dataset image recognition, the experiment compares it with recent recognition algorithms from the image field.

In order to verify the effect of the improved NASNet model on the prediction of stock closing prices with time series characteristics, the LSTM model, ridge regression, k-nearest neighbors, decision tree, and support vector machine (SVM) algorithms are introduced as the comparison group. In addition, the improved NASNet model requires many parameters to be fine-tuned. To keep the whole experimental process comparable, the experiment selects the parameters with the best prediction results and fixes these settable parameters as constants. The output is set to 128 units, and the dropout rate is set to 0.2. For the models of the control group and the experimental group, the data of the 1461 trading days from January 1, 2014, to December 31, 2017, are used as the original data, of which 70% form the training set and 30% the test set. The experimental results are shown in Table 2.

In the stock closing price prediction experiment, the model proposed in this paper performs best (RMSE: 0.35, MAE: 0.20, MAPE: 1.13, $R$: 0.39, $R^2$: 0.20, and MSE: 0.13). As can be seen, the accuracy of the improved NASNet model in predicting the closing price is much higher than that of the control group algorithms, which illustrates the excellent performance of the improved NASNet model in the time series prediction task. In future research, the improved NASNet model can also be used to predict more stock market trends.
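The error measures reported above can be computed as in the following sketch. The standard textbook definitions are assumed here (MAPE in percent, Pearson correlation for the correlation coefficient, and one minus the ratio of residual to total sum of squares for the coefficient of determination); they may differ in detail from the formulas in Table 2. The toy prices are hypothetical.

```python
import math

def regression_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, MAPE (percent), correlation r, and r2 (assumed standard forms)."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    ss_pred = sum((p - mean_p) ** 2 for p in y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mae,
        "mape": mape,
        "r": cov / math.sqrt(ss_tot * ss_pred),
        "r2": 1.0 - ss_res / ss_tot,
    }

# Toy example: real and predicted closing prices
y_true = [100.0, 102.0, 104.0, 106.0]
y_pred = [101.0, 101.0, 105.0, 105.0]
metrics = regression_metrics(y_true, y_pred)
```

Because MAPE divides by the real price, it is only defined when no real closing price is zero, which holds for stock prices.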

6. Conclusion

In the field of stock forecasting, research on data-driven stock forecasting is of great significance to the stock market and to the future development of automated trading. Taking social media text and stock technical indicators as data sources, this paper constructs a bidirectional long short-term memory model based on the attention mechanism and GloVe word representation to extract stock emotion indicators from social media text, and a dimensionality reduction model based on the fusion of principal component analysis and decision tree to reduce the dimensionality of the stock technical indicators. The extracted stock emotion indicators and dimensionality-reduced technical indicators, and the K-line charts, are used as inputs of the improved NASNet model to predict the long-term stock market price and the stock trend, respectively. The experimental results show that the proposed method for predicting the stock market price and the stock trend can effectively improve the prediction accuracy. In addition, the comparison with other methods further verifies the superiority of the improved NASNet model that takes emotional text into account. However, the stock market is a complex and nonlinear environment. There is still much room for improvement in prediction accuracy and strategy formulation, so further research is needed in the future.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The author does not have any possible conflicts of interest.