An efficient Long Short-Term Memory model based on Laplacian Eigenmap in artificial neural networks

https://doi.org/10.1016/j.asoc.2020.106218

Abstract

A new algorithm for data prediction based on the Laplacian Eigenmap (LE) is presented. We construct a Long Short-Term Memory model with the application of the LE in artificial neural networks. The new Long Short-Term Memory model based on the Laplacian Eigenmap (LE-LSTM) preserves the characteristics of the original data using the eigenvectors derived from the Laplacian matrix of the data matrix. LE-LSTM introduces a projection layer that embeds the data into a lower-dimensional space, thereby improving efficiency. With the implementation of LE, LE-LSTM provides higher accuracy and shorter running time on various simulated data sets with multivariate, sequential, and time-series characteristics. In comparison with previously reported algorithms such as stochastic gradient descent and a three-layer artificial neural network, LE-LSTM leads to many more successful runs and learns much faster. The algorithm provides a computationally efficient approach for most artificial neural network data sets.

Introduction

The Artificial Neural Network (ANN) has attracted an enormous amount of attention in recent years in the field of artificial intelligence [1]. Researchers have concentrated on ANN applications including image processing [2], [3], speech recognition [4], medical diagnosis [5], [6], robots [7], [8], [9], semantic object parsing [10], speech enhancement [11], stock price prediction [12], etc. Compared with traditional prediction algorithms, the ANN can fit data of arbitrary shape and achieve higher prediction accuracy. However, the ANN incurs a high computational cost as the number of hidden layers or neurons increases [13]. Therefore, finding a highly accurate and efficient ANN algorithm for data sets with different features remains a challenging problem.

A general ANN consists of three layers: an input layer, a hidden layer, and an output layer. There is no feedback mechanism when an ANN processes sequence information, which leads to a decline in prediction accuracy. Through in-depth study of neural networks, researchers found that this defect of the traditional neural network restricts its long-term development. The recurrent neural network (RNN) [15], a correction to this defect, introduces a feedback mechanism in the hidden layer, which can learn the contextual information of sequence data. Recently, RNNs have been applied in various research fields, such as sentiment analysis [16], machine translation [17], etc. RNNs are trained using the backpropagation through time algorithm [18]; therefore, the defect of high computational complexity remains. In order to satisfy the real-time requirements of special scenarios, improve prediction accuracy, reduce loss, and speed up convergence, many researchers have presented improved methods to optimize RNN algorithms. Mohammed et al. [19] formulated the kinematic control problem of Stewart platforms as a constrained quadratic program, obtained the Karush–Kuhn–Tucker conditions of the problem by considering the dual space, and then designed a dynamic neural network to solve the optimization problem recurrently. A novel primal–dual neural network was designed by Li et al. [20]; this model directly took noise control into account and was able to achieve accurate tracking of reference trajectories in a noisy environment. Xiao et al. proposed two nonlinear recurrent neural networks based on two nonlinear activation functions to solve general time-varying linear matrix equations [21]. Kusupati et al. developed the FastRNN and FastGRNN algorithms to address the twin RNN limitations of inaccurate training and inefficient prediction [22]. Xiao et al. presented a systematic and constructive procedure using zeroing neural dynamics to design control laws based on the efficient solution of the dynamic Lyapunov equation [23], and then designed two robust nonlinear zeroing neural networks, by adding two novel nonlinear activation functions, for finding the solution of the Lyapunov equation in the presence of various noises [24]. Miikkulainen et al. proposed an automated method, CoDeepNEAT, for optimizing deep learning architectures through evolution [25].

Based on the RNN, the Long Short-Term Memory (LSTM) algorithm [26], a mainstream deep learning model, adds a filtering step for previous states, which selects the states with the greatest influence on the current state. LSTM can therefore mitigate the vanishing gradient problem, in which the gradient becomes very small over time and learning eventually stops. Kuchaiev et al. presented two simple ways of reducing the number of parameters and accelerating the training of large LSTM networks [27]. Karim et al. proposed the augmentation of fully convolutional networks with LSTM and RNN sub-modules for time series classification [28]. Elsaid et al. developed an RNN capable of predicting aircraft engine vibrations using LSTM neurons [29]. Miikkulainen et al. proposed a new method, the evolution of a tree-based encoding of gated memory nodes [30].
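
For reference, the filtering described above is realized by the forget, input, and output gates of a standard LSTM cell. The following is a minimal NumPy sketch of one time step of such a cell; the stacked weight layout, parameter names, and dimensions are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    W, U, b hold the stacked parameters for the forget, input, output,
    and candidate transformations (shapes: 4H x D, 4H x H, and 4H).
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b          # pre-activations for all gates
    f = sigmoid(z[0:H])                   # forget gate: how much old state to keep
    i = sigmoid(z[H:2 * H])               # input gate: how much new candidate to add
    o = sigmoid(z[2 * H:3 * H])           # output gate: how much state to expose
    g = np.tanh(z[3 * H:4 * H])           # candidate cell state
    c_t = f * c_prev + i * g              # filtered mix of previous and new state
    h_t = o * np.tanh(c_t)                # hidden state passed to the next step
    return h_t, c_t
```

Because the forget gate multiplies the previous cell state rather than repeatedly squashing it, gradients can flow over many time steps, which is the mechanism behind the mitigation of the vanishing gradient problem mentioned above.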

In the traditional approach, the LSTM algorithm usually maps the original data into memory units according to the shape of the data, so the mapped data may lose features or hidden information present in the original data. Essentially, this defect may result in an inaccurate prediction model. In order to fix this defect, we introduce the idea of the Laplacian Eigenmap (LE) [31] to solve the problem of data encapsulation, and propose a new algorithm, Long Short-Term Memory based on the Laplacian Eigenmap (LE-LSTM). This algorithm is similar to the standard RNN [15] with a certain number of hidden layers. We replace each normal node with a memory unit containing one node with a self-connecting loop edge of fixed weight 1, which indicates feedback with a decay of one time step [26]. This approach uses the LE to transform the original data without losing the information hidden in the data and achieves good performance in prediction.
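
For illustration, the embedding step can be sketched along the lines of the standard Laplacian Eigenmap procedure: build a k-nearest-neighbour graph over the samples, weight the edges with a heat kernel, and take the bottom non-trivial eigenvectors of the generalized eigenproblem L y = λ D y. The NumPy/SciPy sketch below assumes this standard recipe; the neighbourhood size k, kernel width t, and target dimension m are illustrative parameters, not the authors' exact settings.

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmap(X, m=2, k=10, t=1.0):
    """Embed the rows of X (n samples x d features) into m dimensions."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # k-nearest-neighbour adjacency with heat-kernel edge weights.
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]        # skip the point itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    W = np.maximum(W, W.T)                        # symmetrize the graph
    D = np.diag(W.sum(axis=1))                    # degree matrix
    L = D - W                                     # unnormalized graph Laplacian
    # Generalized eigenproblem L y = lambda D y; eigenvalues in ascending order.
    vals, vecs = eigh(L, D)
    # Drop the trivial constant eigenvector (eigenvalue approximately 0).
    return vecs[:, 1:m + 1]
```

The resulting low-dimensional coordinates preserve local neighbourhood structure of the original data, which is the property the projection layer of LE-LSTM relies on before the data are fed into the memory units.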

The remainder of this paper is organized as follows. In Section 2, we describe the idea, model, steps, computational complexity, and pseudocode of the LE-LSTM algorithm. We present some evaluation metrics and discuss the numerical results in Section 3. In Section 4, experimental results are discussed and we show that the proposed algorithm achieves higher training and testing accuracy, lower training and testing loss, and shorter running time than the LSTM. Furthermore, the numerical performance of LE-LSTM and the other competitive algorithms is discussed in detail. We draw conclusions and outline future work in Section 5.

Overall our paper makes the following contributions:

  • This study presents a new method in data prediction via Long Short-Term Memory Model based on Laplacian Eigenmap.

  • The LE-LSTM preserves the characteristics of the original data using eigenvectors derived from the Laplacian matrix of the data matrix.

  • This algorithm provides higher accuracy and shorter running time than other state-of-the-art algorithms on representative data sets with multivariate, sequential, and time-series characteristics.

  • The proposed algorithm yields satisfactory performance and computational efficiency.

Section snippets

Idea of the algorithm

The LSTM algorithm, one of the RNN models, differs from feedforward and backpropagation neural networks. LSTM has the defect that the mapped data suffers a loss of features or hidden information from the original data. In order to fix this defect, we introduce the idea of the LE to optimize the LSTM model. LE-LSTM is similar to an ANN and has a certain number of hidden layers. We replace each normal node with a memory unit containing one node with a self-connecting loop edge of fixed weight 1, which

Loss function

The loss function, a significant component of a neural network, measures the difference, a non-negative value, between the predicted values ŷ and the expected values y. The robustness of a model increases as the value of the loss function decreases. The loss function is also the core component of the empirical risk function and the structural risk function. In general, the structural risk function of a model consists of an empirical risk term and a regularization term, and is defined as θ* = argmin_θ [ (1/N) Σ_{i=1}^{N} L(y_i, ŷ_i) + λ Φ(θ) ], where the first term is the empirical risk averaged over the N training samples and Φ(θ) is the regularization term weighted by λ.
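
As a concrete example, the following is a minimal sketch of such a structural risk, assuming a mean-squared-error empirical loss and an L2 regularizer with weight lam; these are illustrative choices, not necessarily those used in the paper.

```python
import numpy as np

def structural_risk(y_true, y_pred, theta, lam=1e-3):
    """Empirical risk (mean squared error) plus an L2 regularization term."""
    empirical_risk = np.mean((y_true - y_pred) ** 2)   # average per-sample loss
    regularization = lam * np.sum(theta ** 2)          # penalizes large parameters
    return empirical_risk + regularization
```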

Comparison of LE-LSTM and LSTM

We compared the proposed LE-LSTM algorithm with the LSTM algorithm on eight data sets. The numerical results for training/testing accuracy, training/testing loss, and running time are shown in Tables 2–5.

According to Fig. 11, the proposed LE-LSTM algorithm achieves higher training/testing accuracy, lower training/testing loss, and better overall performance than the original LSTM algorithm on most data sets.

Based on Fig. 12, we show that the proposed

Conclusion

In this paper, we introduced a fast and low-cost RNN-based algorithm built on the combination of LE and LSTM. Numerical experiments show that the proposed algorithm provides high accuracy and short running time on various data sets. In comparison with the LSTM and other state-of-the-art algorithms, we believe that the LE-LSTM algorithm presented here is able to achieve highly accurate prediction at lower cost. With the implementation of the LE algorithm, we are able to process sequential data sets

CRediT authorship contribution statement

F. Hu and Y.H. Zhu conceived of the presented idea. F. Hu and J. Liu developed the theory and performed the computations. Y.H. Zhu and L.H. Li verified the analytical methods. J. Liu encouraged Y.H. Zhu to investigate and supervised the findings of this work. All authors discussed the results and contributed to the final manuscript.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

We acknowledge funding support from the Natural Science Foundation of Hubei Province (2018CFB259).

References (47)

  • Umamaheswari, J., et al., Improving speech recognition performance using spectral subtraction with artificial neural network, Int. J. Adv. Stud. Sci. Res. (2018).

  • Hu, F., et al., A time simulated annealing-back propagation algorithm and its application in disease prediction, Modern Phys. Lett. B (2018).

  • Rahmani, M., et al., A novel adaptive neural network integral sliding-mode control of a biped robot using bat algorithm, J. Vib. Control (2018).

  • Xiao, L., et al., Solving time-varying system of nonlinear equations by finite-time recurrent neural networks with application to motion tracking of robot manipulators, IEEE Trans. Syst. Man Cybern. A (2019).

  • Liang, X., et al., Semantic object parsing with graph LSTM.

  • Weninger, F., et al., Speech enhancement with LSTM recurrent neural networks and its application to noise-robust ASR.

  • Selvin, S., et al., Stock price prediction using LSTM, RNN and CNN-sliding window model.

  • Hopfield, J.J., Artificial neural networks, IEEE Circuits Devices Mag. (1988).

  • Lipton, Z.C., et al., A critical review of recurrent neural networks for sequence learning (2015).

  • Pal, S., et al., Sentiment analysis in the light of LSTM recurrent neural networks, Int. J. Synth. Emot. (2018).

  • Zhang, J., et al., Towards machine translation in semantic vector space, ACM Trans. Asian Low-Resource Lang. Inf. Process. (2015).

  • Mozer, M.C., A focused backpropagation algorithm for temporal pattern recognition, Complex Syst. (1989).

  • Mohammed, A.M., et al., Dynamic neural networks for kinematic redundancy resolution of parallel Stewart platforms, IEEE Trans. Cybern. (2015).