An efficient Long Short-Term Memory model based on Laplacian Eigenmap in artificial neural networks

https://doi.org/10.1016/j.asoc.2020.106218

Abstract

A new algorithm for data prediction based on the Laplacian Eigenmap (LE) is presented. We construct a Long Short-Term Memory model with the application of the LE in artificial neural networks. The new Long Short-Term Memory model based on the Laplacian Eigenmap (LE-LSTM) preserves the characteristics of the original data using the eigenvectors derived from the Laplacian matrix of the data matrix. LE-LSTM introduces a projection layer that embeds the data into a lower-dimensional space, thereby improving efficiency. With the implementation of LE, LE-LSTM provides higher accuracy and shorter running time on various simulated data sets with multivariate, sequential, and time-series characteristics. In comparison with previously reported algorithms such as stochastic gradient descent and a three-layer artificial neural network, LE-LSTM leads to many more successful runs and learns much faster. The algorithm provides a computationally efficient approach for most artificial neural network data sets.

Introduction

The Artificial Neural Network (ANN) has attracted an enormous amount of attention in recent years in the field of artificial intelligence [1]. Researchers have concentrated on ANN applications including image processing [2], [3], speech recognition [4], medical diagnosis [5], [6], robots [7], [8], [9], semantic object parsing [10], speech enhancement [11], stock price prediction [12], etc. Compared with traditional prediction algorithms, the ANN can fit data of arbitrary shape and achieve higher prediction accuracy. However, the ANN incurs a high computational cost as the number of hidden layers or neurons increases [13]. Therefore, finding a highly accurate and efficient ANN algorithm for data sets with different features remains a challenging problem.

A general ANN consists of three layers: an input layer, a hidden layer, and an output layer. There is no feedback mechanism when an ANN processes sequence information, which leads to a decline in prediction accuracy. Through in-depth study of neural networks, researchers found that this defect of the traditional neural network restricts its long-term development. The recurrent neural network (RNN) [15], a correction to this defect, introduces a feedback mechanism in the hidden layer, which can learn the contextual information of sequence data. Recently, RNNs have been applied in various research fields, such as sentiment analysis [16], machine translation [17], etc. RNNs are trained using the backpropagation through time algorithm [18]; therefore, the defect of high computational complexity remains. In order to satisfy the real-time requirements of special scenarios, improve prediction accuracy, reduce loss, and speed up convergence, many researchers have presented improved methods to optimize RNN algorithms. Mohammed et al. [19] formulated the kinematic control problem of Stewart platforms as a constrained quadratic program, obtained the Karush–Kuhn–Tucker conditions of the problem by considering the dual space, and then designed a dynamic neural network to solve the optimization problem recurrently. A novel primal–dual neural network was designed by Li et al. [20]; this model directly took noise control into account and was able to achieve accurate tracking of reference trajectories in a noisy environment. Xiao et al. proposed two nonlinear recurrent neural networks based on two nonlinear activation functions to solve general time-varying linear matrix equations [21]. Kusupati et al. developed the FastRNN and FastGRNN algorithms to address the twin RNN limitations of inaccurate training and inefficient prediction [22]. Xiao et al. presented a systematic and constructive procedure using zeroing neural dynamics to design control laws based on the efficient solution of the dynamic Lyapunov equation [23], and then designed two robust nonlinear zeroing neural networks, by adding two novel nonlinear activation functions, for finding the solution of the Lyapunov equation in the presence of various noises [24]. Miikkulainen et al. proposed an automated method, CoDeepNEAT, for optimizing deep learning architectures through evolution [25].

Based on the RNN, the Long Short-Term Memory (LSTM) algorithm [26], a mainstream deep learning model, adds a filtering step for previous states, which selects the states with the greatest influence on the current state. LSTM can therefore mitigate the vanishing gradient problem, in which the gradient becomes very small over time and learning eventually stops. Kuchaiev et al. presented two simple ways of reducing the number of parameters and accelerating the training of large LSTM networks [27]. Karim et al. proposed the augmentation of fully convolutional networks with LSTM and RNN sub-modules for time series classification [28]. Elsaid et al. developed an RNN capable of predicting aircraft engine vibrations using LSTM neurons [29]. Miikkulainen et al. proposed a new method, the evolution of a tree-based encoding of gated memory nodes [30].
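
For reference, the filtering described above is realized by the forget, input, and output gates of a standard LSTM cell. The following is a minimal NumPy sketch of one time step of such a cell; the stacked weight layout, parameter names, and dimensions are illustrative assumptions, not the configuration used in this paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_cell_step(x_t, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    W, U, b hold the stacked parameters for the forget, input, output,
    and candidate transformations (shapes: 4H x D, 4H x H, and 4H).
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b          # pre-activations for all gates
    f = sigmoid(z[0:H])                   # forget gate: how much old state to keep
    i = sigmoid(z[H:2 * H])               # input gate: how much new candidate to add
    o = sigmoid(z[2 * H:3 * H])           # output gate: how much state to expose
    g = np.tanh(z[3 * H:4 * H])           # candidate cell state
    c_t = f * c_prev + i * g              # filtered mix of previous and new state
    h_t = o * np.tanh(c_t)                # hidden state passed to the next step
    return h_t, c_t
```

Because the forget gate multiplies the previous cell state rather than repeatedly squashing it, gradients can flow over many time steps, which is the mechanism behind the mitigation of the vanishing gradient problem mentioned above.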

In the traditional approach, the LSTM algorithm usually maps the original data into memory units according to the shape of the data, so the mapped data may lose features or hidden information present in the original data. Essentially, this defect may result in an inaccurate prediction model. In order to fix this defect, we introduce the idea of the Laplacian Eigenmap (LE) [31] to solve the problem of data encapsulation, and propose a new algorithm, Long Short-Term Memory based on the Laplacian Eigenmap (LE-LSTM). This algorithm is similar to the standard RNN [15] with a certain number of hidden layers. We replace each normal node with a memory unit containing one node with a self-connecting loop edge of fixed weight 1, which indicates feedback with a decay of one time step [26]. This approach uses the LE to transform the original data without losing the information hidden in the data and achieves good performance in prediction.
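
For illustration, the embedding step can be sketched along the lines of the standard Laplacian Eigenmap procedure: build a k-nearest-neighbour graph over the samples, weight the edges with a heat kernel, and take the bottom non-trivial eigenvectors of the generalized eigenproblem L y = λ D y. The NumPy/SciPy sketch below assumes this standard recipe; the neighbourhood size k, kernel width t, and target dimension m are illustrative parameters, not the authors' exact settings.

```python
import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmap(X, m=2, k=10, t=1.0):
    """Embed the rows of X (n samples x d features) into m dimensions."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # k-nearest-neighbour adjacency with heat-kernel edge weights.
    W = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[1:k + 1]        # skip the point itself
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    W = np.maximum(W, W.T)                        # symmetrize the graph
    D = np.diag(W.sum(axis=1))                    # degree matrix
    L = D - W                                     # unnormalized graph Laplacian
    # Generalized eigenproblem L y = lambda D y; eigenvalues in ascending order.
    vals, vecs = eigh(L, D)
    # Drop the trivial constant eigenvector (eigenvalue approximately 0).
    return vecs[:, 1:m + 1]
```

The resulting low-dimensional coordinates preserve local neighbourhood structure of the original data, which is the property the projection layer of LE-LSTM relies on before the data are fed into the memory units.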

The remainder of this paper is organized as follows. In Section 2, we describe the idea, model, steps, computational complexity, and pseudocode of the LE-LSTM algorithm. We present some evaluation metrics and discuss the numerical results in Section 3. In Section 4, experimental results are discussed and we show that the proposed algorithm achieves higher training and testing accuracy, lower training and testing loss, and shorter running time than the LSTM. Furthermore, the numerical performance of LE-LSTM and the other competitive algorithms is discussed in detail. We draw conclusions and outline future work in Section 5.

Overall our paper makes the following contributions:

  • This study presents a new method in data prediction via Long Short-Term Memory Model based on Laplacian Eigenmap.

  • The LE-LSTM preserves the characteristics of the original data using eigenvectors derived from the Laplacian matrix of the data matrix.

  • This algorithm provides higher accuracy and shorter running time than other state-of-the-art algorithms on representative data sets with multivariate, sequential, and time-series characteristics.

  • The proposed algorithm yields satisfactory performance and computational efficiency.

Section snippets

Idea of the algorithm

The LSTM algorithm, one of the RNN models, differs from feedforward and backpropagation neural networks. LSTM has the defect that the mapped data suffers a loss of features or hidden information from the original data. In order to fix this defect, we introduce the idea of the LE to optimize the LSTM model. LE-LSTM is similar to an ANN and has a certain number of hidden layers. We replace each normal node with a memory unit containing one node with a self-connecting loop edge of fixed weight 1, which

Loss function

The loss function, a significant component of a neural network, measures the difference, a non-negative value, between the predicted values ŷ and the expected values y. The robustness of a model increases as the value of the loss function decreases. The loss function is also the core component of the empirical risk function and the structural risk function. In general, the structural risk function of a model consists of an empirical risk term and a regularization term, and is defined as θ* = argmin_θ [ (1/N) Σ_{i=1}^{N} L(y_i, ŷ_i) + λ Φ(θ) ], where the first term is the empirical risk averaged over the N training samples and Φ(θ) is the regularization term weighted by λ.
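
As a concrete example, the following is a minimal sketch of such a structural risk, assuming a mean-squared-error empirical loss and an L2 regularizer with weight lam; these are illustrative choices, not necessarily those used in the paper.

```python
import numpy as np

def structural_risk(y_true, y_pred, theta, lam=1e-3):
    """Empirical risk (mean squared error) plus an L2 regularization term."""
    empirical_risk = np.mean((y_true - y_pred) ** 2)   # average per-sample loss
    regularization = lam * np.sum(theta ** 2)          # penalizes large parameters
    return empirical_risk + regularization
```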

Comparison of LE-LSTM and LSTM

We compared the proposed LE-LSTM algorithm with the LSTM algorithm on eight data sets. The numerical results for training/testing accuracy, training/testing loss, and running time are shown in Tables 2–5.

According to Fig. 11, the proposed LE-LSTM algorithm achieves higher training/testing accuracy, lower training/testing loss, and better overall performance than the original LSTM algorithm on most data sets.

Based on Fig. 12, we show that the proposed

Conclusion

In this paper, we introduced a fast and low-cost RNN-based algorithm built on the combination of LE and LSTM. Numerical experiments show that the proposed algorithm provides high accuracy and short running time on various data sets. In comparison with the LSTM and other state-of-the-art algorithms, we believe that the LE-LSTM algorithm presented here is able to achieve highly accurate prediction at lower cost. With the implementation of the LE algorithm, we are able to process sequential data sets

CRediT authorship contribution statement

F. Hu and Y.H. Zhu conceived of the presented idea. F. Hu and J. Liu developed the theory and performed the computations. Y.H. Zhu and L.H. Li verified the analytical methods. J. Liu encouraged Y.H. Zhu to investigate and supervised the findings of this work. All authors discussed the results and contributed to the final manuscript.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

We acknowledge funding support from the Natural Science Foundation of Hubei Province (2018CFB259).

References (47)

  • Umamaheswari, J., et al., Improving speech recognition performance using spectral subtraction with artificial neural network, Int. J. Adv. Stud. Sci. Res. (2018).

  • Hu, F., et al., A time simulated annealing-back propagation algorithm and its application in disease prediction, Modern Phys. Lett. B (2018).

  • Rahmani, M., et al., A novel adaptive neural network integral sliding-mode control of a biped robot using bat algorithm, J. Vib. Control (2018).

  • Xiao, L., et al., Solving time-varying system of nonlinear equations by finite-time recurrent neural networks with application to motion tracking of robot manipulators, IEEE Trans. Syst. Man Cybern. A (2019).

  • Liang, X., et al., Semantic object parsing with graph LSTM.

  • Weninger, F., et al., Speech enhancement with LSTM recurrent neural networks and its application to noise-robust ASR.

  • Selvin, S., et al., Stock price prediction using LSTM, RNN and CNN-sliding window model.

  • Hopfield, J.J., Artificial neural networks, IEEE Circuits Devices Mag. (1988).

  • Lipton, Z.C., et al., A critical review of recurrent neural networks for sequence learning (2015).

  • Pal, S., et al., Sentiment analysis in the light of LSTM recurrent neural networks, Int. J. Synth. Emot. (2018).

  • Zhang, J., et al., Towards machine translation in semantic vector space, ACM Trans. Asian Low-Resource Lang. Inf. Process. (2015).

  • Mozer, M.C., A focused backpropagation algorithm for temporal pattern recognition, Complex Syst. (1989).

  • Mohammed, A.M., et al., Dynamic neural networks for kinematic redundancy resolution of parallel Stewart platforms, IEEE Trans. Cybern. (2015).