An optimized model using LSTM network for demand forecasting

https://doi.org/10.1016/j.cie.2020.106435

Highlights

  • A demand forecasting method based on multi-layer LSTM networks is proposed.

  • The proposed method improves the forecasting accuracy.

  • It has strong ability to capture nonlinear patterns in time series data.

  • The empirical results show that the method outperforms other standard techniques.

Abstract

In a business environment with strict competition among firms, accurate demand forecasting is not straightforward. In this paper, we propose a demand forecasting method based on multi-layer LSTM networks that is well suited to predicting highly fluctuating demand data. The proposed method automatically selects the best forecasting model for a given time series by using grid search to evaluate different combinations of LSTM hyperparameters. It can capture nonlinear patterns in time series data while accounting for the inherent characteristics of non-stationary time series. The proposed method is compared with well-known time series forecasting techniques from both the statistical and the computational intelligence categories, using demand data from a furniture company. These techniques are autoregressive integrated moving average (ARIMA), exponential smoothing (ETS), artificial neural network (ANN), K-nearest neighbors (KNN), recurrent neural network (RNN), support vector machines (SVM), and single-layer LSTM. The experimental results indicate that the proposed method outperforms the other tested methods on the performance measures considered.

Introduction

Demand forecasting is the basis of all planning activities (Haberleitner, Meyr, & Taudes, 2010). Specifically, demand prediction as a predictive analytics task is considered an essential tool to gain an understanding of future demand. Accurate demand forecasting guarantees suitable supply chain management, and enhances customer satisfaction by preventing inventory stock-out (Kumar, Shankar, & Alijohani, 2019).

Strict competition among firms in any domain has made it difficult for businesses to accurately forecast customers’ demands using traditional demand forecasting methods (Guo et al., 2017, Kumar et al., 2019). Therefore, companies are increasingly moving toward advanced data science techniques to forecast customer demand. In general, customer demand is modeled as a sequence of demand values over time. Hence, the demand forecasting problem can be formulated as a time series forecasting problem (Villegas, Pedregal, & Trapero, 2018). Time series prediction has been applied in various areas, such as credit scoring (Lin, Hu, & Tsai, 2011), electricity load forecasting (Johannesen et al., 2019, Raza et al., 2017), forecasting call center arriving calls (Taylor, 2008), tourism demand forecasting (Law, Li, Fong, & Han, 2019), ATM cash demand forecasting (Martínez, Frías, Pérez, & Rivera, 2017), forecasting of petroleum production (Sagheer & Kotb, 2019), and weather forecasting (Maqsood, Khan, & Abraham, 2004).
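To make the time series formulation concrete, the following minimal sketch turns a univariate demand series into input/target pairs with a sliding window of lagged values. It is an illustration under stated assumptions, not the authors' preprocessing: the function name `make_supervised`, the parameter `n_lags`, and the demand figures are all hypothetical.

```python
from typing import List, Sequence, Tuple


def make_supervised(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a univariate series into (lag window, next value) pairs."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))  # the n_lags values before time t
        targets.append(series[t])                  # the value to predict at time t
    return inputs, targets


# Hypothetical monthly demand figures, for illustration only.
demand = [120.0, 135.0, 128.0, 150.0, 160.0, 155.0]
X, y = make_supervised(demand, n_lags=3)
```

Each row of `X` holds the three most recent observations and the matching entry of `y` holds the value that follows them, so any regression-style forecaster can be trained on the pairs.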

Generally, time series forecasting techniques fall into two main categories: statistical methods and computational intelligence methods (Khashei & Bijari, 2011). Widely used statistical methods such as ARIMA assume that the time series contains only linear components. However, most real-world time series contain nonlinear components as well. To forecast time series with nonlinear patterns, several nonlinear statistical methods have been developed, for example the autoregressive conditional heteroscedastic (ARCH) model and the generalized autoregressive conditional heteroscedastic (GARCH) model (Khashei & Bijari, 2011). However, there are many variations of these models (Enders, 2008), each suited to modeling only a specific kind of nonlinearity. This makes finding a proper model for a given time series more complex.

Recently, computational intelligence techniques including artificial neural networks (ANN), support vector machine (SVM), K-nearest neighbors (KNN), and adaptive neuro-fuzzy inference system (ANFIS) have been frequently used for the problem of time series prediction.

ANNs have several advantageous characteristics, including universal approximation, a data-driven nature, and the ability to capture nonlinear patterns in data (Khashei and Bijari, 2011, Panigrahi and Behera, 2017). Recurrent neural networks (RNNs) are a specific class of ANNs. Unlike in feedforward ANNs, the connections between nodes in an RNN form a cycle, which allows signals to move in different directions (Parmezan, Souza, & Batista, 2019). RNNs provide a short-term memory by storing the activations from each time step, which makes them suitable for processing sequence data (Parmezan et al., 2019). Their weakness is the vanishing and exploding gradient problem, which makes them hard to train (Bengio et al., 1994, Parmezan et al., 2019). The prevalent solution is to use gated architectures such as LSTM (Hochreiter & Schmidhuber, 1997), which can exploit longer-range timing information (Wu et al., 2018, Xin et al., 2018).
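For reference, the difference between a simple RNN and an LSTM can be written out. The notation below is an assumption introduced here, not taken from the paper: $x_t$ is the input at time $t$, $h_t$ the hidden state, $c_t$ the cell state, $\sigma$ the logistic sigmoid, and $\odot$ the element-wise product. A simple RNN updates its state as $h_t = \tanh(W x_t + U h_{t-1} + b)$. The commonly used LSTM formulation with a forget gate (a later addition to the 1997 architecture) replaces this with:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Because $c_t$ is updated additively, gated by $f_t$ and $i_t$, gradients can flow across many time steps without being repeatedly squashed, which is the usual explanation for why LSTMs handle longer-range dependencies better than simple RNNs.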

Although different types of ANNs can capture nonlinear patterns in time series data, research has indicated that ANNs with shallow architectures are unable to accurately model time series with a high degree of nonlinearity, longer range, and heterogeneous characteristics (Sagheer and Kotb, 2019, Taieb et al., 2012). In addition, deep neural network architectures have been shown to generalize better than shallow architectures (Hermans and Schrauwen, 2013, Utgoff and Stracuzzi, 2002).

In this study, we propose a method based on a multi-layer LSTM network that uses grid search to find the best hyperparameters of the network. The capability to capture nonlinear patterns in time series data is one of the main advantages of our method. We apply the proposed method to real-world demand data from a furniture company and compare it with other state-of-the-art time series forecasting techniques. The comparison demonstrates that the model built with the proposed method performs substantially better than the alternatives.
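The hyperparameter search described above can be sketched as follows. This is a minimal, library-free illustration rather than the authors' implementation: `grid_search`, `validation_rmse`, the grid values, and the demand figures are hypothetical, and a weighted moving-average forecaster stands in for the LSTM so that the example runs without a deep learning framework. In practice, the `score` callable would train an LSTM with the given hyperparameters (for example the number of layers, units, and lags) and return its validation error.

```python
import itertools
from typing import Callable, Dict, Sequence, Tuple


def grid_search(
    grid: Dict[str, Sequence],
    score: Callable[[Dict[str, object]], float],
) -> Tuple[Dict[str, object], float]:
    """Evaluate every combination in `grid`; return the one with the lowest score."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current < best_score:  # ties keep the first combination found
            best_params, best_score = params, current
    if best_params is None:
        raise ValueError("grid must contain at least one combination")
    return best_params, best_score


def validation_rmse(series: Sequence[float], n_val: int, window: int, weighting: str) -> float:
    """Stand-in for model training: one-step forecasts from a weighted mean
    of the last `window` values, scored by RMSE on the final `n_val` points."""
    squared_errors = []
    for t in range(len(series) - n_val, len(series)):
        recent = series[t - window:t]
        if weighting == "uniform":
            weights = [1.0] * window
        else:  # "linear": more weight on more recent values
            weights = [float(i + 1) for i in range(window)]
        forecast = sum(w * v for w, v in zip(weights, recent)) / sum(weights)
        squared_errors.append((series[t] - forecast) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Hypothetical demand series and hyperparameter grid, for illustration only.
demand = [100, 104, 109, 113, 118, 124, 129, 133, 138, 144, 149, 153]
grid = {"window": [1, 2, 3], "weighting": ["uniform", "linear"]}

best_params, best_score = grid_search(
    grid, lambda p: validation_rmse(demand, n_val=4, **p)
)
```

Exhaustive search is simple and reproducible, but its cost grows multiplicatively with each added hyperparameter, so the grid is usually kept small when every evaluation involves training a network.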

The rest of this paper is organized as follows. Section 2 gives a comprehensive literature review on time series forecasting along with a brief description of the methods used throughout this paper. Section 3 describes the proposed method. Section 4 presents a case study using the proposed method and the methods used for comparison, and analyzes and compares their results. Section 5 concludes the paper and suggests future work.

Related work

In this section, we first present a comprehensive literature review on time series forecasting and identify the techniques used and the context of each study. Afterward, we describe the forecasting techniques used throughout this paper.

Time series forecasting has been applied in many areas. Table 1 summarizes prevalent research on time series forecasting that has been published during the past decade. The table highlights contribution, modeling

The proposed methodology

The aim of this study is to obtain an accurate model for demand forecasting of a furniture company. We exploit recent deep learning methods to specify the best time series forecasting model for solving the demand forecasting problem. Fig. 3 illustrates the steps of the proposed methodology for sales time series forecasting. Trying to overcome the challenges of obtaining an accurate forecasting model and considering the intrinsic characteristics of the

Experimental setup and results

To evaluate the performance of the proposed approach, two statistical time series forecasting methods (ETS (Hyndman et al., 2008) and ARIMA (Box et al., 2015)) and five computational intelligence methods (ANN, SVM, KNN, simple RNN, and the single-layer LSTM) were considered. The ETS and ARIMA models were executed and optimized using the Statsmodels package in Python. The SVM and KNN were implemented and optimized using the scikit-learn package
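The performance measures are not named in this excerpt. As an assumption, the sketch below shows two measures that are commonly used to compare forecasting methods: root mean squared error (RMSE) and symmetric mean absolute percentage error (SMAPE). The function names are hypothetical, and SMAPE has several variants; the one here returns a fraction between 0 and 2.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], forecast: Sequence[float]) -> None:
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")


def rmse(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Root mean squared error, in the units of the series."""
    _check(actual, forecast)
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def smape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Symmetric MAPE as a fraction in [0, 2] (one of several variants)."""
    _check(actual, forecast)
    total = 0.0
    for a, f in zip(actual, forecast):
        denominator = (abs(a) + abs(f)) / 2.0
        if denominator > 0:  # a pair of zeros contributes zero error
            total += abs(a - f) / denominator
    return total / len(actual)
```

RMSE penalizes large errors more heavily, while SMAPE is scale-free, which makes it easier to compare accuracy across series with different demand levels.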

Conclusion

In this study, we propose a multi-layer LSTM network for demand forecasting. The proposed method can configure an LSTM network that effectively models the patterns of a time series. We compare the proposed method with some well-known time series forecasting techniques from both the statistical and the computational intelligence categories. To determine whether the performance of the proposed approach is significantly better than the performances of the other methods,

CRediT authorship contribution statement

Hossein Abbasimehr: Conceptualization, Methodology, Writing - original draft, Writing - review & editing. Mostafa Shabani: Software, Writing - review & editing. Mohsen Yousefi: Resources, Writing - review & editing.

Acknowledgement

The authors would like to thank Lukas Hedegaard Jensen for providing language help during the writing of this paper.

References (64)

  • N.J. Johannesen et al. Relative evaluation of regression tools for urban area electrical energy demand forecasting. Journal of Cleaner Production (2019)

  • I. Khandelwal et al. Time series forecasting using hybrid ARIMA and ANN models based on DWT decomposition. Procedia Computer Science (2015)

  • M. Khashei et al. A novel hybridization of artificial neural networks and ARIMA models for time series forecasting. Applied Soft Computing (2011)

  • R. Khatibi et al. Investigating chaos in river stage and discharge time series. Journal of Hydrology (2012)

  • R. Law et al. Tourism demand forecasting: A deep learning approach. Annals of Tourism Research (2019)

  • C. Lemke et al. Meta-learning for time series forecasting and forecast combination. Neurocomputing (2010)

  • H. Liu et al. Comparison of two new ARIMA-ANN and ARIMA-Kalman hybrid methods for wind speed prediction. Applied Energy (2012)

  • F. Martínez et al. Dealing with seasonality by narrowing the training set in time series forecasting with kNN. Expert Systems with Applications (2018)

  • P.W. Murray et al. Forecast of individual customer’s demand from a large and noisy dataset. Computers & Industrial Engineering (2018)

  • S. Panigrahi et al. A hybrid ETS–ANN model for time series forecasting. Engineering Applications of Artificial Intelligence (2017)

  • A.R.S. Parmezan et al. Evaluation of statistical and machine learning models for time series prediction: Identifying the state-of-the-art and the best conditions for the use of each model. Information Sciences (2019)

  • P. Ramos et al. Performance of state space and ARIMA models for consumer retail sales forecasting. Robotics and Computer-Integrated Manufacturing (2015)

  • M.Q. Raza et al. Demand forecast of PV integrated bioclimatic buildings using ensemble framework. Applied Energy (2017)

  • M.T. Rosenstein et al. A practical method for calculating largest Lyapunov exponents from small data sets. Physica D: Nonlinear Phenomena (1993)

  • A. Sagheer et al. Time series forecasting of petroleum production using deep LSTM recurrent networks. Neurocomputing (2019)

  • M.A. Villegas et al. A support vector machine for model selection in demand forecasting applications. Computers & Industrial Engineering (2018)

  • Y. Wu et al. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing (2018)

  • Y. Yang et al. Nonlinear response prediction of cracked rotor based on EMD. Journal of the Franklin Institute (2015)

  • Y. Yang et al. Modelling a combined method based on ANFIS and neural network improved by DE algorithm: A case study for short-term electricity demand forecasting. Applied Soft Computing (2016)

  • Adhikari, R., & Agrawal, R. K. (2013). An introductory study on time series modeling and forecasting. arXiv preprint...

  • G.S. Atsalakis et al. Stock trend forecasting in turbulent market periods using neuro-fuzzy systems. Operational Research (2016)

  • Y. Bengio et al. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks (1994)