An optimized model using LSTM network for demand forecasting

doi:10.1016/j.cie.2020.106435

Computers & Industrial Engineering

Volume 143, May 2020, 106435

https://doi.org/10.1016/j.cie.2020.106435

Highlights

• A demand forecasting method based on multi-layer LSTM networks is proposed.
• The proposed method improves the forecasting accuracy.
• It has a strong ability to capture nonlinear patterns in time series data.
• The empirical results show that the method outperforms other standard techniques.

Abstract

In a business environment with intense competition among firms, accurate demand forecasting is not straightforward. In this paper, we propose a demand forecasting method based on multi-layer LSTM networks that is capable of predicting highly fluctuating demand data. For a given time series, the proposed method automatically selects the best forecasting model by using grid search to evaluate different combinations of LSTM hyperparameters. It can capture nonlinear patterns in time series data while accounting for the inherent characteristics of non-stationary time series. Using demand data from a furniture company, the proposed method is compared with well-known time series forecasting techniques from both the statistical and computational intelligence families. These methods include autoregressive integrated moving average (ARIMA), exponential smoothing (ETS), artificial neural network (ANN), K-nearest neighbors (KNN), recurrent neural network (RNN), support vector machines (SVM) and single-layer LSTM. The experimental results indicate that the proposed method outperforms the other tested methods on the performance measures considered.

Introduction

Demand forecasting is the basis of all planning activities (Haberleitner, Meyr, & Taudes, 2010). Specifically, demand prediction as a predictive analytics task is considered an essential tool to gain an understanding of future demand. Accurate demand forecasting guarantees suitable supply chain management, and enhances customer satisfaction by preventing inventory stock-out (Kumar, Shankar, & Alijohani, 2019).

Intense competition among firms in every domain has made it difficult for businesses to forecast customer demand accurately with traditional demand forecasting methods (Guo et al., 2017, Kumar et al., 2019). Therefore, companies are increasingly moving toward advanced data science techniques to forecast customer demand. In general, customer demand is modeled as a sequence of demand values observed over time. Hence, the demand forecasting problem can be formulated as a time series forecasting problem (Villegas, Pedregal, & Trapero, 2018). Time series prediction has been applied in various areas, such as credit scoring (Lin, Hu, & Tsai, 2011), electricity load forecasting (Johannesen et al., 2019, Raza et al., 2017), forecasting call center arriving calls (Taylor, 2008), tourism demand forecasting (Law, Li, Fong, & Han, 2019), ATM cash demand forecasting (Martínez, Frías, Pérez, & Rivera, 2017), forecasting of petroleum production (Sagheer & Kotb, 2019), weather forecasting (Maqsood, Khan, & Abraham, 2004), etc.
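As a minimal sketch of how a demand series can be cast as a time series forecasting problem, the following Python snippet converts a univariate series into (lag window, next value) pairs using a sliding window. The function name `make_supervised`, the parameter `n_lags`, and the demand figures are hypothetical and chosen only for illustration; the source does not specify how the input windows were built.

```python
from typing import List, Sequence, Tuple


def make_supervised(
    series: Sequence[float], n_lags: int
) -> Tuple[List[List[float]], List[float]]:
    """Turn a univariate series into (lag window, next value) pairs."""
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        inputs.append(list(series[t - n_lags:t]))  # the n_lags most recent observations
        targets.append(series[t])                  # the value to be predicted
    return inputs, targets


# Hypothetical monthly demand figures, for illustration only.
demand = [120.0, 135.0, 128.0, 150.0, 162.0, 158.0]
X, y = make_supervised(demand, n_lags=3)
```

Each row of `X` holds the three most recent observations, and the matching entry of `y` is the value that follows them, which is the form a supervised forecasting model would typically be trained on.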

Generally, time series forecasting techniques fall into two main categories: statistical methods and computational intelligence methods (Khashei & Bijari, 2011). Widely used statistical time series forecasting methods such as ARIMA assume that the time series contains only linear components. However, most real-world time series also contain nonlinear components. To forecast time series with nonlinear patterns, several nonlinear statistical methods have been developed, for example the autoregressive conditional heteroscedastic (ARCH) model and the generalized autoregressive conditional heteroscedastic (GARCH) model (Khashei & Bijari, 2011). However, there are many variations of these models (Enders, 2008), each suitable for modeling only a specific nonlinearity. This makes the procedure of finding a proper model for a given time series more complex.

Recently, computational intelligence techniques including artificial neural networks (ANN), support vector machine (SVM), K-nearest neighbors (KNN), and adaptive neuro-fuzzy inference system (ANFIS) have been frequently used for the problem of time series prediction.

ANNs have several advantageous characteristics, including universal approximation, being data driven, and the ability to better capture nonlinear patterns in data (Khashei and Bijari, 2011, Panigrahi and Behera, 2017). A specific class of ANNs is recurrent neural networks (RNNs). Unlike in feedforward ANNs, the connections between nodes in an RNN form a cycle, which allows signals to move in different directions (Parmezan, Souza, & Batista, 2019). RNNs provide a short-term memory by storing the activations from each time step. This makes them a suitable technique for processing sequence data (Parmezan et al., 2019). The weakness of RNNs is the vanishing and exploding gradient problem, which makes them hard to train (Bengio et al., 1994, Parmezan et al., 2019). The prevalent solution to this weakness is to use gated architectures such as LSTM (Hochreiter & Schmidhuber, 1997), which can exploit longer-range timing information (Wu et al., 2018, Xin et al., 2018).
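The vanishing and exploding gradient problem can be illustrated with a toy example. The sketch below is not taken from the paper: it assumes a scalar recurrent unit h_t = tanh(w * h_{t-1}) with no external input, and multiplies the per-step derivatives to obtain the gradient of the final state with respect to the initial state. The function name and the chosen weights are hypothetical.

```python
import math


def gradient_through_time(w: float, n_steps: int, h0: float = 0.5) -> float:
    """Return d h_T / d h_0 for the scalar recurrence h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(n_steps):
        h = math.tanh(w * h)
        grad *= w * (1.0 - h * h)  # chain rule: derivative of tanh(w * h_prev) w.r.t. h_prev
    return grad


# With |w| < 1 every factor is below 1 in magnitude, so the gradient shrinks geometrically.
vanishing = gradient_through_time(w=0.5, n_steps=50)

# At h0 = 0 the tanh stays in its linear region, so the gradient grows as w ** n_steps.
exploding = gradient_through_time(w=2.0, n_steps=20, h0=0.0)
```

In this toy setting the gradient is a product of one factor per time step, so it either decays toward zero or grows without bound as the sequence gets longer. Gated architectures such as LSTM are designed to mitigate the decay.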

Although different types of ANNs can capture nonlinear patterns in time series data, research has indicated that ANNs with shallow architectures are unable to accurately model time series with a high degree of nonlinearity, longer range, and heterogeneous characteristics (Sagheer and Kotb, 2019, Taieb et al., 2012). In addition, it has been demonstrated that deep neural network architectures generalize better than shallow architectures (Hermans and Schrauwen, 2013, Utgoff and Stracuzzi, 2002).

In this study, we propose a method based on a multi-layer LSTM network that uses grid search to find the optimal hyperparameters of the network. The capability to capture nonlinear patterns in time series data is one of the main advantages of our method. We apply the proposed method to real-world demand data of a furniture company and compare it with other state-of-the-art time series forecasting techniques. The comparison demonstrates that the model built with the proposed method performs substantially better than the alternatives.
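The grid search idea can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not the authors' implementation: in the paper's setting the scoring function would train a multi-layer LSTM with the given hyperparameters and return its validation error, whereas here a simple weighted moving-average forecaster stands in for the network so that the example runs with the standard library alone. The hyperparameter names `n_lags` and `decay`, the grid values, and the demand series are all hypothetical.

```python
import itertools
import math
from typing import Callable, Dict, List, Sequence, Tuple


def grid_search(
    grid: Dict[str, Sequence], score: Callable[[Dict], float]
) -> Tuple[Dict, float]:
    """Evaluate every combination in `grid` and return the one with the lowest score."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        current = score(params)
        if current < best_score:
            best_params, best_score = params, current
    return best_params, best_score


def validation_rmse(params: Dict, series: List[float], n_valid: int) -> float:
    """Stand-in for 'train a model and score it': RMSE of one-step-ahead forecasts
    made by a weighted moving average over the last `n_valid` points."""
    n_lags, decay = params["n_lags"], params["decay"]
    weights = [decay ** k for k in range(n_lags)]  # weight 1 for the most recent value
    squared_errors = []
    for t in range(len(series) - n_valid, len(series)):
        window = series[t - n_lags:t][::-1]  # most recent observation first
        forecast = sum(w * v for w, v in zip(weights, window)) / sum(weights)
        squared_errors.append((series[t] - forecast) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical demand series and search grid, for illustration only.
demand = [112.0, 118.0, 132.0, 129.0, 121.0, 135.0,
          148.0, 148.0, 136.0, 119.0, 104.0, 118.0]
grid = {"n_lags": [1, 2, 3], "decay": [0.3, 0.6, 0.9]}
best_params, best_rmse = grid_search(
    grid, lambda p: validation_rmse(p, demand, n_valid=4)
)
print(best_params, round(best_rmse, 2))
```

Because `grid_search` only needs a function that maps a hyperparameter dictionary to a validation error, the stand-in scorer could be swapped for one that builds and trains an LSTM. The cost grows with the product of the grid sizes, which is why exhaustive grids are usually kept small.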

The rest of this paper is organized as follows. Section 2 gives a comprehensive literature review on time series forecasting along with a brief description of the methods used throughout this paper. Section 3 describes the proposed method. Section 4 presents a case study using the proposed method and the methods used for comparison, and then analyzes and compares their results. Section 5 draws conclusions and suggests future work.

Section snippets

Related work

In this section, we first present a comprehensive literature review on time series forecasting and identify the techniques used and the context of each study. Afterward, the forecasting techniques used throughout this paper are described.

Time series forecasting has been applied in many areas. Table 1 summarizes prominent research on time series forecasting published during the past decade. The table highlights contribution, modeling

The proposed methodology

The aim of this study is to obtain an accurate model for demand forecasting of a furniture company. We exploit recent deep learning methods to specify the best time series forecasting model for solving the demand forecasting problem. The steps of the proposed methodology for sales time series forecasting are illustrated in Fig. 3. Trying to overcome the challenges of obtaining an accurate forecasting model and considering the intrinsic characteristics of the

Experimental setup and results

To evaluate the performance of the proposed approach, two statistical time series forecasting methods (ETS (Hyndman et al., 2008) and ARIMA (Box et al., 2015)) and five computational intelligence methods (ANN, SVM, KNN, simple RNN, and the single-layer LSTM) were considered. The ETS and ARIMA models were executed and optimized using the Statsmodels¹ package in Python. The SVM and KNN were implemented and optimized using the scikit-learn package²

Conclusion

In this study, we propose a multi-layer LSTM network for demand forecasting. The proposed method can configure an LSTM network that effectively models the patterns of a time series. We compare the proposed method with some well-known time series forecasting techniques from both the statistical and computational intelligence categories. To determine whether the performance of the proposed approach is significantly better than the performances of the other methods,

CRediT authorship contribution statement

Hossein Abbasimehr: Conceptualization, Methodology, Writing - original draft, Writing - review & editing. Mostafa Shabani: Software, Writing - review & editing. Mohsen Yousefi: Resources, Writing - review & editing.

Acknowledgement

The authors would like to thank Lukas Hedegaard Jensen for providing language help during the writing of this paper.

References (64)

R.R. Andrawis et al. Forecast combinations of computational intelligence and linear models for the NN5 time series forecasting competition. International Journal of Forecasting (2011)

G.S. Atsalakis. Using computational intelligence to forecast carbon prices. Applied Soft Computing (2016)

C.N. Babu et al. A moving-average filter based hybrid ARIMA–ANN model for forecasting time series data. Applied Soft Computing (2014)

Ü.Ç. Büyükşahin et al. Improving forecasting accuracy of time series data using a new ARIMA-ANN hybrid method and empirical mode decomposition. Neurocomputing (2019)

Y. Chen et al. A hybrid application algorithm based on the support vector machine and artificial intelligence: An example of electric load forecasting. Applied Mathematical Modelling (2015)

J.F. de Oliveira et al. A hybrid evolutionary decomposition system for time series forecasting. Neurocomputing (2016)

R.F. Engle et al. Forecasting and testing in co-integrated systems. Journal of Econometrics (1987)

M.A. Ghorbani et al. A probe into the chaotic nature of daily streamflow time series by correlation dimension and largest Lyapunov methods. Applied Mathematical Modelling (2010)

F. Guo et al. A double-level combination approach for demand forecasting of repairable airplane spare parts based on turnover data. Computers & Industrial Engineering (2017)

H. Haberleitner et al. Implementation of a demand planning system using advance order information. International Journal of Production Economics (2010)

N.J. Johannesen et al. Relative evaluation of regression tools for urban area electrical energy demand forecasting. Journal of Cleaner Production (2019)

I. Khandelwal et al. Time series forecasting using hybrid ARIMA and ANN models based on DWT decomposition. Procedia Computer Science (2015)

M. Khashei et al. A novel hybridization of artificial neural networks and ARIMA models for time series forecasting. Applied Soft Computing (2011)

R. Khatibi et al. Investigating chaos in river stage and discharge time series. Journal of Hydrology (2012)

R. Law et al. Tourism demand forecasting: A deep learning approach. Annals of Tourism Research (2019)

C. Lemke et al. Meta-learning for time series forecasting and forecast combination. Neurocomputing (2010)

H. Liu et al. Comparison of two new ARIMA-ANN and ARIMA-Kalman hybrid methods for wind speed prediction. Applied Energy (2012)

F. Martínez et al. Dealing with seasonality by narrowing the training set in time series forecasting with kNN. Expert Systems with Applications (2018)

P.W. Murray et al. Forecast of individual customer’s demand from a large and noisy dataset. Computers & Industrial Engineering (2018)

S. Panigrahi et al. A hybrid ETS–ANN model for time series forecasting. Engineering Applications of Artificial Intelligence (2017)

A.R.S. Parmezan et al. Evaluation of statistical and machine learning models for time series prediction: Identifying the state-of-the-art and the best conditions for the use of each model. Information Sciences (2019)

P. Ramos et al. Performance of state space and ARIMA models for consumer retail sales forecasting. Robotics and Computer-Integrated Manufacturing (2015)

M.Q. Raza et al. Demand forecast of PV integrated bioclimatic buildings using ensemble framework. Applied Energy (2017)

M.T. Rosenstein et al. A practical method for calculating largest Lyapunov exponents from small data sets. Physica D: Nonlinear Phenomena (1993)

A. Sagheer et al. Time series forecasting of petroleum production using deep LSTM recurrent networks. Neurocomputing (2019)

M.A. Villegas et al. A support vector machine for model selection in demand forecasting applications. Computers & Industrial Engineering (2018)

Y. Wu et al. Remaining useful life estimation of engineered systems using vanilla LSTM neural networks. Neurocomputing (2018)

Y. Yang et al. Nonlinear response prediction of cracked rotor based on EMD. Journal of the Franklin Institute (2015)

Y. Yang et al. Modelling a combined method based on ANFIS and neural network improved by DE algorithm: A case study for short-term electricity demand forecasting. Applied Soft Computing (2016)

Adhikari, R., & Agrawal, R. K. (2013). An introductory study on time series modeling and forecasting. arXiv preprint...

G.S. Atsalakis et al. Stock trend forecasting in turbulent market periods using neuro-fuzzy systems. Operational Research (2016)

Y. Bengio et al. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks (1994)

Cited by (270)

