Introduction

Stock, as one form of securities, is a paper sheet as proof of someone’s ownership in a company [1]. Since it can be exchanged easily, it is popularly used as a kind of investment by many people nowadays. In fact, the stock has become one of the most popular financial market instruments [2] and has formed one of the most massive and complex financial markets in the world—the stock market. The market could handle millions of transactions in a short period and is a highly dynamic environment [3]. It also has nonlinear characteristics and could be highly unpredictable [4, 5].

Many have tried to study and predict the stock’s price movement, which is considered as the crucial step in developing trading strategies [6]. A good trading strategy could help traders in avoiding losses and gaining more profit. Different kinds of forecasting methods and approaches have been applied and developed in order to achieve that. Some have used conventional and technical analysis methods, such as moving average, ARIMA, and stochastic optimization method, but the accuracy results are not satisfactory enough [5]. Hence, others move to a more emerging technique and approach, such as machine learning or deep learning methods, for the stock price prediction problem.

One of the most widely used deep learning methods, especially for time series analysis, is the long short-term memory (LSTM) networks. As a better version of recurrent neural networks (RNN), LSTM has been used in many fields. Particularly in stock price prediction, LSTM had been proposed and applied by some notable researchers, such as Murtaza et al. [7], Nelson et al. [8], Faurina et al. [6], and Jin et al. [9]. Murtaza et al. [7] had built a model and predict the stock returns of NIFTY 50 by using LSTM networks. Similarly, Nelson et al. [8] also had applied LSTM networks to predict the future trends of stock prices based on its historical data. They applied their proposed LSTM networks on five different stocks from the Brazilian stock exchange and found that the outcomes are very promising. Faurina et al. [6] had tried a slightly different approach in predicting stock movement on the Indonesian market. They used three different dimensional reduction methods, namely Lasso, ElasticNet, and principal component analysis (PCA), that were combined with the LSTM networks for the stock price prediction. They found that the PCA-LSTM combination outperforms the other two combinations. Lastly, Jin et al. [9] had recently published their findings in the stock closing price prediction. In their study, they incorporated the investors’ emotional tendency with LSTM networks and found that their proposed model could improve the prediction results.

In this study, we also aim to implement the famous deep learning method, i.e., the LSTM networks, for the stock price prediction, specifically for stocks that are included in the LQ45 indices. LQ45 consists of 45 companies’ stocks with high liquidity and huge market share; and therefore, are considered to have good financial status [10]. The Indonesia stock exchange (IDX) has periodically assessed all companies’ performance in the stock market and announced the report of LQ45 indices semi-annually [11]. This report has been used by many stakeholders, including traders and investors, as one of their decision-making tools. Moreover, many studies also had used this report as their criteria in choosing the considered stocks for their evaluation, as can be found on Nurmalitasari et al. [12], Pantagama [11], and Pramanaswari and Yasa [13]. Interested readers are encouraged to read their publications for further information.

Although many studies have used the LSTM networks in predicting the stock price movement, most of them incorporated many hidden layers and neurons in the deep network’s architecture [7, 14,15,16]. Here, we argue that using simpler and not too deep network’s architecture could get similar prediction results with those that used more complex networks. Briefly, the contributions of this study are (1) we propose a simple three layers LSTM network architecture in predicting the stock’s closing price, (2) we use the newly announced LQ45 report from the IDX for the period of February–July 2021, especially those that are included in the financial sector indices, and (3) we find that the proposed model could give good prediction results and serves as a baseline for a good trading strategy.

We will further discuss the research method used in this study, namely the LSTM networks in the following section. Some error criteria also will be described in the section. Moreover, the description of datasets being used in this study, the experimental results, and the analysis will be discussed. Lastly, some concluding remarks and suggestions will be given in the last section of this paper.

Research method

A simple diagram, which shows the research methodology applied in this study, is shown in Fig. 1. We started the process by handling any missing values in the stocks ‘Closing’ price with a simple data imputation method. Then, we divided the cleaned stocks’ data into training and test set with an 80:20 ratio. On both sets, we conducted the feature scaling method to get normalized training and test sets. Next, we reshaped both sets into 3D arrays that can be further processed on the prepared LSTM networks. We built the deep learning model using a three-layer LSTM network consisting of an LSTM layer, a dropout layer to prevent overfitting, and a dense layer as the output layer of the networks. To get the prediction results, we used the built model on the reshaped test set and converted it back to the original scale. Lastly, we calculated the root mean square error (RMSE) and the mean absolute percentage error (MAPE) scores in the performance evaluation process of the built model. A more detailed explanation of each step taken in this study will be given in the following section.

Fig. 1
figure 1

Research methodology

RNN-LSTM

The long short-term memory (LSTM) is an advanced soft computing method that Hochreiter and Schmidhuber first introduced to tackle the limitation found in the conventional recurrent neural networks (RNN) method, especially in solving problems with the long-term dependency issue [17]. Typically, it consists of several LSTM cells that are self-connected and used to store the networks’ temporal state by using three gates, namely input, output, and forget gates [17]. An illustration of an LSTM cell containing those three gates is depicted in Fig. 2 [18].

Fig. 2
figure 2

Three gates mechanism in an LSTM cell

The gate mechanism in an LSTM cell is used to control how much information can be passed throughout the networks. The forget gate can be found in the first part of the cell and is used to control how much of the previous cell’s hidden state could be forgotten. Next, the input gate is used to determine what new information will be stored in the current cell state. Finally, as its name inferred, the output gate is used to find the value we want to be the output of the current cell [19]. Some equations related to this mechanism in an LSTM cell are given below.

$${f}_{t}=\sigma \left({W}_{f}{h}_{t-1}+{U}_{f}{x}_{t}+{b}_{f}\right),$$
(1)
$${i}_{t}=\sigma \left({W}_{i}{h}_{t-1}+{U}_{i}{x}_{t}+{b}_{i}\right),$$
(2)
$${\tilde{C }}_{t}=\mathrm{tanh}\left({W}_{C}{h}_{t-1}+{U}_{C}{x}_{t}+{b}_{C}\right),$$
(3)
$${C}_{t}={f}_{t} \odot {C}_{t-1}+{i}_{t} \odot {\tilde{C }}_{t},$$
(4)
$${o}_{t}=\sigma \left({W}_{o}{h}_{t-1}+{U}_{o}{x}_{t}+{b}_{o}\right),$$
(5)
$${h}_{t}={o}_{t} \odot {\mathrm{tanh}}\left({C}_{t}\right),$$
(6)

where \({f}_{t}\) is the forget gate value at the current cell, \({i}_{t}\) is the input gate value, \({C}_{t}\) is the current cell state, \({\tilde{C }}_{t}\) is the cell candidate value, \({o}_{t}\) is the output gate value, \({W}_{f}, {W}_{i}, {W}_{C},{W}_{o}\) and \({U}_{f}, {U}_{i}, {U}_{C},{U}_{o}\) are weights of the networks, \({b}_{f}, {b}_{i}, {b}_{C}, {b}_{o}\) are bias variable values, \({h}_{t}\) is the current hidden state value, \({h}_{t-1}\) is the prior hidden state value, and \({x}_{t}\) is the new input value at the current cell. There are two activation functions (AFs) being used here, namely the sigmoid activation function (\(\sigma )\) and the tanh activation function. Both of them are the most frequently used nonlinear AFs in the artificial neural networks [20].

Error criteria

There are two prediction error criteria being used as the performance evaluation metrics in this study, i.e., the root mean square error (RMSE) and the mean absolute percentage error (MAPE) criteria. While RMSE will give the degree of error in a unit value, MAPE will give the degree of error in a percentage value. As explained by Shahid et al. [17] and Hansun et al. [21], both of them can be represented as

$$RMSE=\sqrt{\frac{1}{n}\sum_{t=1}^{n}{\left({Y}_{t}-{F}_{t}\right)}^{2},}$$
(7)
$$MAPE=\left(\frac{1}{n}\sum_{t=1}^{n}\left|\frac{{Y}_{t}-{F}_{t}}{{Y}_{t}}\right|\right)\bullet 100\%,$$
(8)

where \(n\) is the total number of data, \({Y}_{t}\) is the actual value, and \({F}_{t}\) is the forecasted value.

Result and analysis

This section is divided into three sub-sections. Firstly, we describe the data source, pre-processing steps, and model development conducted in this study. Secondly, the experimental results for each considered stock in LQ45 indices will be given. Moreover, the evaluation results and analysis will be discussed following the experimental results.

Data source, pre-processing, and model development

In this study, we used some stocks included in the LQ45 indices as semi-annually reported by the Indonesia Stock Exchange (IDX). The last (Major Evaluation) Report is for the period of February to July 2021 and can be downloaded from the IDX website [22]. For simplicity, we will focus on some stocks on the list included in the financial sector, as shown in Table 1.

Table 1 LQ45 financial sector stocks

Next, we collected the recorded daily stock prices for each considered stock from Yahoo! Finance [23]. We chose to download the maximum data available in the data source; hence each stock could have a different number of data records. There are several features in the collected datasets, such as ‘Date,’ ‘Open,’ ‘High,’ ‘Low,’ ‘Close,’ ‘Adjusted Close,’ and ‘Volume,’ but we will only consider the ‘Close’ values of each set. The downloaded data then were pre-processed to handle any missing values using a simple data imputation method by replacing the missing values with their last known records. Moreover, xwe used an 80:20 ratio for training and test sets of each considered stock in the data splitting process. The resulted training and test sets are shown in Table 2.

Table 2 Training and test sets of LQ45 financial sector stocks

On both the training and test sets, we further processed them for data normalization by using the feature scaling method. This step was done after the data splitting process because we normalized both training and test sets based on the training set scale. This is intuitive since the model will be trained on the training set, not on the test set, so the scaling should be based on the training set scale on both training and test sets. Next, we reshaped the data into a 3D-array shape accepted by the LSTM model in Keras, a deep learning package for Python. Keras runs on top of the TensorFlow machine learning platform.

This study proposed a three-layer LSTM network containing an LSTM layer, a dropout layer, and a dense layer. There are 100 neurons being used in the LSTM layer; meanwhile, for the dropout regularisation, we chose to drop 20% of the processed information in the networks to prevent overfitting. For the loss function in the networks, we used the simple mean square error (MSE) with Adam optimizer. Moreover, we trained the model on the training set for 20 training epochs with a batch size of 32 each. The built model then will be used to predict the Closing price on the test set. However, we need to do the data inversion phase to convert the predicted results into the original scaling of the data. Figure 3 shows a snippet code of the proposed LSTM model in this study. The prediction results based on the built model using the proposed LSTM networks are described in the following section.

Fig. 3
figure 3

The proposed three layers LSTM network

Experimental results

We used a machine with an Intel Core i3-8130U CPU @ 2.20 GHz (4 CPUs) processor and 8 GB of RAM in the experimental phase. We also used several libraries in conducting the experiments, such as Pandas, Matplotlib, Keras, and Scikit-learn with Python programming language on the Jupyter Notebook in Anaconda 3 environment.

The prediction results on all considered stocks included in the LQ45 financial sector indices are shown in Fig. 4. Furthermore, the loss function results obtained during the model training are shown in Fig. 5. As can be inferred from Fig. 4, the proposed LSTM model could predict all considered stocks data very well, except for BTPS. Moreover, from Fig. 5, we know that all built models have converged quite well and remained stable to be used in the test phase.

Fig. 4
figure 4figure 4

LQ45 financial sector indices’ prediction results

Fig. 5
figure 5

LQ45 financial sector indices’ loss function plots

Analysis

To get the performance results of the built model on the test set, we evaluated the prediction results by using the root mean square error (RMSE) and the mean absolute percentage error (MAPE) criteria. Table 3 presents the RMSE and MAPE values for each considered stock in this study. As evinced from the results, the RMSE values are ranged from 266.1255 to 2878.4668, which depend not only on the prediction results but also on the magnitude of the stock’s closing prices. Moreover, there are two stocks that have MAPE values under 20%, namely BMRI (18.6135) and BBCA (19.1020). The typical interpretation of MAPE values, as stated by Moreno et al. [24], is highly accurate forecasting (< 10), good forecasting (10–20), reasonable forecasting (20–50), and inaccurate forecasting (> 50). Therefore, the prediction results on BMRI and BBCA stock prices fall in good forecasting results, while all other prediction results have reasonable results based on the MAPE values interpretation.

Table 3 RMSE and MAPE values

We also compare the results of this study to other similar studies that tried to predict future values of stock prices using various kinds of machine learning or deep learning methods. Table 4 depicts the comparison results of this study with other similar studies. As can be inferred from the MAPE values, our proposed simple three-layer LSTM networks could achieve similar results with other more complex algorithms implementation, even the hybrid ones.

Table 4 Comparison with other studies

Conclusion

In this study, we have successfully implemented the well-known Deep Learning methods, i.e., the LSTM networks, for the stock price prediction, specifically for stocks included in the LQ45 financial sector indices. There are six stocks considered in this study, namely BBCA (Bank Central Asia Tbk.), BBNI (Bank Negara Indonesia Tbk.), BBRI (Bank Rakyat Indonesia Tbk.), BBTN (Bank Tabungan Negara Tbk.), BMRI (Bank Mandiri Tbk.), and BTPS (Bank BTPN Syariah Tbk.). Using a simple three-layer LSTM network architecture, we found that the proposed model gave reasonable prediction results for all considered stocks. For BBCA and BMRI, the model could get good prediction results, and therefore, it is more recommended to use the built model for those two stocks in the LQ45 financial sector indices.

There are also some limitations in this study that are worthy of being noted. As previously explained in the text, we only did simple data imputation technique to handle any missing values in the dataset. However, as shown in the prediction plot for BBNI in Fig. 4, almost half of the stocks’ closing prices in the training set are stagnant at the value of 1800 s. It seems that there are some periods when the stock’s price data for BBNI in Yahoo! Finance are not recorded properly. This could affect the prediction results as shown by the high MAPE value, and therefore, some other pre-processing techniques need to be considered if the stock will be used in the future.

Moreover, we only used two well-known forecast error criteria in this study as the performance evaluation metrics, namely the RMSE and MAPE. Other criteria, including the R-squared score or coefficient of determination, could also be applied to get better and comprehensive analysis results [29]. Another possible future work is to compare the results of this study with other technical approaches, such as weighted exponential moving average [30] and double exponential smoothing [31] methods.