Stock investment strategy combining earnings power index and machine learning
Introduction
Predicting stock market movements in order to increase returns on investment is a challenging task for both academics and investors. Information that affects stock prices ranges from macroeconomic factors, such as economic growth rates and exchange rates, to firm-specific data. Stakeholders rely on financial statements to obtain important information about future profits and companies’ intrinsic values. As a result, financial statements can help investors make rational decisions and build better investment portfolios (Kothari, 2001, Richardson et al., 2010).
Accounting and finance research has long supported the view that key financial indicators derived from fundamental analysis have significant predictive power for future earnings and explanatory power with respect to the intrinsic value of a company (Abarbanell and Bushee, 1997, Abarbanell and Bushee, 1998, Lev and Thiagarajan, 1993, Mohanram, 2005, Penman and Zhang, 2002, Penman and Zhang, 2006, Piotroski, 2000, Wahlen and Wieland, 2011). These studies show that financial variables can predict the direction of future earnings, and that an investment strategy that takes advantage of these forecasts can perform well. However, the conventional approach to financial statement analysis has been criticized because it selects variables arbitrarily, without a theoretical basis (Richardson et al., 2010, Shin et al., 2017). To address this drawback, Penman and Zhang (2006), who focus on the sustainability of earnings using the residual income model, propose a summary measure known as the predicted earnings increase score (PEIS). Wahlen and Wieland (2011) successfully apply Penman and Zhang’s index. Other studies report that the momentum effect can improve the accuracy of forecasts of the direction of future profit (Fairfield et al., 1996, Hirst et al., 2007). Following these strands of research, Song et al. (2020) propose an Earnings Power Index (EPI), which adds to the PEIS indicators three candidate factors derived from time-series trends. This index provides a comprehensive list of elements that have been shown to predict future earnings. Here, we exploit the individual factors of the EPI.
Previous studies focused on predicting future profit, using logistic or linear regression models to select significant variables (Abarbanell and Bushee, 1997, Lev and Thiagarajan, 1993, Nissim and Penman, 2001, Ou and Penman, 1989, Penman and Zhang, 2006). However, identifying complex relationships among variables with linear models can be difficult. Moreover, as the amount and complexity of data increase and infrastructure becomes available to collect and manage data efficiently, algorithms designed for big data become necessary (Fayyad et al., 1996). To this end, machine-learning techniques have been used to estimate the unknown function that connects inputs and outputs. In the field of stock price prediction, machine-learning techniques have been applied through two approaches: fundamental analysis and technical analysis (Ballings et al., 2015, Bustos and Pomares-Quimbaya, 2020, Nti et al., 2020). Technical analysis has been used for predictions based on immediate price trends (Bustos and Pomares-Quimbaya, 2020, Nti et al., 2020), whereas in fundamental analysis, data become available only when financial statements are disclosed, making the approach appropriate for mid- to long-term forecasting.
Feature engineering is an important determinant of final model performance in most machine-learning applications. A small set of carefully derived variables often performs better than a large number of raw variables. In this research, rather than using a large number of raw variables from a firm’s financial statements (Tsai et al., 2011, Ballings et al., 2015, Bao et al., 2020), we select EPI-related factors as derived variables for stock price prediction. We therefore propose using machine-learning models to examine the relationship between EPI indicators and excess returns.
Unlike earlier research on fundamental analysis, this study integrates financial statement analysis with machine learning. Machine learning can capture the complicated relationship between inputs and outputs more effectively than linear models, and the integrated approach lets us consider more sophisticated indicators while retaining interpretive power grounded in theory. The indicators come from research on the prediction of future earnings (Nissim and Penman, 2001, Penman and Zhang, 2006, Wahlen and Wieland, 2011, Song et al., 2020). To validate the model for predicting rises in stock returns, we compare the abnormal returns on the shares most likely to increase in price with those on the shares most likely to decrease in price. We follow the hedge portfolio strategy of Holthausen and Larcker (1992) and the three-factor model of Fama and French (1993) to evaluate the efficiency of the model. However, we revise the forecast and return measurement periods to 3 months and 6 months to suit the research purpose of intermediate-term investment. In addition, we use ten machine-learning techniques as predictive models to directly predict the signs of abnormal stock returns. We then calculate the returns on a hedge portfolio that is long in predicted winners and short in predicted losers, and compute four measures for that portfolio: market-adjusted return, Jensen’s alpha, size-adjusted return, and Fama-French three-factor model return.
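As an illustration only, the hedge portfolio test described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure used in this study: the function names `hedge_portfolio_return` and `jensen_alpha`, the 20% cut-off default, the equal weighting of stocks within each leg, and the sample numbers are all hypothetical. Jensen’s alpha is computed here as the intercept of a single-factor ordinary least squares regression of portfolio excess returns on market excess returns.

```python
from statistics import mean


def hedge_portfolio_return(scores, returns, fraction=0.2):
    """Return (long_mean, short_mean, hedge_return) for a long-short portfolio.

    scores   : predicted probability that each stock's return will rise
    returns  : realized return of each stock over the holding period
    fraction : share of stocks placed in each leg (assumed 0.2 here)
    """
    if len(scores) != len(returns):
        raise ValueError("scores and returns must have the same length")
    k = int(len(scores) * fraction)
    if k < 1:
        raise ValueError("not enough stocks for the requested fraction")
    # Rank stocks from the highest to the lowest predicted score.
    ranked = sorted(zip(scores, returns), key=lambda pair: pair[0], reverse=True)
    long_leg = mean(r for _, r in ranked[:k])    # predicted winners (equal-weighted)
    short_leg = mean(r for _, r in ranked[-k:])  # predicted losers (equal-weighted)
    return long_leg, short_leg, long_leg - short_leg


def jensen_alpha(portfolio_excess, market_excess):
    """Return (alpha, beta) from an OLS fit of portfolio on market excess returns."""
    mx, my = mean(market_excess), mean(portfolio_excess)
    var = sum((x - mx) ** 2 for x in market_excess)
    cov = sum((x - mx) * (y - my)
              for x, y in zip(market_excess, portfolio_excess))
    beta = cov / var
    return my - beta * mx, beta


# Hypothetical example: ten stocks with made-up scores and returns.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
returns = [0.10, 0.06, 0.02, 0.01, 0.00, -0.01, 0.00, -0.02, -0.04, -0.08]
long_leg, short_leg, hedge = hedge_portfolio_return(scores, returns)
print(f"long={long_leg:.4f} short={short_leg:.4f} hedge={hedge:.4f}")
```

A market-adjusted version of the same test would subtract the market return over the holding period from each stock’s return before calling `hedge_portfolio_return`; the size-adjusted and three-factor measures would require additional benchmark data that this sketch does not model.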
This study differs from previous efforts in that it uses machine learning to examine the influence of factors validated in the financial literature on mid-term investment. Using machine-learning models, we examine whether fundamental analysis can help investors make rational decisions and demonstrate the usefulness of EPI-related information for predicting future returns. We also empirically test whether machine-learning models targeting intermediate-term investments can generate abnormal returns in practice.
The balance of this paper is organized as follows. Section 2 reviews related research and discusses the direction of the study. Section 3 describes our proposed strategy of investment stock selection, and Section 4 details the experimental results of our strategy. Section 5 presents conclusions and implications for future research.
Prediction of excess returns initiated by factor investment
A factor investment strategy is based on exploring factors that can generate excess profit in a market over the long term. Numerous studies have attempted to identify which factors can explain the value premium, targeting steady profits in the market. The earliest research primarily verifies a rational capital asset pricing model (CAPM) under the efficient market hypothesis, in which a stock price immediately reflects information (Fama, 1970). First, empirical analysis, which connects financial
Proposed approach
This section describes the proposed approach using machine-learning techniques. It consists of three steps from data preparation to stock selection for investing, as summarized in Fig. 1.
Empirical setting
The analysis period is from 2013 to 2019, and we collect quarterly financial data for 1,878 companies listed at the time on the Korea Composite Stock Price Index (KOSPI) and Korea Securities Dealers Automated Quotation (KOSDAQ). There are 690 companies on the KOSPI and 1,188 in the KOSDAQ market. The data cover a wide range of industries (see Table 4) and are processed into 15 variables.
We collect data for 28,399 observations and partition these data into a training sample (of 20,413
Discussion
We confirm that the combination of financial statement analysis and machine learning can be effective. Beyond the experimental results, we also seek to validate the usefulness of the EPI indicators and the profitability of the proposed investment strategy.
Conclusions
We examine the usefulness of machine-learning techniques in intermediate-term investment strategies, combining these techniques with financial indicators that have been shown to predict future profits. We use machine learning to estimate complicated relationships between variables. This study confirms binary classifiers’ performance in predicting stock returns by performing the standard hedge portfolio test using the top and bottom 20% of available observations. As a result, most
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (72)
- et al. Data mining applications in accounting: A review of the literature and organizing framework. Int. J. Account. Inf. Syst. (2017)
- The earnings-price anomaly. Journal of Accounting and Economics (1992)
- et al. Evaluating multiple classifiers for stock price direction prediction. Expert Syst. Appl. (2015)
- The relationship between return and market value of common stocks. Journal of Financial Economics (1981)
- et al. Stock market movement forecast: A systematic review. Expert Syst. Appl. (2020)
- et al. Common risk factors in the returns on stocks and bonds. J. Financ. Econ. (1993)
- et al. A five-factor asset pricing model. J. Financ. Econ. (2015)
- et al. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks (2005)
- Fundamental analysis and subsequent stock returns. Journal of Accounting and Economics (1992)
- et al. The prediction of stock returns using financial statement information. Journal of Accounting and Economics (1992)