Adaptive stock trading strategies with deep reinforcement learning methods
Information Sciences (IF 8.1), Pub Date: 2020-06-13, DOI: 10.1016/j.ins.2020.05.066
Xing Wu, Haolei Chen, Jianjia Wang, Luigi Troiano, Vincenzo Loia, Hamido Fujita

The increasing complexity and dynamic nature of stock markets are key challenges for the financial industry, in which inflexible trading strategies designed by experienced financial practitioners fail to achieve satisfactory performance under all market conditions. To meet this challenge, adaptive stock trading strategies based on deep reinforcement learning are proposed. To handle the time-series nature of stock market data, the Gated Recurrent Unit (GRU) is applied to extract informative financial features that represent the intrinsic characteristics of the stock market for adaptive trading decisions. Furthermore, with a tailored design of the state and action spaces, two reinforcement learning trading strategies are proposed: GDQN (Gated Deep Q-learning trading strategy) and GDPG (Gated Deterministic Policy Gradient trading strategy). To verify the robustness and effectiveness of GDQN and GDPG, they are tested in both trending and volatile stock markets from different countries. Experimental results show that the proposed GDQN and GDPG not only outperform the Turtle trading strategy but also achieve more stable returns than a state-of-the-art direct reinforcement learning method, the DRL trading strategy, in the volatile stock market. When GDQN and GDPG are compared, the results demonstrate that GDPG, with its actor-critic framework, is more stable than GDQN, with its critic-only framework, in the ever-evolving stock market.
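To make the described architecture concrete, the minimal sketch below illustrates how a GRU-based Q-network of the kind GDQN describes could be assembled in PyTorch: a GRU extracts temporal features from a window of market observations, and a fully connected head maps the final hidden state to Q-values over discrete trading actions. This is not the authors' implementation; the layer sizes, the five input features, the 30-day window, and the {sell, hold, buy} action mapping are illustrative assumptions. GDPG would instead pair a similar GRU-based actor with a critic in an actor-critic arrangement.

```python
# Minimal sketch (illustrative, not the paper's code) of a GRU-based Q-network
# for discrete trading actions, assuming a PyTorch environment.
import torch
import torch.nn as nn

class GatedQNetwork(nn.Module):
    def __init__(self, n_features: int = 5, hidden_size: int = 64, n_actions: int = 3):
        super().__init__()
        # GRU extracts temporal features from a window of market observations
        self.gru = nn.GRU(input_size=n_features, hidden_size=hidden_size, batch_first=True)
        # Fully connected head maps the last hidden state to Q-values per action
        self.head = nn.Sequential(
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, n_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window_length, n_features), e.g. windows of daily OHLCV data
        _, h_n = self.gru(x)              # h_n: (1, batch, hidden_size)
        return self.head(h_n.squeeze(0))  # Q-values: (batch, n_actions)

# Greedy action selection over a batch of state windows (assumed action mapping:
# 0 = sell, 1 = hold, 2 = buy)
q_net = GatedQNetwork()
states = torch.randn(8, 30, 5)           # 8 windows of 30 days x 5 features (illustrative)
actions = q_net(states).argmax(dim=1)
```

In a full DQN-style training loop, such a network would be trained from replayed (state, action, reward, next state) transitions against a target network; the sketch only shows the feature-extraction and Q-value components highlighted in the abstract.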




Updated: 2020-06-13