Deep reinforcement learning for long‐term pavement maintenance planning, Computer-Aided Civil and Infrastructure Engineering


Deep reinforcement learning for long‐term pavement maintenance planning
Computer-Aided Civil and Infrastructure Engineering (IF 8.5), Pub Date: 2020-05-20, DOI: 10.1111/mice.12558
Linyi Yao, Qiao Dong, Jiwang Jiang, Fujian Ni


Inappropriate maintenance and rehabilitation strategies cause problems such as wasted maintenance budgets and ineffective treatment of pavement distress. This research developed a method based on deep reinforcement learning (DRL), a machine learning approach, that learns through trial and error maintenance strategies that maximize long-term cost-effectiveness in maintenance decision-making. In this method, each single-lane pavement segment can receive a different treatment, and the long-term maintenance cost-effectiveness of the entire section is the optimization goal. In the DRL algorithm, each state is represented by 42 parameters covering pavement structures and materials, traffic loads, maintenance records, pavement conditions, and so forth. The actions are the specific treatments plus do-nothing. The reward is defined as the increase or decrease in cost-effectiveness after the corresponding action is taken. Two expressways, the Ningchang and Zhenli expressways, were selected for a case study. The results show that the DRL model is capable of learning a better strategy that improves long-term maintenance cost-effectiveness. With the optimized maintenance strategies produced by the model, pavement conditions can be kept within an acceptable range.
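The state, action, and reward framing described in the abstract can be illustrated with a small sketch. This is not the authors' model: only the 42-parameter state size, the presence of a do-nothing action, and the idea of rewarding the change in cost-effectiveness come from the abstract. The treatment labels, the deterioration rate, the gains and costs, and the cost-effectiveness formula (accumulated condition divided by accumulated cost) are all hypothetical placeholders, and no learning algorithm is included.

```python
import random

N_FEATURES = 42  # state size stated in the abstract
ACTIONS = ["do_nothing", "treatment_A", "treatment_B"]  # hypothetical labels
# Hypothetical (condition gain, cost) per action; these values are not from the paper.
EFFECT = {"do_nothing": (0.0, 0.0), "treatment_A": (5.0, 1.0), "treatment_B": (15.0, 4.0)}


class ToySegmentEnv:
    """Toy one-segment environment with invented dynamics, for illustration only."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Feature 0 stands in for a condition index (0-100); the other 41 are placeholders.
        self.state = [90.0] + [self.rng.random() for _ in range(N_FEATURES - 1)]
        self.benefit = 0.0  # accumulated condition score (assumed benefit measure)
        self.cost = 0.0     # accumulated treatment cost
        return list(self.state)

    def cost_effectiveness(self):
        # Assumed definition: accumulated benefit per unit of accumulated cost.
        return self.benefit / self.cost if self.cost > 0 else 0.0

    def step(self, action):
        gain, cost = EFFECT[ACTIONS[action]]
        before = self.cost_effectiveness()
        # The condition drops by an assumed 3 points per year, then the treatment adds its gain.
        self.state[0] = min(100.0, max(0.0, self.state[0] - 3.0 + gain))
        self.benefit += self.state[0]
        self.cost += cost
        # Reward: increase or decrease in cost-effectiveness caused by the action.
        reward = self.cost_effectiveness() - before
        return list(self.state), reward


def rollout(env, policy, years=10):
    """Apply a policy (state -> action index) for a number of years; return the summed reward."""
    state = env.reset()
    total = 0.0
    for _ in range(years):
        state, reward = env.step(policy(state))
        total += reward
    return total
```

Because each reward is a difference in cost-effectiveness, the rewards summed over an episode equal the final cost-effectiveness in this toy setup. This gives a simple check when comparing hand-written policies, for example `rollout(ToySegmentEnv(), lambda s: 1)` against a policy that always picks do-nothing.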


Updated: 2020-05-20
