A controlled investigation of behaviorally-cloned deep neural network behaviors in an autonomous steering task
Robotics and Autonomous Systems (IF 4.3), Pub Date: 2021-04-19, DOI: 10.1016/j.robot.2021.103780
Michael Teti, William Edward Hahn, Shawn Martin, Christopher Teti, Elan Barenholtz

Imitation learning (IL) is a popular method for training machine learning models to act on their environment based on expert examples. Two types of IL are inverse reinforcement learning (IRL) and behavioral cloning (BC). Models trained under IRL have traditionally performed better than those trained under BC because BC suffers from compounding covariate shift, which typically requires algorithms such as DAgger to compensate. More recently, however, deep learning architectures with improved generalization performance have been developed, which may help to alleviate compounding covariate shift and allow researchers to take advantage of the simplicity of BC. Despite these developments, recent studies on BC in sub-scale autonomous robots employ relatively primitive convolutional networks without tools such as batch normalization and skip connections, and their networks' performance is difficult to compare with that of others because training and testing conditions differ drastically. Here, we examine how an array of artificial neural networks, chosen to reflect more recent architectural choices, behave in a highly controlled IL task (navigating around a small, indoor racetrack) when embedded in a sub-scale RC vehicle as an end-to-end steering system. For our main findings, we report the lap completion rate and path smoothness of each network under the exact same conditions as it controls the vehicle on the track. To supplement these findings, we also measure each network's bias toward the distribution of the training actions and develop a method to highlight the regions of a given input image that are deemed 'important' to a given network. We observe that most of the more recent neural networks perform reasonably well during testing, whereas the more primitive networks do not perform as well. For these reasons and others, we identify VGG-16 and AlexNet, out of the networks tested here, as attractive candidate architectures for such tasks.
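
As a rough illustration of why BC is considered simple, the sketch below treats it as plain supervised regression on expert state-action pairs. It is not the authors' pipeline: the one-dimensional "lateral offset" input, the proportional `expert_steering` rule, and every function name are assumptions made for this example, and a linear least-squares fit stands in for the deep networks studied in the paper.

```python
import random


def expert_steering(offset):
    """Hypothetical expert: steer back toward the track centre,
    in proportion to the lateral offset (an assumption for this sketch)."""
    return -0.8 * offset


def collect_demonstrations(n, seed=0):
    """Record (state, action) pairs while the expert is in control."""
    rng = random.Random(seed)
    offsets = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, expert_steering(x)) for x in offsets]


def fit_linear_policy(demos):
    """Behavioral cloning as regression: least-squares fit of
    steering = w * offset + b to the expert's demonstrations."""
    n = len(demos)
    mean_x = sum(x for x, _ in demos) / n
    mean_y = sum(y for _, y in demos) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in demos)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in demos)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


demos = collect_demonstrations(200)
w, b = fit_linear_policy(demos)


def cloned_policy(offset):
    """The learned policy: imitates the expert on states like those it saw."""
    return w * offset + b
```

The cloned policy is only fitted on states the expert visited; once its own small errors carry the vehicle to states outside that range, nothing in the training data constrains its output, which is the compounding covariate shift the abstract refers to.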




Updated: 2021-04-21