To Share or Not to Share? Performance Guarantees and the Asymmetric Nature of Cross-Robot Experience Transfer
arXiv - CS - Robotics, Pub Date: 2020-06-29, DOI: arxiv-2006.16126
Michael J. Sorocky, Siqi Zhou, and Angela P. Schoellig

In the robotics literature, experience transfer has been proposed in different learning-based control frameworks to minimize the costs and risks associated with training robots. While various works have shown the feasibility of transferring prior experience from a source robot to improve or accelerate the learning of a target robot, there are usually no guarantees that experience transfer improves the performance of the target robot. In practice, the efficacy of transferring experience is often not known until it is tested on physical robots. This trial-and-error approach can be extremely unsafe and inefficient. Building on our previous work, in this paper we consider an inverse module transfer learning framework, where the inverse module of a source robot system is transferred to a target robot system to improve its tracking performance on arbitrary trajectories. We derive a theoretical bound on the tracking error when a source inverse module is transferred to the target robot and propose a Bayesian-optimization-based algorithm to estimate this bound from data. We further highlight the asymmetric nature of cross-robot experience transfer that has often been neglected in the literature. We demonstrate our approach in quadrotor experiments and show that we can guarantee positive transfer on the target robot for tracking random periodic trajectories.

Updated: 2020-06-30