Efficient Actor-Critic Reinforcement Learning with Embodiment of Muscle Tone for Posture Stabilization of the Human Arm
Neural Computation (IF 2.9), Pub Date: 2021-01-01, DOI: 10.1162/neco_a_01333
Masami Iwamoto, Daichi Kato

This letter proposes a new idea to improve learning efficiency in reinforcement learning (RL) with the actor-critic method used as a muscle controller for posture stabilization of the human arm. Actor-critic RL (ACRL) is used for simulations to realize posture controls in humans or robots using muscle tension control. However, it requires very high computational costs to acquire a better muscle control policy for desirable postures. For efficient ACRL, we focused on embodiment that is supposed to potentially achieve efficient controls in research fields of artificial intelligence or robotics. According to the neurophysiology of motion control obtained from experimental studies using animals or humans, the pedunculopontine tegmental nucleus (PPTn) induces muscle tone suppression, and the midbrain locomotor region (MLR) induces muscle tone promotion. PPTn and MLR modulate the activation levels of mutually antagonizing muscles such as flexors and extensors in a process through which control signals are translated from the substantia nigra reticulata to the brain stem. Therefore, we hypothesized that the PPTn and MLR could control muscle tone, that is, the maximum values of activation levels of mutually antagonizing muscles using different sigmoidal functions for each muscle; then we introduced antagonism function models (AFMs) of PPTn and MLR for individual muscles, incorporating the hypothesis into the process to determine the activation level of each muscle based on the output of the actor in ACRL. ACRL with AFMs representing the embodiment of muscle tone successfully achieved posture stabilization in five joint motions of the right arm of a human adult male under gravity in predetermined target angles at an earlier period of learning than the learning methods without AFMs. The results obtained from this study suggest that the introduction of embodiment of muscle tone can enhance learning efficiency in posture stabilization disorders of humans or humanoid robots.
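The core idea of the abstract, capping each muscle's activation level with a sigmoidal "muscle tone" function before the actor's output is applied, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the functional form of the AFMs, so the single scalar `tone` signal, the `gain` parameter, the mirrored sigmoids for flexor and extensor, and all function names below are assumptions made purely for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function, written with tanh so large |x| cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def activation_ceilings(tone: float, gain: float = 4.0) -> tuple[float, float]:
    """Return hypothetical (flexor_max, extensor_max) activation ceilings.

    Assumption: one scalar tone signal drives two mirrored sigmoids, so
    raising the ceiling of one muscle lowers that of its antagonist.
    """
    flexor_max = sigmoid(gain * tone)
    extensor_max = sigmoid(-gain * tone)
    return flexor_max, extensor_max


def clamp01(x: float) -> float:
    """Restrict a value to the interval [0, 1]."""
    return max(0.0, min(1.0, x))


def muscle_activations(
    actor_flexor: float,
    actor_extensor: float,
    tone: float,
    gain: float = 4.0,
) -> tuple[float, float]:
    """Scale the actor's raw outputs by the tone-dependent ceilings.

    Assumption: the actor emits one value per muscle, clamped to [0, 1],
    which is then multiplied by that muscle's ceiling.
    """
    flexor_max, extensor_max = activation_ceilings(tone, gain)
    return (
        clamp01(actor_flexor) * flexor_max,
        clamp01(actor_extensor) * extensor_max,
    )


if __name__ == "__main__":
    # Sweep the hypothetical tone signal with both actor outputs saturated.
    for tone in (-1.0, 0.0, 1.0):
        flexor, extensor = muscle_activations(1.0, 1.0, tone)
        print(f"tone={tone:+.1f}  flexor={flexor:.3f}  extensor={extensor:.3f}")
```

In this sketch the ceilings restrict the range of activations the actor can reach, which is one plausible reading of how such a constraint could shorten learning. The AFMs described in the paper may differ in form and in their inputs.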

Updated: 2021-01-01