Gaze-contingent decoding of human navigation intention on an autonomous wheelchair platform
arXiv - CS - Machine Learning. Pub Date: 2021-03-04, DOI: arxiv-2103.03072
Mahendran Subramanian, Suhyung Park, Pavel Orlov, Ali Shafti, A. Aldo Faisal

We have pioneered the Where-You-Look-Is-Where-You-Go approach to controlling mobility platforms: we decode how the user looks at the environment to understand where they want to navigate their mobility device. However, many natural eye movements are not relevant for decoding action intention; only some are, which poses a challenge for decoding known as the Midas Touch Problem. Here, we present a new solution consisting of (1) deep computer vision to understand which object a user is looking at in their field of view, (2) an analysis of where on the object's bounding box the user is looking, and (3) a simple machine learning classifier that determines whether the overt visual attention on the object is predictive of a navigation intention towards that object. Our decoding system ultimately determines whether the user wants to drive to, e.g., a door or is just looking at it. Crucially, we find that when users look at an object and imagine moving towards it, the eye movements resulting from this motor imagery (akin to neural interfaces) remain decodable. Once a driving intention, and thus the target location, is detected, our system instructs our autonomous wheelchair platform, the A.Eye-Drive, to navigate to the desired object while avoiding static and moving obstacles. Thus, for navigation purposes, we have realised a cognitive-level human interface: the user only needs to cognitively interact with the desired goal rather than continuously steer their wheelchair to the target (low-level human interfacing).
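The three-step pipeline above lends itself to a compact illustration. The Python sketch below shows one plausible way to turn gaze samples on a detected object's bounding box into features for a simple intention classifier. The object detector is abstracted away, and the feature choices, function names, and use of scikit-learn's LogisticRegression are illustrative assumptions, not the authors' implementation.

# Minimal sketch of a gaze-contingent intention decoder (illustrative only).
# Assumes an external object detector supplies bounding boxes for the scene
# and an eye tracker supplies gaze points in the same image coordinates.
# Feature choices and classifier are assumptions, not the paper's actual code.

import numpy as np
from sklearn.linear_model import LogisticRegression


def gaze_bbox_features(gaze_xy, bbox):
    """Describe where one gaze sample lands relative to an object's bounding box.

    gaze_xy: (x, y) gaze point in image pixels.
    bbox:    (x_min, y_min, x_max, y_max) of the detected object.
    Returns normalised within-box coordinates plus an on-object flag.
    """
    x, y = gaze_xy
    x0, y0, x1, y1 = bbox
    w, h = max(x1 - x0, 1e-6), max(y1 - y0, 1e-6)
    u, v = (x - x0) / w, (y - y0) / h          # position inside the box, in [0, 1] when on-object
    inside = float(0.0 <= u <= 1.0 and 0.0 <= v <= 1.0)
    return np.array([u, v, inside])


def fixation_features(gaze_trace, bbox):
    """Summarise a short gaze trace (list of (x, y) samples) on one candidate object."""
    per_sample = np.array([gaze_bbox_features(g, bbox) for g in gaze_trace])
    mean_uv = per_sample[:, :2].mean(axis=0)   # average landing position on the box
    spread = per_sample[:, :2].std(axis=0)     # how much the gaze wanders
    dwell = per_sample[:, 2].mean()            # fraction of samples on the object
    return np.concatenate([mean_uv, spread, [dwell]])


def train_intention_classifier(traces, bboxes, labels):
    """Fit a classifier on traces labelled 1 = 'intends to drive there', 0 = 'just looking'."""
    X = np.stack([fixation_features(t, b) for t, b in zip(traces, bboxes)])
    return LogisticRegression().fit(X, np.asarray(labels))


def decode_intention(clf, gaze_trace, bbox, threshold=0.5):
    """Return True if the gaze pattern on this object predicts a navigation intention."""
    p = clf.predict_proba(fixation_features(gaze_trace, bbox).reshape(1, -1))[0, 1]
    return p >= threshold

In such a setup, the classifier would be trained on labelled gaze traces (intend-to-drive versus look-only) and queried whenever the vision system reports an object under the user's gaze; the probability threshold trades off false positives (the Midas Touch) against missed intentions.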

Updated: 2021-03-05