Hand-Gesture Recognition Based on EMG and Event-Based Camera Sensor Fusion: A Benchmark in Neuromorphic Computing
Frontiers in Neuroscience ( IF 3.2 ) Pub Date : 2020-08-05 , DOI: 10.3389/fnins.2020.00637
Enea Ceolini 1 , Charlotte Frenkel 1, 2 , Sumit Bam Shrestha 3 , Gemma Taverni 1 , Lyes Khacef 4 , Melika Payvand 1 , Elisa Donati 1
Hand gestures are a form of non-verbal communication that individuals use alongside speech. With the increasing use of technology, hand-gesture recognition has become an important aspect of Human-Machine Interaction (HMI), allowing a machine to capture and interpret the user's intent and respond accordingly. The ability to discriminate between human gestures can help in several applications, such as assisted living, healthcare, neuro-rehabilitation, and sports. Recently, multi-sensor data fusion mechanisms have been investigated to improve discrimination accuracy. In this paper, we present a sensor fusion framework that integrates two complementary systems: the electromyography (EMG) signal from the muscles and visual information. While this multi-sensor approach improves accuracy and robustness, it introduces the disadvantage of high computational cost, which grows exponentially with the number of sensors and measurements. Furthermore, the large volume of data to process can increase classification latency, which is critical in real-world scenarios such as prosthetic control. Neuromorphic technologies can be deployed to overcome these limitations, since they enable parallel, real-time processing at low power consumption. In this paper, we present a fully neuromorphic sensor fusion approach for hand-gesture recognition comprising an event-based vision sensor and three different neuromorphic processors. In particular, we used an event-based camera, the Dynamic Vision Sensor (DVS), and two neuromorphic platforms, Loihi and ODIN + MorphIC. The EMG signals were recorded with traditional electrodes and then converted into spikes to be fed to the chips. We collected a dataset of five sign-language gestures in which the visual and EMG signals are synchronized. We compared the fully neuromorphic approach to a baseline implemented with traditional machine-learning approaches on a portable GPU system.
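The EMG-to-spike conversion described above can be done with an event-based encoding such as delta modulation, the same principle a DVS pixel applies to brightness. A minimal sketch, assuming a simple fixed-threshold encoder — the function name and threshold value are illustrative, not the paper's exact pipeline:

```python
import numpy as np

def delta_modulation_spikes(signal, threshold):
    """Emit an UP (+1) or DOWN (-1) spike whenever the signal drifts
    by `threshold` from the reference level of the last spike."""
    spikes = np.zeros(len(signal), dtype=int)
    ref = signal[0]
    for i, x in enumerate(signal):
        if x - ref >= threshold:
            spikes[i] = 1       # UP event
            ref += threshold
        elif ref - x >= threshold:
            spikes[i] = -1      # DOWN event
            ref -= threshold
    return spikes

# Toy EMG-like trace: spikes appear only where the signal changes.
t = np.linspace(0, 1, 1000)
emg = np.sin(2 * np.pi * 5 * t)
spk = delta_modulation_spikes(emg, threshold=0.1)
```

This encoding is sparse by construction: a flat signal produces no events, so downstream spiking hardware does work only when the muscle activity changes.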
According to each chip's constraints, we designed chip-specific spiking neural networks (SNNs) for sensor fusion that achieved classification accuracy comparable to the software baseline. These neuromorphic alternatives increase inference time by 20-40% relative to the GPU system, but have a significantly smaller energy-delay product (EDP), which makes them between 30× and 600× more efficient. The proposed work represents a new benchmark that moves neuromorphic computing toward real-world scenarios.
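The energy-delay product used above is simply energy per inference multiplied by inference latency, so a platform can be somewhat slower yet far more efficient if its energy draw is much lower. A quick sketch with illustrative numbers (not the paper's measurements):

```python
def energy_delay_product(energy_j, latency_s):
    """Energy-delay product (J*s): lower is better."""
    return energy_j * latency_s

# Hypothetical values chosen only to show the shape of the comparison:
# a GPU at 1 J / 10 ms versus a neuromorphic chip at 2 mJ / 13 ms
# (i.e., ~30% slower but ~500x less energy per inference).
gpu_edp = energy_delay_product(1.0, 0.010)
chip_edp = energy_delay_product(0.002, 0.013)
efficiency_ratio = gpu_edp / chip_edp
```

With these toy numbers the ratio lands in the hundreds, illustrating how a modest latency penalty is outweighed by a large energy saving in the EDP metric.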

Updated: 2020-08-05