Echo State Networks and Long Short-Term Memory for Continuous Gesture Recognition: a Comparative Study
Cognitive Computation (IF 5.4) Pub Date: 2020-10-07, DOI: 10.1007/s12559-020-09754-0
Doreen Jirak, Stephan Tietz, Hassan Ali, Stefan Wermter

Recent developments in sensors that track human movements and gestures enable rapid progress in applications in domains such as medical rehabilitation and robotic control. The inertial measurement unit (IMU) in particular is well suited to real-time scenarios because it delivers input data rapidly. A computational model must therefore be able to learn gesture sequences quickly yet robustly. We recently introduced an echo state network (ESN) framework for continuous gesture recognition (Tietz et al., 2019), including novel approaches to gesture spotting, i.e., the automatic detection of the start and end phases of a gesture. Although our results showed good classification performance, we identified significant factors that negatively affect performance, such as subgestures and gesture variability. To address these issues, we include experiments with Long Short-Term Memory (LSTM) networks, a state-of-the-art model for sequence processing, to compare the results with those of our framework and to evaluate the robustness of both models to pitfalls in the recognition process. In this study, we analyze the two conceptually different approaches to processing continuous, variable-length gesture sequences, which yields interesting results when comparing how the distinct gestures are performed. In addition, our results demonstrate that our ESN framework achieves performance comparable to that of the LSTM network but with significantly lower training times. We conclude that ESNs are viable models for continuous gesture recognition, delivering reasonable performance for applications that require real-time operation, such as robotic or rehabilitation tasks. Based on our discussion of this comparative study, we suggest prospective improvements at both the experimental and the network-architecture level.
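
To make the notion of gesture spotting concrete, the following minimal sketch detects the start and end frames of gestures in a continuous stream. It is an illustration under stated assumptions, not the spotting method of the ESN framework: the function name `spot_gestures`, the per-frame activity scores, the threshold of 0.5, and the minimum length of 3 frames are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def spot_gestures(
    scores: List[float], threshold: float = 0.5, min_len: int = 3
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end inclusive, of gesture segments.

    Hypothetical rule: a gesture is a run of consecutive frames whose
    activity score is at least `threshold` and that lasts for at least
    `min_len` frames. Shorter runs are treated as noise and dropped.
    """
    segments: List[Tuple[int, int]] = []
    start = None  # index of the current candidate onset, if any
    for t, score in enumerate(scores):
        if score >= threshold and start is None:
            start = t  # candidate gesture onset
        elif score < threshold and start is not None:
            if t - start >= min_len:
                segments.append((start, t - 1))  # gesture ended at frame t - 1
            start = None
    # Close a run that is still active when the stream ends.
    if start is not None and len(scores) - start >= min_len:
        segments.append((start, len(scores) - 1))
    return segments


# Made-up per-frame scores, e.g. the output of some recognizer.
scores = [0.1, 0.2, 0.7, 0.9, 0.8, 0.3, 0.6, 0.1, 0.7, 0.8, 0.9, 0.9]
segments = spot_gestures(scores)
print(segments)  # [(2, 4), (8, 11)]
```

The single above-threshold frame at index 6 is discarded because it is shorter than `min_len`, which hints at why short subgestures and variable execution speed can complicate spotting in practice.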




Last updated: 2020-10-07