Decoding imagined speech from EEG signals using hybrid-scale spatial-temporal dilated convolution network
Journal of Neural Engineering (IF 3.7), Pub Date: 2021-08-11, DOI: 10.1088/1741-2552/ac13c0
Fu Li, Weibing Chao, Yang Li, Boxun Fu, Youshuo Ji, Hao Wu, Guangming Shi

Objective. Directly decoding imagined speech from electroencephalogram (EEG) signals has attracted much interest in brain–computer interface applications, because it offers a natural and intuitive means of communication for locked-in patients. Several methods have been applied to imagined speech decoding, but how to model spatial-temporal dependencies and capture long-range contextual cues in EEG signals for better decoding remains an open question.

Approach. In this study, we propose a novel model, the hybrid-scale spatial-temporal dilated convolution network (HS-STDCN), for EEG-based imagined speech recognition. HS-STDCN integrates temporal and spatial feature learning into a unified end-to-end model. To characterize the temporal dependencies of the EEG sequences, we adopted a hybrid-scale temporal convolution layer that captures temporal information at multiple levels. A depthwise spatial convolution layer was then designed to model the intrinsic spatial relationships among EEG electrodes, producing a spatial-temporal representation of the input EEG data. On top of this representation, dilated convolution layers were employed to learn long-range discriminative features for the final classification.

Main results. To evaluate the proposed method, we compared HS-STDCN with existing methods on our collected dataset. HS-STDCN achieved an average classification accuracy of 54.31% for decoding eight imagined words, significantly outperforming the other methods at the 0.05 significance level.

Significance. The proposed HS-STDCN model provides an effective way to exploit both the temporal and spatial dependencies of the input EEG signals for imagined speech recognition. We also visualized word semantic differences to analyze the impact of word semantics on imagined speech recognition, investigated the brain regions most important to the decoding process, and explored the use of fewer electrodes to achieve comparable performance.
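The three-stage pipeline described in the abstract maps naturally onto a small convolutional network. Below is a minimal PyTorch sketch of that structure (parallel hybrid-scale temporal convolutions, a depthwise spatial convolution over the electrode axis, then dilated temporal convolutions before classification); the filter counts, kernel scales, and dilation rates are illustrative assumptions, not the paper's exact hyperparameters.

```python
# Minimal sketch of the HS-STDCN pipeline, assuming 64 electrodes and
# 8 imagined-word classes. All layer sizes are illustrative choices.
import torch
import torch.nn as nn


class HSSTDCNSketch(nn.Module):
    def __init__(self, n_channels=64, n_classes=8,
                 temporal_scales=(15, 31, 63), n_filters=8):
        super().__init__()
        # Hybrid-scale temporal convolution: parallel 1D convolutions over
        # time with different kernel lengths, capturing temporal
        # information at multiple levels. Input is (batch, 1, chan, time).
        self.temporal_branches = nn.ModuleList([
            nn.Conv2d(1, n_filters, kernel_size=(1, k), padding=(0, k // 2))
            for k in temporal_scales
        ])
        mixed = n_filters * len(temporal_scales)
        # Depthwise spatial convolution spanning all electrodes, modeling
        # their intrinsic spatial relationships and collapsing that axis.
        self.spatial = nn.Conv2d(mixed, mixed, kernel_size=(n_channels, 1),
                                 groups=mixed, bias=False)
        self.bn = nn.BatchNorm2d(mixed)
        self.act = nn.ELU()
        # Dilated convolutions over time enlarge the receptive field to
        # learn long-range discriminative features.
        self.dilated = nn.Sequential(
            nn.Conv2d(mixed, mixed, kernel_size=(1, 3), dilation=(1, 2),
                      padding=(0, 2)),
            nn.ELU(),
            nn.Conv2d(mixed, mixed, kernel_size=(1, 3), dilation=(1, 4),
                      padding=(0, 4)),
            nn.ELU(),
        )
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.classifier = nn.Linear(mixed, n_classes)

    def forward(self, x):  # x: (batch, 1, n_channels, n_samples)
        feats = torch.cat([b(x) for b in self.temporal_branches], dim=1)
        feats = self.act(self.bn(self.spatial(feats)))
        feats = self.dilated(feats)
        return self.classifier(self.pool(feats).flatten(1))


# Example: a batch of 4 trials, 64 electrodes, 2 s epochs at 256 Hz.
logits = HSSTDCNSketch()(torch.randn(4, 1, 64, 512))
print(logits.shape)  # torch.Size([4, 8])
```

Keeping the spatial convolution depthwise (groups equal to the channel count) means each mixed temporal feature map learns its own electrode weighting, which is what yields a per-feature spatial-temporal representation before the dilated stage.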


Updated: 2021-08-11