Using Automatic Speech Recognition to Assess Spoken Responses to Cognitive Tests of Semantic Verbal Fluency



Speech Communication (IF 2.4), Pub Date: 2015-09-28, DOI: 10.1016/j.specom.2015.09.010
Serguei V S Pakhomov ₁, Susan E Marino ₁, Sarah Banks ₂, Charles Bernick ₂


Cognitive tests of verbal fluency (VF) ask the participant to say as many words as possible in one minute that either start with a specific letter of the alphabet or belong to a specific semantic category. These tests are widely used in neurological, psychiatric, mental health, and school settings, and their validity for clinical applications has been extensively demonstrated. However, VF tests are currently administered and scored manually, which makes them too cumbersome to use, particularly for longitudinal cognitive monitoring in large populations. The objective of the current study was to determine whether automatic speech recognition (ASR) could be used for computerized administration and scoring of VF tests. We examined established techniques for constraining language modeling to a predefined vocabulary from a specific semantic category (e.g., animals). We also experimented with post-processing ASR output with confidence scoring, and with speaker adaptation, to improve automated VF scoring. Audio responses to a VF task were collected from 38 novice and experienced professional fighters (boxing and mixed martial arts) participating in a longitudinal study of the effects of repetitive head trauma on brain function. Three metrics were used to compare the ASR-based scoring approaches to each other and to the manually scored reference standard: word error rate, correlation with the manual word count, and distance from the manual word count. Our study’s results show that responses to the VF task contain a large number of extraneous utterances and noise, which lead to relatively poor baseline ASR performance. However, we also found that speaker adaptation combined with confidence scoring significantly improves all three metrics and can enable the use of ASR for reliable estimates of the traditional manual VF scores.
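To make the three evaluation metrics and the confidence-based post-processing concrete, the following is a minimal Python sketch, not the authors' implementation. All function names are hypothetical. It assumes that ASR output arrives as (word, confidence) pairs, that a VF score counts distinct in-category words, that the correlation is Pearson's r, and that "distance from manual word count" means the mean absolute difference; the abstract does not specify any of these details.

```python
from math import sqrt


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def count_confident_words(asr_words, vocabulary, threshold):
    """Count distinct in-vocabulary words with confidence >= threshold.

    Assumption: repetitions and out-of-category words do not add to the score.
    """
    kept = {w for w, conf in asr_words if conf >= threshold and w in vocabulary}
    return len(kept)


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def mean_absolute_difference(xs, ys):
    """Average absolute gap between automated and manual word counts."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)
```

In this sketch, raising `threshold` discards more low-confidence hypotheses, which is one plausible way to suppress the extraneous utterances and noise mentioned above, at the cost of also dropping some correctly recognized words.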


Updated: 2015-09-28
