Development and structure of the VariaNTS corpus: A spoken Dutch corpus containing talker and linguistic variability
Speech Communication (IF 2.4), Pub Date: 2020-12-28, DOI: 10.1016/j.specom.2020.12.006
Floor Arts, Deniz Başkent, Terrin N. Tamati

Speech perception and spoken word recognition are not only affected by what is being said, but also by who is speaking. Currently, publicly available corpora of spoken Dutch do not offer a wide variety of linguistic materials produced by multiple talkers. The VariaNTS (Variatie in Nederlandse Taal en Sprekers) corpus is a Dutch spoken corpus that was developed to maximize both linguistic and talker variability. It contains 1000 items from 11 linguistic subcategories, recorded by 8 male and 8 female native speakers of standard Dutch. The corpus contains audio recordings, orthographic transcriptions, item-specific details such as word frequencies, neighborhood densities and phonotactic probabilities, and talker details. The VariaNTS corpus aims to provide new materials to be used for broad assessment of speech perception and word recognition in Dutch clinical and academic settings.

Updated: 2021-01-10
