Do speech registers differ in the predictability of words?
International Journal of Corpus Linguistics (IF 0.919), Pub Date: 2019-07-02, DOI: 10.1075/ijcl.17062.ben
Martijn Bentum, Louis ten Bosch, Antal van den Bosch, Mirjam Ernestus

Abstract Previous research has demonstrated that language use can vary depending on the context of situation. The present paper extends this finding by comparing word predictability differences between 14 speech registers ranging from highly informal conversations to read-aloud books. We trained 14 statistical language models to compute register-specific word predictability and trained a register classifier on the perplexity score vector of the language models. The classifier distinguishes perfectly between samples from all speech registers and this result generalizes to unseen materials. We show that differences in vocabulary and sentence length cannot explain the speech register classifier’s performance. The combined results show that speech registers differ in word predictability.
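
The pipeline summarized in the abstract (one language model per register, a perplexity score vector per sample, a classifier on that vector) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the add-one smoothed unigram models, the two toy registers, the token lists, and the function names are all hypothetical, and the "classifier" is simply a rule that picks the register whose model gives the lowest perplexity, standing in for the trained classifier the paper describes.

```python
import math
from collections import Counter

# Hypothetical toy corpora standing in for two of the speech registers.
corpora = {
    "conversation": "yeah well you know i mean yeah it was like you know".split(),
    "read_aloud": "the old house stood at the end of the quiet street".split(),
}

# Shared vocabulary size, plus one slot reserved for unseen words.
vocab_size = len({tok for toks in corpora.values() for tok in toks}) + 1


def train_unigram(tokens):
    """Return (counts, total) for an add-one smoothed unigram model."""
    return Counter(tokens), len(tokens)


def perplexity(model, tokens):
    """Perplexity of `tokens` under an add-one smoothed unigram model."""
    counts, total = model
    log_prob = sum(
        math.log((counts[tok] + 1) / (total + vocab_size)) for tok in tokens
    )
    return math.exp(-log_prob / len(tokens))


# One register-specific model per register.
models = {name: train_unigram(toks) for name, toks in corpora.items()}
register_names = sorted(models)


def perplexity_vector(tokens):
    """Score a sample under every register model, in a fixed register order."""
    return [perplexity(models[name], tokens) for name in register_names]


def classify(tokens):
    """Stand-in classifier: pick the register with the lowest perplexity."""
    vec = perplexity_vector(tokens)
    return register_names[vec.index(min(vec))]


if __name__ == "__main__":
    sample = "you know yeah i mean".split()
    print(perplexity_vector(sample), classify(sample))
```

In a fuller setup the perplexity vectors would serve as feature vectors for a trained classifier, and the models would be higher-order n-gram models estimated on much larger register-specific corpora.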

Updated: 2019-07-02