A Corpus Too Small: Uses of Text Data in a Hupa-English Bilingual Dictionary
International Journal of Lexicography (IF 0.652), Pub Date: 2021-03-02, DOI: 10.1093/ijl/ecab006
Justin Spence

Although corpus-driven methods have led to a revolution in the way lexicographers of some languages approach their work, text corpora for many less-studied languages are too small for such methods to be used reliably. Hupa, a Native American language of northwestern California, is one such language. Nonetheless, the Hupa Online Dictionary and Texts website relies heavily on its small text corpus to support development of the dictionary component. The corpus is especially important as a way to address Hupa’s complex and productive polysynthetic morphology, both derivational and inflectional, with words attested in the corpus providing the empirical basis for creating new entries and expanding the coverage of existing ones. It also provides a ready source of example sentences in context, figurative uses of language that might not come to light through elicitation, and aspects of linguistic variation that dictionary normalization tends to obscure. Thus, while corpus-driven lexicography may not be a realistic possibility at this point, corpus-based lexicography (Tognini-Bonelli 2001) is certainly within reach.

Updated: 2021-03-02