Lexicalisation of Polish and English word combinations: an empirical study
Poznan Studies in Contemporary Linguistics (IF 0.400), Pub Date: 2023-02-28, DOI: 10.1515/psicl-2023-2002
Marek Maziarz 1 , Łukasz Grabowski 2 , Tadeusz Piotrowski 3 , Ewa Rudnicka 1 , Maciej Piasecki 1

One of the main research questions concerning multi-word expressions (MWEs) is which of them are transparent word combinations created ad hoc and which are multi-word lexical units (MWUs). In this paper, we use selected corpus-linguistic and machine-learning methods to determine which lexicalization criteria guide Polish and English lexicographers in deciding which MWEs (bigrams such as adjective+noun and noun+noun combinations) should be recorded in dictionaries as MWUs. We analyzed two samples: MWEs extracted from Polish and English monolingual dictionaries, and MWEs created by the annotators. We tested two custom-designed criteria, intuition and paraphrase, and also applied statistical methods (two measures of collocational strength, PMI and Jaccard). We found that Polish lexicographers tend not to include compositional MWEs as lexical entries in their dictionaries, and that the criteria of paraphrase and intuition are important for them: if MWEs are not clearly and unambiguously paraphrasable and compositional, then they are recorded in dictionaries. In contrast, English lexicographers tend to record compositional and partly compositional MWEs as well.
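
The two collocational-strength measures named in the abstract, PMI and Jaccard, can be illustrated with a minimal Python sketch. It uses the textbook count-based definitions and is not taken from the paper: the function names, the argument names, and the choice of a base-2 logarithm are assumptions made for illustration, and the counts in the usage note are invented.

```python
import math


def pmi(pair_count, first_count, second_count, total):
    """Pointwise mutual information (base 2) of a bigram, from raw counts.

    pair_count   -- occurrences of the bigram (hypothetical corpus counts)
    first_count  -- occurrences of the first word
    second_count -- occurrences of the second word
    total        -- corpus size used to turn counts into probabilities
    """
    if min(pair_count, first_count, second_count, total) <= 0:
        raise ValueError("all counts must be positive")
    # log2( P(x, y) / (P(x) * P(y)) ), rewritten in terms of counts
    return math.log2(pair_count * total / (first_count * second_count))


def jaccard(pair_count, first_count, second_count):
    """Jaccard coefficient: joint occurrences over occurrences of either word."""
    union = first_count + second_count - pair_count
    if pair_count < 0 or union <= 0:
        raise ValueError("counts are inconsistent")
    return pair_count / union
```

With invented counts such as `pmi(8, 16, 32, 1024)` and `jaccard(8, 16, 32)`, both scores rise as the two words co-occur more often than their individual frequencies would suggest; how the authors counted co-occurrences and which thresholds they applied is not stated in the abstract.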


