Approaching terminological ambiguity in cross-disciplinary communication as a word sense induction task: a pilot study
Language Resources and Evaluation (IF 1.7), Pub Date: 2019-04-12, DOI: 10.1007/s10579-019-09455-7
Julie Mennes, Ted Pedersen, Els Lefever

Cross-disciplinary communication is often impeded by terminological ambiguity. Hence, cross-disciplinary teams would greatly benefit from using a language technology-based tool that allows for the (at least semi-) automated resolution of ambiguous terms. Although no such tool is readily available, an interesting theoretical outline of one does exist. The main obstacle to the concrete realization of this tool is the current lack of an effective method for the automatic detection of the different meanings of ambiguous terms across different disciplinary jargons. In this paper, we set up a pilot study to experimentally assess whether the word sense induction technique of ‘context clustering’, as implemented in the software package ‘SenseClusters’, might be a solution. More specifically, given several sets of sentences coming from a cross-disciplinary corpus, each set containing a specific ambiguous term, we verify whether this technique can classify each sentence in accordance with the meaning of the ambiguous term in that sentence. For the experiments, we first compile a corpus that represents the disciplinary jargons involved in a project on Bone Tissue Engineering. Next, we conduct two series of experiments. The first series focuses on determining appropriate SenseClusters parameter settings using manually selected test data for the ambiguous target terms ‘matrix’ and ‘model’. The second series evaluates the actual performance of SenseClusters using randomly selected test data for an extended set of target terms. We observe that SenseClusters can successfully classify sentences from a cross-disciplinary corpus according to the meaning of the ambiguous term they contain. Hence, we argue that this implementation of context clustering shows potential as a method for the automatic detection of the meanings of ambiguous terms in cross-disciplinary communication.
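
To make the idea of context clustering concrete, the following is a minimal Python sketch. It is not SenseClusters and does not reproduce its features or parameter settings; it only illustrates the general principle of grouping sentences that contain an ambiguous term by the similarity of the surrounding words. The stopword list, the function names (`context_vector`, `cosine`, `cluster_contexts`), the average-link merging strategy, and the four example sentences for the term ‘matrix’ are all assumptions introduced here for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "to", "and", "for", "with", "on", "by"}


def context_vector(sentence, target):
    """Bag-of-words counts for a sentence, excluding the target term and stopwords."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return Counter(t for t in tokens if t != target and t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_contexts(sentences, target, k):
    """Average-link agglomerative clustering of sentence contexts into k clusters.

    Returns a list of clusters, each a list of sentence indices.
    """
    vectors = [context_vector(s, target) for s in sentences]
    clusters = [[i] for i in range(len(sentences))]
    while len(clusters) > k:
        best = None  # (similarity, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sims = [cosine(vectors[i], vectors[j])
                        for i in clusters[a] for j in clusters[b]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Invented example sentences: two biological and two mathematical uses of 'matrix'.
sentences = [
    "The extracellular matrix supports cell adhesion in bone tissue.",
    "Cells secrete collagen into the extracellular matrix of bone tissue.",
    "We compute the eigenvalues of the stiffness matrix numerically.",
    "The stiffness matrix is inverted to compute the eigenvalues.",
]

for cluster in cluster_contexts(sentences, "matrix", k=2):
    print([sentences[i] for i in cluster])
```

In this toy setting the number of senses `k` is given in advance; choosing it automatically, and choosing the context representation, are the kind of parameter decisions that the first series of experiments in the paper addresses for SenseClusters.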

Updated: 2019-04-12