Automatic Extraction of Agriculture Terms from Domain Text: A Survey of Tools and Techniques
arXiv - CS - Information Retrieval. Pub Date: 2020-09-24, DOI: arxiv-2009.11796
Niladri Chatterjee, Neha Kaushik

Agriculture is a key component in any country's development. Domain-specific knowledge resources help in gaining insight into the domain. Existing knowledge resources such as AGROVOC and the NAL Thesaurus are developed and maintained by domain experts. Populating these resources with terms can be automated by running automatic term extraction tools over unstructured agricultural text. Automatic term extraction is also a key component in many semantic web applications, such as ontology creation, recommendation systems, sentiment classification, and query expansion, among others. The primary goal of an automatic term extraction system is to maximize the number of valid terms and minimize the number of invalid terms extracted from the input set of documents. Despite its importance in various applications, few online tools are available for this purpose. Moreover, the performance of the most popular ones varies significantly. As a consequence, selecting the right term extraction tool is perceived as a serious problem for different knowledge-based applications. This paper presents an analysis of three commonly used term extraction tools, viz. RAKE, TerMine, and TermRaider, and compares their performance in terms of precision and recall against RENT, a more recent term extractor developed by the authors for the agriculture domain.
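
The abstract evaluates term extractors by precision and recall. As a minimal sketch of how those two measures could be computed for a list of extracted terms, the Python snippet below compares a tool's output against a gold-standard term list. The function name `precision_recall`, the sample terms, and the choice to match terms by case-insensitive exact string are all assumptions made for illustration; none of them come from the paper, which may use a different matching rule.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) of extracted terms against a gold-standard list.

    Assumption: two terms match when they are equal after trimming
    whitespace and lowercasing. Duplicates are counted once.
    """
    extracted_set = {term.strip().lower() for term in extracted}
    gold_set = {term.strip().lower() for term in gold}
    valid = extracted_set & gold_set  # extracted terms that are in the gold list

    # Precision: share of extracted terms that are valid.
    precision = len(valid) / len(extracted_set) if extracted_set else 0.0
    # Recall: share of gold terms that were extracted.
    recall = len(valid) / len(gold_set) if gold_set else 0.0
    return precision, recall


# Hypothetical example data, not taken from the paper.
gold_terms = ["crop rotation", "soil moisture", "drip irrigation", "pest management"]
extracted_terms = ["Crop rotation", "soil moisture", "last year",
                   "drip irrigation", "many farmers"]

precision, recall = precision_recall(extracted_terms, gold_terms)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the five extracted terms appear in the gold list, and three of the four gold terms are recovered. Extracting more candidates tends to raise recall at the cost of precision, which is why the abstract frames the goal as maximizing valid terms while minimizing invalid ones.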

Updated: 2020-09-25