The effects of having lists of synonyms on the performance of Afaan Oromo Text Retrieval system
arXiv - CS - Software Engineering Pub Date : 2021-03-04 , DOI: arxiv-2103.02900
Isayas Wakgari Kelbessa

Retrieving relevant information from collections of Afaan Oromo resources is important for Afaan Oromo speakers, so a system that supports these users is needed. This study therefore aims to enable retrieval of Afaan Oromo text documents by applying modern information retrieval techniques. In the developed prototype, a probabilistic approach was used as the information retrieval model, and precision and recall were used as the evaluation measures. Apache Solr was used as the platform for building and evaluating the system. The system was evaluated on 158 documents with 13 arbitrarily selected queries, using precision and recall to measure retrieval effectiveness. Before synonyms were added, the evaluation gave an average precision of 72.91% and an average recall of 86.8%. After synonyms were added, average precision fell to 71.39% and average recall rose to 90.5%. The F-measure was 79.25% before synonyms were added and 79.82% afterwards, an improvement of 0.57 percentage points. The study thus shows experimentally that adding a thesaurus can improve system performance. The prototype also supports spellchecking, pagination, hit highlighting, and autosuggestion for Afaan Oromo.
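The reported F-measure values can be checked from the averaged precision and recall figures. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall) computed from the averages; the abstract does not state the formula, and the function name `f_measure` is illustrative rather than taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's F-measure is F1 computed from the
    averaged precision and recall, both given in percent.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Averages reported in the abstract (percent).
before = f_measure(72.91, 86.8)   # without synonyms
after = f_measure(71.39, 90.5)    # with synonyms

print(f"F-measure before synonyms: {before:.2f}%")
print(f"F-measure after synonyms:  {after:.2f}%")
print(f"Change: {after - before:.2f} percentage points")
```

Under this assumption the results round to the abstract's 79.25% and 79.82%. The gain comes from recall rising by more than precision falls, which the harmonic mean rewards only modestly.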

Updated: 2021-03-05