A Persian Medical Question Answering System
International Journal on Artificial Intelligence Tools (IF 1.0), Pub Date: 2020-09-30, DOI: 10.1142/s0218213020500190
Hadi Veisi 1 , Hamed Fakour Shandi 1

A question answering system is a type of information retrieval system that takes a natural-language question from a user as input and returns the best answer as output. This paper presents the design and implementation of a medical question answering system for the Persian language. For this research, a dataset of diseases and drugs was collected and structured. The proposed system comprises three main modules: question processing, document retrieval, and answer extraction. The question processing module uses a sequential architecture whose components extract the main concept of a question using rule-based methods, natural language processing, and dictionary-based techniques. In the document retrieval module, documents are indexed and searched using the Lucene library. The retrieved documents are ranked using similarity detection algorithms, and the highest-ranked document is passed to the answer extraction module, which extracts the most relevant section of text from that document. Customized language processing tools for Persian, such as a part-of-speech tagger and a lemmatizer, were also developed during this research. Evaluation results show that the system performs well in answering different questions about diseases and drugs, with an accuracy of 83.6% on 500 sample questions.
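The three-module flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy English corpus `DOCS`, the cue-word table `INTENT_RULES`, and all function names are hypothetical, and a plain bag-of-words cosine similarity stands in for both Lucene search and the paper's similarity detection algorithms, which the abstract does not specify.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: one document per disease or drug, split into sections.
DOCS = {
    "influenza": {
        "symptoms": "fever cough sore throat and muscle aches",
        "treatment": "rest fluids and antiviral drugs",
    },
    "aspirin": {
        "usage": "relieves pain fever and inflammation",
        "side effects": "stomach upset and bleeding risk",
    },
}

# Hypothetical rule table: cue word in the question -> section being asked about.
INTENT_RULES = {
    "symptoms": "symptoms", "signs": "symptoms",
    "treat": "treatment", "treatment": "treatment",
    "side": "side effects", "used": "usage",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token lists treated as bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * \
        math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def process_question(question):
    """Question processing: tokenize and apply the cue-word rules."""
    tokens = tokenize(question)
    intent = next((INTENT_RULES[t] for t in tokens if t in INTENT_RULES), None)
    return tokens, intent


def retrieve(tokens):
    """Document retrieval: return the name of the highest-ranked document."""
    def doc_tokens(name):
        body = " ".join(f"{sec} {text}" for sec, text in DOCS[name].items())
        return tokenize(f"{name} {body}")
    return max(DOCS, key=lambda name: cosine(tokens, doc_tokens(name)))


def extract_answer(doc_name, intent, tokens):
    """Answer extraction: pick the section matching the intent, else the most similar one."""
    sections = DOCS[doc_name]
    if intent in sections:
        return sections[intent]
    return max(sections.values(), key=lambda text: cosine(tokens, tokenize(text)))


def answer(question):
    """Run the three modules in sequence."""
    tokens, intent = process_question(question)
    doc_name = retrieve(tokens)
    return extract_answer(doc_name, intent, tokens)


print(answer("What are the symptoms of influenza?"))
print(answer("What are the side effects of aspirin?"))
```

A real system along the lines of the abstract would replace `tokenize` with Persian-specific tools such as a lemmatizer and a part-of-speech tagger, and `retrieve` with an index-backed search; the sketch only shows how the output of each module feeds the next.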

Updated: 2020-09-30