An adaptive term proximity based rocchio's model for clinical decision support retrieval. BMC Medical Informatics and Decision Making



BMC Medical Informatics and Decision Making (IF 3.3), Pub Date: 2019-12-12, DOI: 10.1186/s12911-019-0986-6
Min Pan¹,², Yue Zhang³, Qiang Zhu³, Bo Sun¹, Tingting He³, Xingpeng Jiang³


BACKGROUND To better help doctors make decisions in the clinical setting, research is needed to connect electronic health records (EHR) with the biomedical literature. Pseudo Relevance Feedback (PRF) is a classical query modification technique that has been shown to be effective in many retrieval models and is thus suitable for handling the terse language and clinical jargon in EHR. Previous work has introduced a set of constraints (axioms) for the traditional PRF model. However, most methods do not consider both the importance of a candidate term in the feedback documents and the co-occurrence relationship between a candidate term and a query term. Intuitively, terms that co-occur more strongly with a query term are more likely to be related to the query topic.

METHODS In this paper, we incorporate the original HAL (Hyperspace Analogue to Language) model into Rocchio's model and propose a new concept of term proximity feedback weight. We propose a HAL-based Rocchio's model for query expansion, called HRoc. We also design three normalization methods to better incorporate proximity information into query expansion. Finally, we introduce an adaptive parameter that replaces the sliding-window length of the HAL model and selects the window size according to document length.

RESULTS On the 2016 TREC Clinical Support medicine dataset, experimental results demonstrate that the proposed HRoc and HRoc_AP models outperform other advanced models, such as PRoc2 and TF-PRF, on various evaluation metrics. Compared with the PRoc2 and TF-PRF models, the MAP of our model increases by 8.5% and 12.24% respectively, and the F1 score increases by 7.86% and 9.88% respectively.

CONCLUSIONS The proposed HRoc model effectively improves the precision and recall of information retrieval and returns more precise results than other models. Furthermore, after introducing the self-adaptive parameter, the advanced HRoc_AP model uses fewer hyper-parameters than other models while achieving equivalent performance, which greatly improves the efficiency and applicability of the model and thus helps clinicians retrieve clinical support documents effectively.
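
The idea described under METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the symmetric HAL weighting (`window - distance + 1`), the sum normalization, the `doc_len // divisor` window rule, and all function and parameter names here are assumptions chosen for illustration only.

```python
from collections import Counter, defaultdict


def hal_proximity(tokens, query_terms, window):
    """HAL-style proximity weights (assumed symmetric variant).

    A term at distance d (1 <= d <= window) from an occurrence of a
    query term receives window - d + 1, so closer terms score higher.
    """
    weights = defaultdict(float)
    for i, tok in enumerate(tokens):
        if tok not in query_terms:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i and tokens[j] not in query_terms:
                weights[tokens[j]] += window - abs(i - j) + 1
    return weights


def adaptive_window(doc_len, divisor=10, minimum=2):
    """Hypothetical adaptive rule: the window grows with document length."""
    return max(minimum, doc_len // divisor)


def expand_query(query, feedback_docs, alpha=1.0, beta=0.75, top_k=3, window=None):
    """Rocchio-style expansion using proximity weights as feedback term weights.

    If window is None, each document gets its own adaptive window size.
    """
    query_terms = set(query)
    centroid = defaultdict(float)
    for doc in feedback_docs:
        w = window if window is not None else adaptive_window(len(doc))
        prox = hal_proximity(doc, query_terms, w)
        total = sum(prox.values())
        if total == 0:
            continue
        for term, value in prox.items():
            # Sum normalization (one assumed choice), averaged over feedback docs.
            centroid[term] += value / total / len(feedback_docs)
    # Original query terms weighted by alpha, expansion terms by beta.
    new_query = {t: alpha * c for t, c in Counter(query).items()}
    best = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    for term, value in best:
        new_query[term] = new_query.get(term, 0.0) + beta * value
    return new_query
```

Calling `expand_query(["fever"], docs)` with `docs` as a list of tokenized pseudo-relevant documents returns a dictionary of term weights for the expanded query; passing an explicit `window` corresponds to the fixed-window setting, while leaving it as `None` mimics the adaptive variant.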


Updated: 2019-12-12
