AutoDiscern: Rating the Quality of Online Health Information with Hierarchical Encoder Attention-based Neural Networks
arXiv - CS - Computers and Society, Pub Date: 2019-12-30, DOI: arxiv-1912.12999
Laura Kinkead, Ahmed Allam, Michael Krauthammer

Patients increasingly turn to search engines and online content before, or in place of, talking with a health professional. Low-quality health information, which is common on the internet, puts patients at risk of misinformation and of a poorer relationship with their physician. To address this, the DISCERN criteria (developed at the University of Oxford) are used to evaluate the quality of online health information. However, patients are unlikely to take the time to apply these criteria to the health websites they visit. We built an automated implementation of the DISCERN instrument (Brief version) using machine learning models. We compared the performance of a traditional model (Random Forest) with that of a hierarchical encoder attention-based neural network (HEA) model using two language embeddings, BERT and BioBERT. The HEA BERT and BioBERT models achieved average F1-macro scores across all criteria of 0.75 and 0.74, respectively, outperforming the Random Forest model (average F1-macro = 0.69). Overall, the neural network-based models achieved 81% and 86% average accuracy at 100% and 80% coverage, respectively, compared to 94% manual rating accuracy. The attention mechanism implemented in the HEA architectures not only provided 'model explainability' by identifying reasonable supporting sentences for the documents fulfilling the Brief DISCERN criteria, but also boosted F1 performance by 0.05 compared to the same architecture without an attention mechanism. Our research suggests that it is feasible to automate online health information quality assessment, which is an important step towards empowering patients to become informed partners in the healthcare process.
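
The two metrics reported above, F1-macro and accuracy at a given coverage, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function names, the toy labels, and the confidence scores are all hypothetical, and the sketch assumes that "coverage" means keeping only the most confident fraction of predictions, which the abstract does not spell out.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy_at_coverage(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    confidence: Sequence[float],
    coverage: float,
) -> float:
    """Accuracy on the most confident `coverage` fraction of predictions.

    Assumption: low-confidence predictions are the ones left unrated.
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    n_keep = max(1, round(coverage * len(y_true)))
    # Indices sorted from most to least confident.
    order = sorted(range(len(y_true)), key=lambda i: confidence[i], reverse=True)
    kept = order[:n_keep]
    return sum(y_true[i] == y_pred[i] for i in kept) / n_keep


# Hypothetical toy data: 1 = document fulfils a criterion, 0 = it does not.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 1]
confidence = [0.9, 0.8, 0.55, 0.95, 0.6]

print(macro_f1(y_true, y_pred))
print(accuracy_at_coverage(y_true, y_pred, confidence, 1.0))
print(accuracy_at_coverage(y_true, y_pred, confidence, 0.8))
```

On this toy data, dropping the least confident prediction raises accuracy because that prediction happens to be wrong. This is the kind of trade-off between coverage and accuracy that the 81% and 86% figures describe.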

Updated: 2020-05-27