Adverse drug reaction detection on social media with deep linguistic features.
Journal of Biomedical Informatics (IF 4.0), Pub Date: 2020-04-29, DOI: 10.1016/j.jbi.2020.103437
Ying Zhang, Shaoze Cui, Huiying Gao

Adverse reactions caused by drugs are one of the most important public health problems. Social media has encouraged more patients to share their drug use experiences and has become a major source for the detection of professionally unreported adverse drug reactions (ADRs). Since a large number of user posts do not mention any ADR, accurate detection of the presence of ADRs in each user post is necessary before further research can be conducted. Previous feature-based methods focus on extracting more shallow linguistic features that are unable to capture deep and subtle information in the context, ultimately failing to provide satisfactory accuracy. To overcome the limitations of previous studies, this paper proposes a novel method that can extract deep linguistic features and then combine them with shallow linguistic features for ADR detection. We first extract predicate-ADR pairs under the guidance of extended syntactic dependencies and ADR lexicon. Then, we extract semantic and part-of-speech (POS) features for each pair and pool the features of different pairs to generate a holistic representation of deep linguistic features. Finally, we use the collection of deep features and several shallow features to train the predictive models. A series of experiments are performed on data sets collected from DailyStrength and Twitter. Our approach can achieve AUCs of 94.44% and 88.97% on the two data sets, respectively, outperforming other state-of-the-art methods. The results demonstrate the potential benefits of deep linguistic features for ADR detection on social data. This method can be applied to multiple other healthcare and text analysis tasks and can be used to support pharmacovigilance research.
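
The pipeline described in the abstract (pair extraction, per-pair features, pooling) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy lexicon, the hand-written dependency parse, the two-dimensional embeddings, the choice to walk up the dependency chain to the nearest verb, and the use of max pooling. The abstract does not specify the extended dependency rules, the semantic features, or the pooling operator, so this is not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy inputs; none of these values come from the paper.
ADR_LEXICON: Set[str] = {"headache", "nausea"}
POS_TAGS: List[str] = ["VB", "VBD", "VBZ"]
EMBEDDINGS: Dict[str, List[float]] = {
    "gave": [0.1, 0.9],
    "headache": [0.8, 0.2],
    "nausea": [0.3, 0.7],
}

# A parsed token: (word, POS tag, index of its head; -1 marks the root).
Token = Tuple[str, str, int]

# Hand-written parse of "This drug gave me a terrible headache and nausea".
TOKENS: List[Token] = [
    ("This", "DT", 1), ("drug", "NN", 2), ("gave", "VBD", -1),
    ("me", "PRP", 2), ("a", "DT", 6), ("terrible", "JJ", 6),
    ("headache", "NN", 2), ("and", "CC", 6), ("nausea", "NN", 6),
]


def extract_pairs(tokens: List[Token], lexicon: Set[str]) -> List[Tuple[int, int]]:
    """Return (predicate_index, adr_index) for each lexicon term governed by a verb."""
    pairs = []
    for i, (word, _pos, head) in enumerate(tokens):
        if word.lower() not in lexicon:
            continue
        # Assumed stand-in for "extended" dependencies: climb the head chain
        # (for example through a conjunction) until the nearest verb is found.
        seen = set()
        while head != -1 and head not in seen:
            seen.add(head)
            if tokens[head][1].startswith("VB"):
                pairs.append((head, i))
                break
            head = tokens[head][2]
    return pairs


def pair_features(tokens: List[Token], pair: Tuple[int, int]) -> List[float]:
    """Concatenate toy semantic vectors of both words with a one-hot predicate POS."""
    pred, adr = pair
    zero = [0.0, 0.0]
    semantic = (EMBEDDINGS.get(tokens[pred][0].lower(), zero)
                + EMBEDDINGS.get(tokens[adr][0].lower(), zero))
    pos = [1.0 if tokens[pred][1] == tag else 0.0 for tag in POS_TAGS]
    return semantic + pos


def max_pool(vectors: List[List[float]], size: int) -> List[float]:
    """Element-wise maximum over all pair vectors; zeros if the post has no pair."""
    if not vectors:
        return [0.0] * size
    return [max(column) for column in zip(*vectors)]


pairs = extract_pairs(TOKENS, ADR_LEXICON)
deep_features = max_pool([pair_features(TOKENS, p) for p in pairs], size=7)
print(pairs, deep_features)
```

Pooling gives every post a fixed-length vector regardless of how many predicate-ADR pairs it contains, which is what allows the deep features to be concatenated with shallow features and passed to a standard classifier.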

Updated: 2020-04-29