Determining the Efficiency of Drugs Under Special Conditions From Users’ Reviews on Healthcare Web Forums
IEEE Access (IF 3.9), Pub Date: 2021-06-14, DOI: 10.1109/access.2021.3088838
Eysha Saad, Sadia Din, Ramish Jamil, Furqan Rustam, Arif Mehmood, Imran Ashraf, Gyu Sang Choi

Sentiment analysis is the extraction and categorization of sentiments expressed in text data using text analysis techniques. As earlier studies have shown, sentiment analysis of drug reviews has great potential to provide insights that help healthcare professionals and companies evaluate the safety of drugs after they have been marketed. Such insights help safeguard patients and increase their trust in medical companies. Existing systems follow either a lexicon-based or a learning-based approach to sentiment analysis in the medical domain. Learning-based techniques require annotated data, while lexicon-based techniques tend to be domain-specific, which restricts their wider use. This research proposes a hybrid technique that combines learning-based and lexicon-based approaches to achieve better results. General-purpose sentiment lexicons, such as AFINN, TextBlob, and VADER, are used to annotate the reviews. Several feature engineering techniques, namely term frequency (TF), term frequency-inverse document frequency (TF-IDF), and the union of TF and TF-IDF (TF U TF-IDF), are then applied to extract useful features. Finally, learning models including logistic regression (LR), AdaBoost classifier (AB), random forest (RF), extra trees classifier (ETC), and multilayer perceptron (MLP) are used to classify the sentiments of the reviews. The performance of the proposed hybrid approach is evaluated using accuracy, precision, recall, and F1-score. Experimental results indicate that combining learning-based and lexicon-based approaches yields better results than using either approach alone. Moreover, TextBlob-based annotation shows promising results, reaching an accuracy of 96% both with MLP on TF-IDF features and with LR on TF U TF-IDF features.
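
To make the first two stages of the pipeline concrete, the following is a minimal sketch of lexicon-based annotation followed by TF, TF-IDF, and combined feature extraction. It is not the authors' implementation. The toy lexicon, the sign-of-sum labelling rule, the smoothed IDF formula, and the reading of "TF U TF-IDF" as a per-document concatenation of the two vectors are all assumptions made for illustration; the abstract does not specify these details. All function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy lexicon; a real run would take scores from AFINN, TextBlob, or VADER.
TOY_LEXICON = {"great": 2, "relief": 2, "helped": 1,
               "nausea": -2, "worse": -2, "pain": -1}


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    return [w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")]


def lexicon_label(text):
    """Annotate a review by the sign of its summed word polarities (assumed rule)."""
    score = sum(TOY_LEXICON.get(tok, 0) for tok in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def tf_vectors(docs, vocab):
    """Raw term counts per document, ordered by vocab."""
    rows = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        rows.append([float(counts[t]) for t in vocab])
    return rows


def tfidf_vectors(docs, vocab):
    """TF times a smoothed IDF, ln((1 + n) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    token_sets = [set(tokenize(d)) for d in docs]
    idf = [math.log((1 + n) / (1 + sum(t in s for s in token_sets))) + 1
           for t in vocab]
    return [[tf * w for tf, w in zip(row, idf)] for row in tf_vectors(docs, vocab)]


def union_features(docs, vocab):
    """Assumed reading of 'TF U TF-IDF': concatenate both vectors per document."""
    return [a + b for a, b in zip(tf_vectors(docs, vocab), tfidf_vectors(docs, vocab))]


# Made-up example reviews, not taken from the paper's dataset.
reviews = [
    "This drug helped, great relief.",
    "Nausea got worse and the pain stayed.",
    "Took it for two weeks.",
]
vocab = sorted({tok for r in reviews for tok in tokenize(r)})
labels = [lexicon_label(r) for r in reviews]      # lexicon-derived training labels
features = union_features(reviews, vocab)         # inputs for a downstream classifier
```

The resulting `labels` and `features` would then be passed to a classifier such as LR or MLP. That final learning step is what makes the approach hybrid: the lexicon supplies the annotations that a purely learning-based method would otherwise need from human annotators.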

Updated: 2021-06-22