Identification of patients with carotid stenosis using natural language processing.
European Radiology ( IF 5.9 ) Pub Date : 2020-02-26 , DOI: 10.1007/s00330-020-06721-z
Xiao Wu 1 , Yuzhe Zhao 2 , Dragomir Radev 3 , Ajay Malhotra 4

PURPOSE The highly structured nature of medical reports makes them well suited to automated large-scale patient identification. This study aimed to develop a natural language processing (NLP) model to retrospectively retrieve patients with presence and history of carotid stenosis (CS) using their ultrasound reports.

METHODS Ultrasound reports from our institution between January 2016 and December 2017 were selected. To process the texts, we developed a parser to divide the raw text into fields. For the baseline method, we used bag-of-n-grams and term frequency-inverse document frequency (TF-IDF) as the features and used linear classifiers. Logistic regression served as the baseline model. Convolutional and recurrent neural networks (CNN; RNN) with an attention mechanism were applied to the dataset to improve classification accuracy.

RESULTS We had 1220 ultrasound reports for training and 307 for testing, totaling 1527 reports. For predicting history of CS, both the CNN and RNN-attention models had a significantly higher specificity than logistic regression. In addition, RNN-attention had a significantly higher F1 score and accuracy. For predicting presence of CS, all models achieved above 93% accuracy. RNN-attention achieved 95.4% accuracy, although the difference from logistic regression was not statistically significant. RNN-attention had a statistically significantly higher specificity than logistic regression.

CONCLUSIONS We developed linear, CNN, and RNN models to predict history and presence of CS from ultrasound reports. We have demonstrated NLP to be an efficient, accurate approach for large-scale retrospective patient identification, with applications in long-term follow-up of patients and clinical research studies.

KEY POINTS
• Natural language processing models using both linear classifiers and neural networks can achieve good performance, with an overall accuracy above 90% in predicting history and presence of carotid stenosis.
• Convolutional and recurrent neural networks, especially with additional features including field awareness and an attention mechanism, outperform traditional linear classifiers.
• NLP is shown to be an efficient approach for large-scale retrospective patient identification, with applications in long-term follow-up of patients and further clinical research studies.
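
The baseline features named in METHODS (bag-of-n-grams weighted by TF-IDF) can be illustrated with a minimal sketch. This is not the authors' code. The whitespace tokenizer, the use of unigrams plus bigrams, the smoothed IDF variant, the function names `ngrams` and `tfidf_features`, and the two toy report snippets are all assumptions made for illustration. The resulting sparse feature dictionaries are the kind of input a linear classifier such as logistic regression would consume.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max (lowercased, whitespace-split)."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf_features(docs, n_max=2):
    """Map each document to a sparse dict {n-gram: TF-IDF weight}."""
    # Term frequency: raw n-gram counts per document (assumed variant).
    counts = [Counter(ngrams(doc, n_max)) for doc in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each n-gram.
    df = Counter(term for c in counts for term in c)
    # Smoothed IDF (assumed variant): ln((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + k)) + 1.0 for t, k in df.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


# Hypothetical toy snippets, invented for illustration (not from the study data).
reports = [
    "no significant stenosis",
    "severe stenosis of the right internal carotid artery",
]
features = tfidf_features(reports)
```

Under this weighting, an n-gram that appears in every report (here "stenosis") receives the minimum IDF of 1, while n-grams confined to a single report (such as "no significant") are weighted more heavily, which is what lets a linear model separate the classes.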

Updated: 2020-02-26