Information extraction for prognostic stage prediction from breast cancer medical records using NLP and ML
Medical & Biological Engineering & Computing (IF 3.2), Pub Date: 2021-07-23, DOI: 10.1007/s11517-021-02399-7
Pratiksha R Deshmukh 1, 2 , Rashmi Phalnikar 1

In cancer prognosis, the prognostic stage is the main factor that helps medical experts decide the optimal treatment for a patient. Specialists read prognostic stage information from medical reports, which are often unstructured and therefore take a long time to review. The main objective of this study is to propose a generic clinical decision-unifying staging method that extracts the most reliable prognostic stage information for breast cancer from the medical records of various health institutions. Additional prognostic factors should be extracted from medical reports to identify the cancer stage, giving a more exact measure of the cancer and improving care quality. The study collected 465 pathological and clinical reports of breast cancer patients from reputed medical institutions in India. The unstructured records differed from institution to institution. Anatomic and biologic factors were extracted from the medical records using natural language processing, machine learning, and rule-based methods for prognostic stage detection. The study extracted anatomic stage, grade, estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) from medical reports with high accuracy and predicted the prognostic stage for both rural and urban regions. The average accuracy of prognostic stage prediction was 92% in rural areas and 82% in urban areas. Combining biological and anatomical elements under a single prognostic staging method was essential. The high accuracy of this generic clinical decision-unifying staging method across institutions in different regions suggests that the proposed approach can improve breast cancer prognosis.
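The rule-based part of such a pipeline can be illustrated with a minimal Python sketch. It is not the authors' implementation: the patterns, the `extract_factors` helper, and the sample report text are all hypothetical, and real reports from different institutions would need far broader patterns plus the NLP and ML components the abstract mentions. It only shows how grade and ER/PR/HER2 status might be pulled out of free text with regular expressions.

```python
import re

# Hypothetical patterns for illustration only; wording in real reports varies widely.
RECEPTOR_RE = re.compile(
    r"\b(ER|PR|HER2)\b\s*(?:status)?\s*[:=\-]?\s*(positive|negative)",
    re.IGNORECASE,
)
GRADE_RE = re.compile(r"\bgrade\s*[:=\-]?\s*(1|2|3|I{1,3})\b", re.IGNORECASE)
ROMAN = {"I": 1, "II": 2, "III": 3}


def extract_factors(report: str) -> dict:
    """Return the biologic factors found in a free-text report (assumed format)."""
    factors = {}
    # Each match yields (receptor name, status); normalise the casing of both.
    for name, status in RECEPTOR_RE.findall(report):
        factors[name.upper()] = status.lower()
    match = GRADE_RE.search(report)
    if match:
        token = match.group(1).upper()
        # Accept either Roman (I-III) or Arabic (1-3) grade notation.
        factors["grade"] = ROMAN.get(token) or int(token)
    return factors


if __name__ == "__main__":
    sample = "Histologic grade: II. ER: Positive, PR negative; HER2 status - Negative."
    print(extract_factors(sample))
```

Mapping the extracted factors to a prognostic stage would additionally require the anatomic stage and a staging lookup table, which this sketch deliberately leaves out.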

Updated: 2021-07-23