Comparison of rule-based and neural network models for negation detection in radiology reports

D. Sykes; A. Grivas; C. Grover; R. Tobin; C. Sudlow; W. Whiteley; A. Mcintosh; H. Whalley; B. Alex

doi:10.1017/S1351324920000509

Published online by Cambridge University Press: 18 November 2020

D. Sykes*: Division of Psychiatry, Centre for Clinical Brain Sciences
A. Grivas: Institute for Language, Cognition and Computation, School of Informatics
C. Grover: Institute for Language, Cognition and Computation, School of Informatics
R. Tobin: Institute for Language, Cognition and Computation, School of Informatics
C. Sudlow: Usher Institute of Population Health Sciences and Informatics
W. Whiteley: Centre for Clinical Brain Sciences, Edinburgh Medical School
A. Mcintosh: Division of Psychiatry, Centre for Clinical Brain Sciences
H. Whalley: Division of Psychiatry, Centre for Clinical Brain Sciences
B. Alex: Institute for Language, Cognition and Computation, School of Informatics; Edinburgh Futures Institute, School of Literatures, Languages and Cultures, University of Edinburgh, Edinburgh, UK
*Corresponding author. E-mail: dominic.sykes@ed.ac.uk

Abstract

Using natural language processing, it is possible to extract structured information from raw text in the electronic health record (EHR) at reasonably high accuracy. However, accurately distinguishing negated from non-negated mentions of clinical terms remains a challenge. EHR text includes cases where diseases are stated to be absent or are only hypothesised, so a disease can be mentioned in a report without being reported as present. This makes tasks such as document classification and summarisation more difficult. We have developed the rule-based EdIE-R-Neg, part of an existing text mining pipeline for brain imaging reports called EdIE-R (Edinburgh Information Extraction for Radiology reports; https://www.ltg.ed.ac.uk/software/edie-r/), and two machine learning approaches: one using a bidirectional long short-term memory network and another using a feedforward neural network. These were developed on data from the Edinburgh Stroke Study (ESS) and tested on data from routine reports from NHS Tayside (Tayside). Both datasets consist of written reports from medical scans. These models are compared with two existing rule-based models: pyConText (Harkema et al. 2009. Journal of Biomedical Informatics 42(5), 839–851), a Python implementation of a generalisation of NegEx, and NegBio (Peng et al. 2017. NegBio: A high-performance tool for negation and uncertainty detection in radiology reports. arXiv e-prints, p. arXiv:1712.05898), which identifies negation scopes through patterns applied to a syntactic representation of the sentence. On both the ESS test set, drawn from the dataset on which our models were developed, and the largely similar Tayside test set, the neural network models and our custom-built rule-based system outperformed the existing methods. EdIE-R-Neg achieved the highest F1 score, particularly on the Tayside test set, a dataset from which no development data were used in these experiments, showing the power of custom-built rule-based systems for negation detection on datasets of this size. The performance gap between the machine learning models and EdIE-R-Neg on the Tayside test set was reduced by adding Tayside development data to the ESS training set, demonstrating the adaptability of the neural network models.
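
To make the task concrete, the following is a minimal sketch of trigger-based negation detection in the general spirit of NegEx, together with the F1 score used for evaluation. It is not the implementation of EdIE-R-Neg, pyConText, or NegBio: the trigger list, the scope terminators, the five-token window, and the function names `negated_terms` and `f1_score` are all assumptions made purely for illustration.

```python
import re

# Hypothetical lexicons for illustration only; not taken from NegEx,
# pyConText, NegBio, or EdIE-R-Neg.
TRIGGERS = (("no",), ("not",), ("without",), ("no", "evidence", "of"))
TERMINATORS = {"but", "however", "although"}
WINDOW = 5  # assumed maximum number of tokens a trigger can negate


def negated_terms(sentence, terms):
    """Return the members of `terms` that follow a negation trigger.

    A term counts as negated if it appears within WINDOW tokens after a
    trigger and before any scope terminator.
    """
    tokens = re.findall(r"[a-z]+", sentence.lower())
    negated = set()
    for i in range(len(tokens)):
        for trigger in TRIGGERS:
            n = len(trigger)
            if tuple(tokens[i:i + n]) != trigger:
                continue
            # Scan forward from the end of the trigger.
            for tok in tokens[i + n:i + n + WINDOW]:
                if tok in TERMINATORS:
                    break  # the negation scope ends here
                if tok in terms:
                    negated.add(tok)
    return negated


def f1_score(gold, predicted):
    """Harmonic mean of precision and recall over two sets of labels."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


report = "No evidence of haemorrhage but there is an old infarct."
print(negated_terms(report, {"haemorrhage", "infarct"}))
```

In this sketch the terminator "but" closes the negation scope, so only "haemorrhage" is marked as negated while "infarct" is left as an affirmed finding. Real systems use far larger lexicons and, in the case of NegBio, syntactic patterns rather than a fixed token window.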

Keywords

Machine learning; Natural language processing for biomedical texts; Corpus annotation; Information extraction; Text data mining

Type: Article
Information: Natural Language Engineering, Volume 27, Issue 2, March 2021, pp. 203–224

DOI: https://doi.org/10.1017/S1351324920000509
Copyright: © The Author(s), 2020. Published by Cambridge University Press

References

Alex, B., Grover, C., Tobin, R., Sudlow, C., Mair, G. and Whiteley, W. (2019). Text mining brain imaging reports. Journal of Biomedical Semantics 10, 23.

Alsentzer, E., Murphy, J., Boag, W., Weng, W.-H., Jindi, D., Naumann, T. and McDermott, M. (2019). Publicly available clinical BERT embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, Minneapolis, Minnesota, USA. Association for Computational Linguistics, pp. 72–78.

Ba, J.L., Kiros, J.R. and Hinton, G.E. (2016). Layer normalization. arXiv e-prints, p. arXiv:1607.06450.

Bengio, Y., Simard, P. and Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks 5(2), 157–166.

Chapman, W., Bridewell, W., Hanbury, P., Cooper, G.F. and Buchanan, B. (2001). A simple algorithm for identifying negated findings and diseases in discharge summaries. Journal of Biomedical Informatics 34, 301–310.

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20(1), 37–46.

Cornegruta, S., Bakewell, R., Withey, S. and Montana, G. (2016). Modelling radiological language with bidirectional long short-term memory networks. CoRR, abs/1609.08409.

Cruz, N.P., Taboada, M. and Mitkov, R. (2017). A machine-learning approach to negation and speculation detection for sentiment analysis. Journal of the Association for Information Science and Technology 67(9), 2118–2136.

Fancellu, F., Lopez, A. and Webber, B. (2016). Neural networks for negation scope detection. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany. Association for Computational Linguistics, pp. 495–504.

Gorinski, P.J., Wu, H., Grover, C., Tobin, R., Talbot, C., Whalley, H., Sudlow, C., Whiteley, W. and Alex, B. (2019). Named entity recognition for electronic health records: A comparison of rule-based and machine learning approaches. arXiv e-prints, p. arXiv:1903.03985.

Goryachev, S., Sordo, M., Zeng, Q.T. and Ngo, L. (2006). Implementation and Evaluation of Four Different Methods of Negation Detection. Boston, MA: DSG.

Graves, A. and Schmidhuber, J. (2005). Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 18(5–6), 602–610.

Grivas, A., Alex, B., Grover, C., Tobin, R. and Whiteley, W. (2020). Not a cute stroke: Analysis of Rule- and Neural Network-Based Information Extraction Systems for Brain Radiology Reports. In Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis (LOUHI 2020) at EMNLP 2020.

Grover, C. and Tobin, R. (2006). Rule-based chunking and reusability. In Proceedings of LREC 2006, pp. 873–878.

Harkema, H., Dowling, J.N., Thornblade, T. and Chapman, W.W. (2009). ConText: An algorithm for determining negation, experiencer, and temporal status from clinical reports. Journal of Biomedical Informatics 42(5), 839–851. Biomedical Natural Language Processing.

Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Computation 9(8), 1735–1780.

Horng, S., Sontag, D.A., Halpern, Y., Jernite, Y., Shapiro, N.I. and Nathanson, L.A. (2017). Creating an automated trigger for sepsis clinical decision support at emergency department triage using machine learning. PLOS ONE 12(4), 1–16.

Hripcsak, G. and Rothschild, A.S. (2005). Agreement, the f-measure, and reliability in information retrieval. Journal of the American Medical Informatics Association 12(3), 296–298.

Huang, Y. and Lowe, H. (2007). A novel hybrid approach to automated negation detection in clinical radiology reports. Journal of the American Medical Informatics Association: JAMIA 14, 304–311.

Jackson, C., Crossland, L., Dennis, M., Wardlaw, J. and Sudlow, C. (2008). Assessing the impact of the requirement for explicit consent in a hospital-based stroke study. QJM: Monthly Journal of the Association of Physicians 101(4), 281–289.

Kingma, D.P. and Ba, J. (2015). Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May, 2015, Conference Track Proceedings.

Maldonado, R., Goodwin, T. and Harabagiu, S.M. (2017). Active deep learning-based annotation of electroencephalography reports for cohort identification. In CRI, vol. 2017, pp. 229–238.

Manning, C.D., Surdeanu, M., Bauer, J., Finkel, J., Bethard, S.J. and McClosky, D. (2014). The Stanford CoreNLP natural language processing toolkit. In Association for Computational Linguistics (ACL) System Demonstrations, pp. 55–60.

Mehrabi, S., Krishnan, A., Sohn, S., Roch, A.M., Schmidt, H., Kesterson, J., Beesley, C., Dexter, P., Schmidt, C.M., Liu, H. and Palakal, M. (2015). DEEPEN: A negation detection system for clinical text incorporating dependency relation into NegEx. Journal of Biomedical Informatics 54, 213–219.

Mou, L., Meng, Z., Yan, R., Li, G., Xu, Y., Zhang, L. and Jin, Z. (2016). How transferable are neural networks in NLP applications? arXiv e-prints, p. arXiv:1603.06111.

Mutalik, P., Deshpande, A.M. and Nadkarni, P.M. (2001). Research paper: Use of general-purpose negation detection to augment concept indexing of medical documents: A quantitative study using the UMLS. Journal of the American Medical Informatics Association: JAMIA 8(6), 598–609.

Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L. and Lerer, A. (2017). Automatic differentiation in PyTorch. In NIPS-W.

Peng, Y., Wang, X., Lu, L., Bagheri, M., Summers, R. and Lu, Z. (2017). NegBio: A high-performance tool for negation and uncertainty detection in radiology reports. arXiv e-prints, p. arXiv:1712.05898.

Peng, Y., Yan, K., Sandfort, V., Summers, R.M. and Lu, Z. (2019). A self-attention based deep learning method for lesion attribute detection from CT reports. arXiv preprint arXiv:1904.13018.

Polikar, R. (2006). Ensemble based systems in decision making. IEEE Circuits and Systems Magazine 6(3), 21–45.

Pons, E., Braun, L.M.M., Hunink, M.G.M. and Kors, J.A. (2016). Natural language processing in radiology: A systematic review. Radiology 279(2), 329–343.

Pratt, L.Y., Mostow, J. and Kamm, C.A. (1991). Direct transfer of learned information among neural networks. In Proceedings of the Ninth National Conference on Artificial Intelligence - Volume 2, AAAI91. AAAI Press, pp. 584–589.

Stenetorp, P., Pyysalo, S., Topić, G., Ohta, T., Ananiadou, S. and Tsujii, J. (2012). brat: A web-based tool for NLP-assisted text annotation. In Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France. Association for Computational Linguistics, pp. 102–107.

Taylor, S. and Harabagiu, S. (2018). The role of a deep-learning method for negation detection in patient cohort identification from electroencephalography reports. Proceedings of the AMIA Annual Symposium 2018, 1018–1027.

Tjong Kim Sang, E.F. (2002). Introduction to the CoNLL-2002 shared task: Language-independent named entity recognition. In Proceedings of the 6th Conference on Natural Language Learning - Volume 20, COLING-02, Stroudsburg, PA, USA. Association for Computational Linguistics, pp. 1–4.

Uzuner, Ö., South, B.R., Shen, S. and DuVall, S.L. (2011). 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. Journal of the American Medical Informatics Association 18(5), 552–556.

Wang, X., Peng, Y., Lu, L., Lu, Z., Bagheri, M. and Summers, R.M. (2017). ChestX-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. CoRR, .

Wu, S., Miller, T., Masanz, J., Coarr, M., Halgrim, S., Carrell, D. and Clark, C. (2014). Negation’s not solved: Generalizability versus optimizability in clinical natural language processing. PLOS ONE 9(11), 1–11.
