Pilot trial of semi-automated medical note writing using lexeme hypotheses

https://doi.org/10.1016/j.ijmedinf.2020.104095

Highlights

  • We tested a computer program based on the lexeme hypotheses.

  • Notes generated by the system were more complete, grammatical and organized.

  • Notes were generated at a faster rate than traditional methods.

  • The system prompted users to consider practice advisories in a timely fashion.

  • Notes were completely computer-readable.

Abstract

Clinicians write a billion free-text notes per year. These notes are typically replete with errors of all types. No established automated method can extract data from this treasure trove. The practice of medicine therefore remains haphazard and chaotic, resulting in vast economic waste.

The lexeme hypotheses are based on our analysis of how records are created. They enable a computer system to predict what issue a clinician will need to address next, based on the environment in which the clinician is working and on the responses the clinician has selected so far. The system uses a lexicon storing the issues (queries) and a range of responses to the issues. When the clinician selects a response, a text fragment is added to the output file.

In the first phase of this work, the notes of 69 returning hemophilia patients were scrutinized, and the lexicon was expanded to 847 lexeme queries and 7995 responses to enable the construction of completed notes.

The quality of lexeme-generated notes from 20 consecutive subjects was then compared to the clinicians’ conventional clinic notes. The system generated grammatically correct notes. In comparison to the traditional clinic notes, the lexeme-generated notes were more complete (88 % compared with 62 %) and had fewer typographical and grammatical errors (0.8 versus 3.5 errors per note). The system notes and traditional notes averaged about 800 words, but the traditional notes had a much wider distribution of lengths. The note-creation rate from marshalling the data to completion using the system averaged 80 words per minute, twice as fast as the typical clinician can type.

The lexeme method generates more complete, grammatical and organized notes faster than traditional methods. The notes are completely computerized at inception, and they incorporate prompts for clinicians to address otherwise overlooked items. This pilot justifies further exploration of this methodology.

Introduction

The diagnosis and management of a patient’s condition derives from the opinion of the treating clinician. This opinion is based on the patient’s history, physical exam and other results, assembled with the aid of years of learning and experience. The clinician memorializes each significant clinical interaction with the patient as a chart note consisting of the clinician’s “own unfettered, colloquial conceptualizations of the patient’s descriptors” [1]. This note reinforces the memory of the treating physician and informs other caregivers about the case. The note is often scrutinized to determine whether the clinical event meets minimal requirements for billing. Clinicians’ notes are the most important source of evidence in both disciplinary and malpractice actions, and occasionally in clinical trials.

Clinicians in the United States prepare about 1 billion outpatient notes per year [2], and about the same number of inpatient notes. These clinical notes contain a vast trove of medical data that could (in theory) be used routinely to determine how the current practice of medicine can be improved. To this end, much effort has been expended to extract the contents of free-text notes by natural language processing. Regrettably, even the best systems are reliable only when tuned to a specific task: natural language processing is not yet generalizable or scalable [3], so extracting accurate data from free-text clinical notes by computer is currently impossible. The cost of harvesting accurate data from notes for clinical or malpractice trials runs to tens of thousands of dollars per patient [4,5]. The vast majority of clinicians’ findings, opinions and outcomes are consigned to an electronic medical record (EMR) as unanalyzable free-text documents, read (if at all) only by clinicians rendering further care to the patient.

Despite its huge systemic importance, very little academic study has been devoted to the structure or content of the free-text note. We do know that writing it often takes longer than seeing the patient [6]. Notes tend to be full of grammatical, logical and factual errors, as well as unnecessary, repetitive and contradictory statements [7,8]. Free-text notes are written by harried clinicians who want to spend more time with the patient and less time writing notes [6]. The modern EMR tends to increase note bloat by encouraging the copying of extraneous and irrelevant prior text (complete with its own errors) into a new note. Clinicians are comforted by the fact that the errors they make when writing notes rarely result in an adverse outcome for their patients; their colleagues almost unconsciously overlook them. Administrators rarely assess note quality, and they are not noted for rewarding excellent note writing.

There is increasing systemic urgency to determine what information is recorded in notes. Our rapidly increasing understanding of disease and the sophistication of modern therapies add urgency to the need to monitor the performance of clinical care in real time. This urgency also increases the value of accurately prompting clinicians to explore best practice advisories in a timely and non-confrontational fashion, which can only be done by a system that accurately extracts information from the note [9].

To address this issue, we analyzed how we approach the creation of professional records, leading to several observations and hypotheses. At the heart of this analysis we use the concept of a lexeme, defined as the intellectual content of the smallest unit of information. A lexeme can be expressed in any language and any style, including a grammatically correct English phrase or an assigned computer code. Lexemes can usefully be regarded as a logical combination of a lexeme query (which defines the issue being addressed) and a lexeme response (which addresses the issue). Most lexeme queries need only a handful of responses to cover the needed answers. This approach allows us to examine how we generate notes in a much more rigorous fashion than previously possible [10], and it generates three useful hypotheses that we use to predict what issue (or query) a clinician will need to address next when writing a note. These hypotheses assume that we have constructed a large library of lexemes and their associated responses in a lexicon.

The first hypothesis is based on the observation that within every note, every lexeme query has a most-appropriate location. Extending this, we can postulate that the entire lexicon (encompassing the entirety of medical practice) can be organized in a specified order, establishing the relative position of every lexeme query within it. We refer to this phenomenon as coherence. The second hypothesis is predicance, asserting that every lexeme query addressed in the note is predicted to be needed either by the context within which the note is written (i.e., the specialty of the clinician and the location of the patient) or by responses already lodged in the note. As a practical matter, responses can issue predicants, calling for future examination of an issue. The third hypothesis is that each lexeme response can indicate whether the user wants to explore a topic in more detail or wishes to move on to another topic (i.e., it sets a hierarchical level). These concepts have been presented in more detail elsewhere [11].
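
To make these relationships concrete, the sketch below models one way a lexicon entry might be stored so that coherence, predicance and hierarchical level can be evaluated by a program. It is illustrative only; the field names (coherencePosition, requiredLevel, triggers, predicants, level) are our own and are not taken from the published system.

```typescript
// Illustrative data model for a lexicon entry (a sketch, not the authors' schema).
interface LexemeResponse {
  code: string;          // machine-readable code retained for later analysis
  text: string;          // grammatical text fragment appended to the note
  predicants: string[];  // issues this response calls to be examined later
  level: number;         // hierarchical level the user sets: drill down or move on
}

interface LexemeQuery {
  id: string;
  prompt: string;            // the issue presented to the clinician
  coherencePosition: number; // fixed position of the query within the ordered lexicon
  requiredLevel: number;     // minimum level at which this query is offered
  triggers: string[];        // predicants (or context tags) that make the query applicable
  responses: LexemeQuery extends never ? never : LexemeResponse[]; // menu of likely responses
}
```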

We have constructed a computer program to test the validity of these hypotheses. The program repeatedly presents a clinician with an issue (as a question) and offers a menu of likely responses. When the user selects an appropriate response, a text fragment expressing the response is added to the output note, and the system seeks the next lexeme query by searching through the lexicon in coherence order for a lexeme query that has an adequate level and at least one predicant match. We have constructed a lexicon of queries and responses including those required to address the management of patients with hemophilia. Each selected lexeme response has a corresponding computer code which can be ported to another computer system for analysis. Essentially, this lexeme approach reverses the process of computerizing notes – rather than trying to extract content from text, the system expresses the needed information as text. Fig. 1 shows a screen shot of the user screen of the program, and a fragment of the lexicon listings.
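
A minimal sketch of that selection loop, reusing the illustrative interfaces above, is shown below. It is an approximation of the described behaviour rather than the authors' code: it assumes the lexicon array is pre-sorted by coherencePosition and that the context (clinician specialty, patient location) seeds the initial predicants.

```typescript
// Sketch of the query-selection loop described in the text (assumptions noted above).
function buildNote(
  lexicon: LexemeQuery[],
  context: string[],
  choose: (q: LexemeQuery) => LexemeResponse | undefined
): { note: string; codes: string[] } {
  const pending = new Set<string>(context); // outstanding predicants
  const answered = new Set<string>();       // queries already offered
  let level = 0;                            // current hierarchical level
  const note: string[] = [];
  const codes: string[] = [];

  while (true) {
    // Search the lexicon in coherence order for the next query that has not yet
    // been offered, has an adequate level, and has at least one predicant match.
    const query = lexicon.find(q =>
      !answered.has(q.id) &&
      q.requiredLevel <= level &&
      q.triggers.some(t => pending.has(t)));
    if (!query) break;                        // nothing left to address

    answered.add(query.id);
    const response = choose(query);           // clinician picks from the menu
    if (!response) continue;                  // query skipped by the user

    note.push(response.text);                 // text fragment added to the output note
    codes.push(response.code);                // corresponding computer code kept for analysis
    response.predicants.forEach(p => pending.add(p)); // schedule follow-up issues
    level = response.level;                   // explore in more detail or move on
  }
  return { note: note.join(' '), codes };
}
```

Keeping the selected response codes alongside the generated text is what makes the output computer-readable at inception: the note and its structured representation are produced in the same step rather than recovered afterwards.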

Hemophilia is a rare bleeding disorder that is best managed by clotting factor replacement. A network of federally funded hemophilia centers has been established to improve the medical care of patients with hemophilia. The National Hemophilia Foundation and similar organizations have issued recommendations for patient management. There is uniform agreement that patients with hemophilia should be evaluated periodically by a multidisciplinary team, including hematologists, dedicated nurses, social services, etc., who perform a comprehensive assessment of each patient [12]. The Hemophilia Center at the University of Iowa invites patients to return for comprehensive outpatient clinic visits at least once every two years. During the course of each visit, the hematologist creates a clinical note that is lodged in the institution’s EPIC EMR. The majority of this note is generally constructed by copying and updating a previous clinic note. This paper describes a pilot trial that explores the utility of using a lexeme-based method to construct a note suitable for this clinic.

Initially, we harvested notes from patients, and used them to add lexeme queries and their associated responses to the lexicon of the system. Subsequently, we used the system to construct clinic notes. Our intent was to test the validity of the lexeme theoretical approach and the resulting computer algorithm.


Methods

The computer program runs in the user’s web browser and accesses the lexicon from a database in the cloud [13]. Six physicians participated in the trial: four seeing patients in the adult hemophilia clinic and two in the pediatric clinic. The pediatricians made use of their own libraries of prewritten text fragments stored in EPIC.

To start our work, the prior clinic notes from 69 scheduled patients were de-identified and printed. These notes were examined and transliterated into lexeme notes

Phase 1, building the lexicon

The most recent notes of 69 patients scheduled to be seen in a few days as return patients in the Comprehensive Hemophilia Clinic were printed, de-identified and photocopied. The content of these de-identified photocopies was examined and transliterated using the lexeme package. When the lexicon did not contain suitable language to address an issue in the clinic note, additional lexemes were added to the lexicon. The completed lexeme note was presented to the clinic physician, who was

Discussion

In the normal course of practice, the clinician seeing a new patient acquires some information about a patient from the EMR or clinic personnel, then interviews the patient and performs a physical exam. During this evaluation, the clinician often forms an opinion about the patient’s condition and best management by almost effortless pattern recognition. In contrast, the process of writing a clinic note is never instantaneous. It often takes longer to complete than it takes to perform a history

Declaration of Competing Interest

All authors received funding from a research grant-in-aid (IIR-USA-001548) from Shire, PLC, now Takeda Pharmaceutical Co. LTD.

Donald Macfarlane holds patents and has a proprietary interest in the described software.

Human subjects

This research was approved by, and conducted under the auspices of, the University of Iowa Human Subjects Institutional Review Board.

Funding

This study was supported by a grant (IIR-USA-001548) from Shire, PLC, now Takeda Pharmaceutical Co. LTD.

Author Statement

This paper addresses an issue of great economic and medical importance: doctors write lousy notes that no computer can read. As a result, medical care is chaotic and unanalyzable. The paper describes a pilot clinical trial to evaluate a novel way of creating medical notes that are grammatically correct and completely

References (23)

  • P. Zazove et al., To act or not to act: responses to electronic health record prompts by family medicine clinicians, J. Am. Med. Inform. Assoc. (2017)