Evaluating healthcare quality using natural language processing

From Clinfowiki
Revision as of 18:15, 17 November 2015 by TheoBiblio (Talk | contribs)

A Review of Baldwin KB. Evaluating healthcare quality using natural language processing. J Healthcare Qual 2008; 30: 24-29.

Introduction

One of the most pervasive trends in modern healthcare is the focus on quality assessment, yet the methodology for routine quality assessment is still in its infancy. A large amount of healthcare data is available only in narrative formats (histories and physicals, progress notes, discharge summaries, physician orders), and evaluating such data for adverse events or for failure to use evidence-based diagnostic or therapeutic measures is time-consuming and labor-intensive. Natural language processing is a possible technique for extracting relevant structured data from narrative data.

Methods

The study sample consisted of the EHRs of 60 women, 40 years of age or older, who visited a large academic medical center in 2001. These EHRs were studied over a period of 2.5 years. The following variables were extracted using both manual coding and natural language processing: demographic variables (age, race, gender), risk factors for breast cancer (positive genetic screening, positive family history, personal history of cancer, breast complaints such as pain, masses, or discharge), provider variables (screening and test results or follow-up), and mortality. Natural language processing was performed with the NUD*IST software package. Natural language processing data extraction was compared to manual extraction on the basis of efficiency (time required), false-positive rate, precision (the ratio of the number of documents for which the desired information was retrieved to the total number of documents for which any information was retrieved), false-negative rate, and recall (the ratio of the number of documents retrieved with the desired information to the total number of documents in the database with the desired information).
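The precision and recall definitions above can be illustrated with a minimal Python sketch. The function name `retrieval_metrics` and the document IDs are hypothetical and chosen only for illustration; they are not taken from the study, and the manual extraction is assumed to serve as the reference standard.

```python
def retrieval_metrics(retrieved, relevant):
    """Compare documents flagged by an extraction method with a reference set.

    retrieved: set of document IDs for which the method returned any information
    relevant:  set of document IDs that truly contain the desired information
               (here assumed to come from manual extraction)
    """
    true_positives = retrieved & relevant
    false_positives = retrieved - relevant   # flagged, but not truly relevant
    false_negatives = relevant - retrieved   # truly relevant, but missed

    # Guard against division by zero when a set is empty.
    precision = len(true_positives) / len(retrieved) if retrieved else 0.0
    recall = len(true_positives) / len(relevant) if relevant else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "false_positives": len(false_positives),
        "false_negatives": len(false_negatives),
    }


# Hypothetical example: the method flags 4 documents, 6 are truly relevant.
metrics = retrieval_metrics(retrieved={1, 2, 3, 4}, relevant={2, 3, 5, 6, 7, 8})
print(metrics)
```

In this made-up example, two of the four flagged documents are correct (precision 0.5), while only two of the six relevant documents are found (recall of about 0.33), which is the same high-precision, low-recall pattern the review goes on to describe.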

Results

Natural language processing was more efficient than manual data extraction and the difference was statistically significant. Both techniques retrieved race and gender with 100% frequency. Risk factors in the study population were too rare to merit comparison. There was not a statistically significant difference between the two techniques in the extraction of provider orders for follow-up. There was a statistically significant false-negative rate (21/60) for the detection of screening measures by natural language processing. The overall recall rate of natural language processing was only 0.293 while the precision was 0.709.
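To put the reported figures in perspective, the short sketch below converts the 21/60 false-negative count into a proportion and combines the reported precision and recall into an F1 score. The F1 score (the harmonic mean of precision and recall) is not reported in the review; it is included here only as a commonly used summary and should be read as an assumption of this sketch.

```python
# Figures quoted in the review.
false_negatives = 21
sample_size = 60
precision = 0.709
recall = 0.293

# Proportion of the 60 records in which screening measures were missed.
false_negative_proportion = false_negatives / sample_size

# Harmonic mean of precision and recall (not reported in the source).
f1 = 2 * precision * recall / (precision + recall)

print(f"False-negative proportion: {false_negative_proportion:.2f}")
print(f"F1 score: {f1:.3f}")
```

The harmonic mean is dominated by the lower of the two values, so the weak recall pulls the combined score well below the precision figure.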

Discussion

Natural language processing, as implemented with NUD*IST, was less time-consuming than manual data extraction and performed well in retrieving variables that were consistently defined, such as gender, race, and test results. Other variables were not captured as well. The challenging variables, as would be expected, were those characterized by greater variability in provider terminology. This led to an overall poor recall rate. The poor recall rate may reflect the specific software package, NUD*IST, which was developed for qualitative research, more than the concept itself. Improvements in the EHR may improve the effectiveness of natural language processing.
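The effect of terminology variability on recall can be sketched with a simple keyword search. This is not how NUD*IST works internally; it is a minimal illustration under stated assumptions, and the notes, the term lists, and the helper names `find_notes` and `recall` are all hypothetical.

```python
import re

# Hypothetical notes; three of the four document a screening mammogram
# using different wording.
NOTES = {
    1: "Screening mammogram ordered; patient to return in 1 year.",
    2: "Bilateral mammography performed, results normal.",
    3: "MMG done last spring, no abnormalities.",
    4: "Patient reports knee pain; no breast complaints.",
}
SCREENED = {1, 2, 3}  # reference standard, as if coded manually


def find_notes(notes, terms):
    """Return IDs of notes containing any of the terms as whole words."""
    pattern = re.compile(
        "|".join(r"\b" + re.escape(term) + r"\b" for term in terms),
        re.IGNORECASE,
    )
    return {doc_id for doc_id, text in notes.items() if pattern.search(text)}


def recall(retrieved, relevant):
    """Fraction of relevant notes that were retrieved."""
    return len(retrieved & relevant) / len(relevant)


narrow = find_notes(NOTES, ["mammogram"])
broad = find_notes(NOTES, ["mammogram", "mammography", "MMG"])

print("single term:", sorted(narrow), "recall =", round(recall(narrow, SCREENED), 2))
print("with synonyms:", sorted(broad), "recall =", round(recall(broad, SCREENED), 2))
```

With a single search term only one of the three relevant notes is found; adding the synonyms recovers all three without flagging the unrelated note. Real clinical text is far less tidy, but the sketch shows why variables with variable provider wording are the ones most likely to be missed.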

Comments

Natural language processing (NLP) is a promising tool for reconciling the clinician’s propensity for narrative data with the advantages of structured data in medical informatics. In this particular study and with this particular software package, however, natural language processing was relatively efficient and precision was excellent, but recall of some variables was disappointing. The challenge is the variability, context-sensitivity, and inconsistency of medical terminology. (For those who are interested, NUD*IST, or Non-numerical Unstructured Data Indexing Searching and Theorizing, is a qualitative data research tool developed by QSR International, which is, according to Wikipedia, the world’s largest qualitative research software developer.)

James Bailey

Related Articles

Use of a support vector machine for categorizing free-text notes

Accessing primary care Big Data: the development of a software algorithm to explore the rich content of consultation records