Visualizing unstructured patient data for assessing diagnostic and therapeutic history

From Clinfowiki

Access to relevant patient data is crucial for clinical decision making. The data are often documented in unstructured text and collected in the electronic health record (EHR). In this paper, the authors evaluated an approach that visualizes information extracted from clinical documents by means of tag clouds. [1]

Abstract[1]

Background

Information is stored in an electronic health record in various data types, many of which are unstructured. As a result, clinicians may find it difficult to gain a holistic impression of a patient's current medical condition when data collected over the course of many years includes a significant amount of lab work, many diagnostic exams, numerous procedures, and even hospitalizations. Review of this data can require a significant amount of time if no substantive overview of all of the available data is present.

Objective

In this work, the authors address the question of how to represent the current status of a patient that is described in the form of unstructured text, so that physicians can get an overview quickly. The authors introduce an approach that visualizes patient information extracted from the documents of the EHR in an easily understandable manner, namely by means of tag cloud visualization. Tag clouds are a common visualization method in the Web 2.0 community. Studies have shown that tag clouds enhance the perception of (Web) documents [6] and support explorative search when it is difficult to specify a concrete query. These benefits directly address the challenges of accessing and monitoring patient data documented in unstructured documents. In this context, the use of tag clouds for information visualization is still relatively unexplored.

Study Design and Method

For the study, tags are selected from a text either (1) by frequency alone (bag of words) or (2) based on their part of speech (POS). The first type of tag (bag of words) is generated from all the words of a document except stop words. All of these tags are rendered in the same color.
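
The bag-of-words selection can be illustrated with a minimal Python sketch. It is not the authors' implementation: the stop-word list, the sample sentence, and the function name bag_of_words_tags are invented for illustration, and the study itself used German reports.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real system would use a
# full list for the language of the reports.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "was", "is", "with", "to"}


def bag_of_words_tags(text, top_n=10):
    """Return the top_n (word, count) pairs after stop-word removal."""
    # Tokenize into lower-cased runs of letters (Unicode-aware).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(top_n)


# Invented example text, not taken from the study's reports.
report = (
    "Resection of the tumor in the left frontal lobe. "
    "The tumor was removed completely. Frontal lobe edema was mild."
)
tags = bag_of_words_tags(report, top_n=3)
print(tags)
```

In a tag cloud, the counts returned here would typically drive the display size of each word; the source does not describe how sizes were computed, so that step is left out.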

To generate the second type of tag, the tokens of a text are annotated with their part-of-speech labels. The lexical category of the remaining words of a document is determined after removal of stop words. The authors highlight nouns, verbs, and adjectives with three primary colors (red, yellow, and blue) because, in their intuition, these lexical categories play the decisive role in conveying meaning. Whether this intuition is correct and applicable needs to be analyzed in experiments.
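
The POS-based coloring might look roughly like the following Python sketch. Several parts are assumptions rather than details from the paper: the assignment of colors to categories (the source names the three colors but not which category gets which), the neutral gray for all other categories, and the toy lookup table TOY_LEXICON, which stands in for a trained part-of-speech tagger.

```python
# Assumed color assignment; the source does not state which lexical
# category receives which of the three colors.
POS_COLORS = {"NOUN": "red", "VERB": "yellow", "ADJ": "blue"}

# Toy lookup standing in for a trained POS tagger (hypothetical entries).
TOY_LEXICON = {
    "tumor": "NOUN", "lobe": "NOUN", "edema": "NOUN",
    "removed": "VERB", "shows": "VERB",
    "frontal": "ADJ", "mild": "ADJ",
    "completely": "ADV",
}
STOP_WORDS = {"the", "was", "in", "a", "of"}


def colored_tags(tokens):
    """Map each non-stop-word token to a display color based on its POS."""
    tags = {}
    for token in tokens:
        word = token.lower()
        if word in STOP_WORDS:
            continue
        pos = TOY_LEXICON.get(word)
        # Words outside the three highlighted categories get a neutral
        # color here; this fallback is an assumption of the sketch.
        tags[word] = POS_COLORS.get(pos, "gray")
    return tags


# Invented example sentence, not taken from the study's reports.
tokens = "The tumor in the frontal lobe was removed completely".split()
tags = colored_tags(tokens)
print(tags)
```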

To determine the effectiveness of the tag clouds, a user study was performed in the neurosurgical department of a university hospital. Three residents were asked to assess the tag clouds generated for different texts and to judge whether (1) the tag clouds are useful for getting an overview of the patient status, (2) the words shown are relevant, and (3) the visualization of relevant aspects is clear. Six medical reports in German (three surgical operation reports, two pathological reports, and one radiological report) were used to generate the tag clouds.

Before answering the questions, the physicians were instructed to imagine the following simulated task scenario:

In the outpatient department, you are facing a patient whom you have never seen before, but who has already been treated in the hospital. You have the complete patient documentation on the computer. You are expected to grasp the basic status of this patient and start the treatment in a very short period of time.

Based on this scenario, all six medical reports, each visualized by the two types of tag cloud (generated through the bag-of-words approach or the POS tagger), were judged on a rating scale from 1 (bad) to 5 (excellent).

Results

Since the medical narratives have shown significant differences in format and writing style, we decided to perform a comparison based on the different data sources, namely operation notes, pathological reports, and radiological reports. To obtain the arithmetic mean for one narrative type, the sum of the mean judgments per document (each averaged over the three evaluators) is divided by the number of documents of this narrative type.
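
The averaging described above can be read as follows; this Python sketch reflects one interpretation of the text, and the ratings in it are invented for illustration only and are not the study's results.

```python
from statistics import mean

# Invented ratings on the 1-5 scale, for illustration only; NOT study data.
# Each inner list holds the three evaluators' ratings for one document.
ratings = {
    "operation": [[4, 3, 5], [3, 3, 4], [2, 4, 3]],
    "pathology": [[5, 4, 4], [4, 4, 3]],
    "radiology": [[3, 2, 4]],
}


def narrative_type_mean(docs):
    """Sum the per-document means and divide by the number of documents."""
    per_document_means = [mean(doc) for doc in docs]
    return sum(per_document_means) / len(docs)


means = {kind: narrative_type_mean(docs) for kind, docs in ratings.items()}
for kind, value in means.items():
    print(f"{kind}: {value:.2f}")
```

Because every document here has exactly three ratings, this gives the same value as averaging all ratings of a narrative type at once; the two would differ only if documents had different numbers of evaluators.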

The pathological reports achieved the best scores for the first question (usefulness), while the operation notes reached the highest mean value for the second question, which refers to tag relevance. For question three, the visualization of relevant aspects, the tag clouds also provided a moderate representation of details to the physicians, although only entity extraction approaches were applied to generate the clouds, without considering the semantic meanings of terms.

Generally, the evaluators stated that they obtained an overview of the reports through the tag clouds, but wished to see more details on the medical condition of a patient.

For all three questions, the variance and deviation in the physicians' judgments were analyzed. The judgments differed between physicians, mainly because of their varying work experience and background knowledge.

Conclusion

In conclusion, even simple methods can support the perception of relevant aspects reported in clinical documents. Since the approach relies only on tokenization, part-of-speech tagging, and stop-word removal, texts in different languages can easily be processed with it.

Several improvements are possible and would increase user satisfaction. In the future, we will study whether tags generated using concept extraction are a better option for identifying the most important aspects of a clinical document. Symptoms, diseases, and anatomical concepts would be the most interesting information in the medical records, and concepts belonging to those categories could be used for tag cloud generation. Moreover, the semantic relations between the concepts will also be presented to the users. Through clustering of tags, potential topics will be detected and visualized.

Comments

Working in the acute care setting, I definitely see the application of the concepts associated with this article. Adequate summations of large amounts of clinically relevant yet unstructured data continue to frustrate and impede smooth provider workflow and timely clinical decision making. As new and increasing sources of healthcare information make their way into the electronic health record, it is imperative that studies such as this one find ways to teach the computer how to better fit into natural provider workflow rather than interrupting provider workflow in order to serve the EHR.

Related Articles

Optimization of drug–drug interaction alert rules in a pediatric hospital's electronic health record system using a visual analytics dashboard

Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms

Medical decision support using machine learning for early detection of late onset neonatal sepsis


References

  1. Deng Y, Denecke K. Visualizing unstructured patient data for assessing diagnostic and therapeutic history. Stud Health Technol Inform. 2014;205:1158-62. http://www-ncbi-nlm-nih-gov.ezproxyhost.library.tmc.edu/pubmed/25160371.