A comparison of two Detailed Clinical Model representations: FHIR and CDA
Review of
Smits, M., Kramer, E., Harthoorn, M., & Cornet, R. (2015). A comparison of two Detailed Clinical Model representations: FHIR and CDA. European Journal for Biomedical Informatics, 11(2), 7-17.
Available from http://www.ejbi.org/en/ejbi/artinfo/197-en-27.html
Introduction
Context
Alteration or loss of data in healthcare can affect patient safety; billing can also be impacted. Uniformity of clinical documentation supports data exchange, clinical workflow, quality, and research.
Academic hospitals in the Netherlands collaborated on a "national cross-specialty exchange dataset"[1], nicknamed "GenOGeg", to exchange records amongst themselves. The collaboration selected the Detailed Clinical Model (DCM) paradigm to represent clinical data, with plans to implement the DCMs using HL7 v3's Clinical Document Architecture (CDA).
DCMs can also be represented in FHIR, but as the number of standards increases, so does the demand for data interchange between standards. For GenOGeg and other such projects, it is important to know whether newer standards such as FHIR will be able to exchange data accurately with existing data models.
Research
The research behind this paper investigates three questions: whether DCMs can be represented with full fidelity in different implementations, what problems arise in the "conceptual analysis"[1] of DCMs, and whether the data transformation itself adds problems.
Background
Detailed clinical models
DCM is an ISO draft standard methodology for describing "technology-agnostic"[1] data models and model descriptions. DCMs are used to define structure for data in EHRs, messaging, and CDS; and to define how clinical information models correspond to clinical terminologies. Each DCM corresponds to a specific clinical concept in a standardized and readable way.
HL7v3 CDA release 2
Clinical Document Architecture (CDA) is a medical document definition standard from HL7; it is part of HL7v3. CDA is an XML standard specifying document metadata and optionally incorporating non-XML content in the document body. It also allows for document signing, context definition, and readability. The model is simple and not domain-specific, so unlike much of HL7v3 it has been widely adopted.
FHIR
FHIR is an HL7 draft standard currently under active development. It incorporates many of the conceptual underpinnings of previous HL7 standards but is meant to be more robust than HL7v2 and easier to implement than HL7v3. It works on its own as a data exchange standard, but can also be combined with other data representation formats.
FHIR objects are called "resources"; resources are used to build documents and messages. They are used in a number of ways within FHIR, depending on their type; there are resources for clinical content, but also, for example, resources used to establish data interchange specifications for a document. All resources are defined in a like manner, using data types to define "patterns of elements". All resources also have a "common set of metadata" and a "human-readable part"[1].
FHIR's built-in resources are meant to cover a majority of use cases, but the facility exists to extend FHIR to adapt it to local or specialized use cases.
Methods
The authors note that the "technology-agnostic" DCM standard is supposed to support compatibility between implementations of a given DCM. This research is designed to test that claim by applying XSLT to transform GenOGeg documents built using CDA into their equivalent FHIR documents.
Creating example messages
GenOGeg was used specifically because it is standard among a number of institutions. The study was done using message types thought to have a higher risk of data translation problems. The authors tried to fit the GenOGeg DCM specifications into CDA and FHIR models using the Consolidated Clinical Document Architecture (C-CDA) implementation standard (described as the American equivalent to GenOGeg), as GenOGeg's implementation guidelines were still under development. Example messages from the FHIR specification were used as templates in constructing FHIR messages for the experiment, modified so that they matched DCM rules, then validated against the FHIR XML schema specification.
Comparing example messages to identify discrepancies
During the experiment, the authors kept track of problems with the data representation, noting per step of the process whether the representation was still compliant with the DCM. They defined categories (each described in the results section below) for the problems found.
Transforming the CDA example messages to FHIR
The XSLT language was used to transform CDA messages to FHIR format. XSLT is specifically for transforming data between different XML formats (such as CDA and FHIR), so it was the natural choice for the job.
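To make the shape of such a transformation concrete, here is a minimal sketch. It is not the authors' stylesheet: it swaps XSLT for Python's standard-library ElementTree, and the CDA fragment, the OID-to-URI lookup table, and the simplified FHIR element names are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

CDA_NS = {"v3": "urn:hl7-org:v3"}
FHIR_NS = "http://hl7.org/fhir"

# Assumed, deliberately incomplete lookup from CDA code-system OIDs to FHIR system URIs.
OID_TO_URI = {"2.16.840.1.113883.6.1": "http://loinc.org"}

# Made-up CDA-style observation used only as sample input.
CDA_FRAGMENT = """
<observation xmlns="urn:hl7-org:v3" classCode="OBS" moodCode="EVN">
  <code code="8867-4" codeSystem="2.16.840.1.113883.6.1" displayName="Heart rate"/>
  <value value="72" unit="/min"/>
</observation>
"""


def fhir(tag: str) -> str:
    """Return a tag name qualified with the FHIR namespace."""
    return f"{{{FHIR_NS}}}{tag}"


def cda_observation_to_fhir(cda_xml: str) -> ET.Element:
    """Map one CDA-style observation to a simplified FHIR-style Observation."""
    cda = ET.fromstring(cda_xml)
    code = cda.find("v3:code", CDA_NS)
    value = cda.find("v3:value", CDA_NS)

    obs = ET.Element(fhir("Observation"))

    # CDA carries the code as attributes; FHIR XML nests it as child elements.
    coding = ET.SubElement(ET.SubElement(obs, fhir("code")), fhir("coding"))
    ET.SubElement(coding, fhir("system"), value=OID_TO_URI[code.get("codeSystem")])
    ET.SubElement(coding, fhir("code"), value=code.get("code"))
    ET.SubElement(coding, fhir("display"), value=code.get("displayName"))

    # The measured quantity moves from attributes to child elements as well.
    quantity = ET.SubElement(obs, fhir("valueQuantity"))
    ET.SubElement(quantity, fhir("value"), value=value.get("value"))
    ET.SubElement(quantity, fhir("unit"), value=value.get("unit"))
    return obs


if __name__ == "__main__":
    print(ET.tostring(cda_observation_to_fhir(CDA_FRAGMENT), encoding="unicode"))
```

Even this toy mapping needs a translation table for code systems, which hints at why the one-to-one correspondence examined in the results cannot be taken for granted.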
Results
The categories of problems found with the data were defined as "coded values", "different relational structure/hierarchies", "requirements and restrictions", "narratives", "null flavors and negation indicators", and "meaning of attributes"[1].
Coded values
Coded values differed in their use between DCM, CDA, and FHIR, so a one-to-one mapping was not always available for a given value. For example, FHIR in some cases required data for these values that CDA did not provide. C-CDA rules allow some data to be stored as plain text, whereas FHIR requires coded values, so such values cannot be translated directly without additional intervention.
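The plain-text case can be illustrated with a small sketch. The `CdaValue` type, the `to_required_code` helper, and the set of allowed codes are hypothetical and do not come from the paper or from either specification; the point is only that a free-text source value gives an automated transform nothing to put in a field that demands a code.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of codes that a target FHIR element accepts (assumption).
ALLOWED_CODES = {"active", "resolved"}


@dataclass
class CdaValue:
    """A source value that may carry a code, free text, or both."""
    code: Optional[str] = None
    text: Optional[str] = None


def to_required_code(value: CdaValue) -> str:
    """Return a code the target accepts, or fail if only free text is available."""
    if value.code in ALLOWED_CODES:
        return value.code
    # Free text cannot be turned into a code without human or terminology-service help.
    raise ValueError(f"no usable code for value with text {value.text!r}")


if __name__ == "__main__":
    print(to_required_code(CdaValue(code="active")))
    try:
        to_required_code(CdaValue(text="problem seems to have gone away"))
    except ValueError as err:
        print("needs manual mapping:", err)
```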
Different relational structure/hierarchies
In some DCMs the hierarchy of data is defined differently from that of the corresponding message in FHIR. The authors give the example of Alert messages: in FHIR they are standalone objects, but in DCMs they are parts of other objects. The Alert message in FHIR is also conceptually different from that in DCMs, so the two cannot be mapped directly to each other. This means that the claim that DCMs are "technology-agnostic" does not hold here. In cases where CDA hierarchies exist without corresponding FHIR representations, FHIR would need to be extended to allow for complete coding of the values from CDA. The question of representation persists, however, in the case of conceptual differences.
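The structural half of this mismatch can be sketched as follows. The dictionaries, the `alert` and `alertReference` field names, and the sample entry are invented for illustration and are not taken from the DCM, CDA, or FHIR definitions. The sketch only relocates the data; it does nothing about the conceptual difference between the two notions of an alert.

```python
import copy


def split_embedded_alert(entry: dict, alert_id: str):
    """Move an embedded alert out of its parent and leave a reference behind.

    The 'alert' and 'alertReference' keys are hypothetical names.
    """
    parent = copy.deepcopy(entry)  # leave the caller's data untouched
    embedded = parent.pop("alert", None)
    if embedded is None:
        return parent, None
    # The standalone resource gets its own identity so the parent can point to it.
    alert = {"resourceType": "Alert", "id": alert_id, **embedded}
    parent["alertReference"] = f"Alert/{alert_id}"
    return parent, alert


if __name__ == "__main__":
    # Made-up sample entry with an alert nested inside another concept.
    entry = {"concept": "Medication", "name": "penicillin", "alert": {"note": "allergy"}}
    parent, alert = split_embedded_alert(entry, "a1")
    print(parent)
    print(alert)
```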
Requirements and restrictions
Many FHIR resources possess required fields where the corresponding CDA messages do not; this can result in transformation failures in cases where values are not present in a CDA message being transformed. Conversely, FHIR has some restrictions on
Narratives
Narratives in CDA and FHIR can contain references to coded document elements; in the case that the coded element must be split into multiple resources in FHIR, the in-narrative reference to the coded element can no longer be valid without a rational way to refer to both resources as one entity. Such a method is not defined in FHIR.
Narrative encapsulation syntax is not readily convertible between CDA and FHIR.
Null flavors and negation indicators
"Null flavor" refers to the possibility of having different reason codes for establishing null values in a data set; these are common in healthcare due to missing data, inconclusive results, and so forth. Negation indicators are flags denoting that a statement is negated, that is, that the opposite of what the corresponding value field indicates is true. CDA allows null flavors and negation indicators by default, but they must be explicitly allowed in FHIR. FHIR messages would need additional non-specified attributes to encode negation and null flavors.
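One way to picture the "additional non-specified attributes" is as extensions attached to the target resource. The sketch below is an assumption-laden illustration: `nullFlavor` and `negationInd` are CDA attribute names, but the extension URLs under `example.org`, the `carry_null_and_negation` helper, and the sample fragment are hypothetical and not defined by FHIR or by the paper.

```python
import xml.etree.ElementTree as ET

CDA_NS = {"v3": "urn:hl7-org:v3"}
EXT_BASE = "http://example.org/fhir/extension"  # hypothetical extension namespace

# Made-up CDA-style observation: negated, with a value of unknown (UNK) null flavor.
CDA_FRAGMENT = """
<observation xmlns="urn:hl7-org:v3" classCode="OBS" moodCode="EVN" negationInd="true">
  <value nullFlavor="UNK"/>
</observation>
"""


def carry_null_and_negation(cda_xml: str) -> dict:
    """Copy negation and null-flavor information into FHIR-style extensions."""
    obs = ET.fromstring(cda_xml)
    extensions = []
    if obs.get("negationInd") == "true":
        extensions.append({"url": f"{EXT_BASE}/negation", "valueBoolean": True})
    value = obs.find("v3:value", CDA_NS)
    if value is not None and value.get("nullFlavor"):
        extensions.append(
            {"url": f"{EXT_BASE}/null-flavor", "valueCode": value.get("nullFlavor")}
        )
    return {"resourceType": "Observation", "extension": extensions}


if __name__ == "__main__":
    print(carry_null_and_negation(CDA_FRAGMENT))
```

A receiver that does not know these extensions would silently lose the negation, which is the kind of misrepresentation the authors warn about.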
Meaning of attributes
Some ambiguity exists in the defined meanings of various attributes in CDA messages. Some messages require datetime fields but lack an explicit definition for what the timestamp should represent. Therefore accurately mapping such a message to FHIR might be impossible.
Discussion
The authors note that these problems do not occur in most cases of remapping messages between CDA and FHIR. Some "fundamental" incompatibilities exist, though, causing the problems as categorized above[1]. Each of these problems represents a loss or potential misrepresentation of information from the original record. The result also demonstrates that the DCM standard's claim to be "technology-agnostic" may be deemed inaccurate.
This experiment involved transforming messages only from CDA to FHIR, but the authors assert that the "bidirectional analysis" performed guarantees no additional fundamental conversion issues are possible.
The authors state that their research has uncovered incompatibilities between CDA and FHIR that were previously unrecognized, and that prior work had not identified problems at the levels revealed in this research. They find "several fundamental and unresolvable problems"[1]. Since they experimented with real-world datasets, they take their results to be representative of problems that will come up in future production implementation attempts, and therefore encourage the developers of FHIR and other data representation standards to take their findings into account in further development. They also note that the findings of this study could be incorporated directly into the FHIR development process, since FHIR is still a draft standard. They suggest that similar research to their study could also be done with other data representation formats to isolate additional problems of the types they found, and that the DCM standard should be evaluated for possible revision to allow for better data exchange.
Conclusions
The authors conclude that representations of a DCM can vary in their fidelity to the intended meaning of the dataset, and that CDA and FHIR are not completely compatible with each other or DCMs.
Reviewer's impressions
This paper is a fascinating read; the results of this research have potentially far-ranging consequences for the field of Health Informatics, for vendors and institutions looking to implement new standards, and potentially for patients. The authors' methods and analysis seem robust to me, if limited in scope by the nature of the experiment; their call for further research addresses this, though, and is notable for its foresight and understanding of the breadth of the implications of their results.
Upon additional reflection, it seems to me that the mapping issues the authors encountered are at least in part manifestations of data normalization problems. For example, FHIR Alert resources are separate and can be referred to by another message, whereas CDA messages embed the alert in other documents. This suggests that CDA documents should be built to separate the alerts from the other content and incorporate them by reference, as would be done in a relational database management system (RDBMS).