Extensible Markup Language (XML)

From Clinfowiki
Revision as of 05:50, 6 September 2014 by Annathehybrid (Talk | contribs)


The eXtensible Markup Language (XML) is a markup language that allows the creation of text documents that are both computer- and human-interpretable. The standards for XML documents are created and maintained by the World Wide Web Consortium (W3C) [1] and describe the rules for marking up data. As in HTML, tags allow the document creator to attach additional meaning to data. Unlike in HTML, XML tags are not predefined, so the document creator is free to select tag names that describe what each tag contains.

Example

As an example, we will look at a simple XML document that shows some basic demographics for a fictional patient:

  <Patient>
     <ID>1234567</ID>
     <Name>
        <Prefix>Mrs</Prefix>
        <First>Jane</First>
        <Last>Doe</Last>
        <Suffix/>
     </Name>
     <DOB>1/10/1970</DOB>
     <Gender>Female</Gender>
     <PhoneNumbers>
        <Phone type="Home">555-555-5555</Phone>
        <Phone type="Cell">555-555-5556</Phone>
     </PhoneNumbers>
  </Patient>

In the above example, <Patient> is a tag. You will notice that almost all of the tags come in pairs (e.g. <ID> and </ID>), with the exception of <Suffix/>. Because there is no suffix data, this tag is empty. Empty tags can be written as <Suffix></Suffix> or <Suffix/>. XML also allows attributes, which provide additional information about a tag. In the above example, the Phone tags have an attribute named "type" that describes the type of phone number displayed.
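To make the terms above concrete, here is a minimal sketch that reads the example document with Python's standard xml.etree.ElementTree module. The variable names (PATIENT_XML, first_name, phones and so on) are chosen for illustration and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# The example patient document from above, held in a string.
PATIENT_XML = """
<Patient>
   <ID>1234567</ID>
   <Name>
      <Prefix>Mrs</Prefix>
      <First>Jane</First>
      <Last>Doe</Last>
      <Suffix/>
   </Name>
   <DOB>1/10/1970</DOB>
   <Gender>Female</Gender>
   <PhoneNumbers>
      <Phone type="Home">555-555-5555</Phone>
      <Phone type="Cell">555-555-5556</Phone>
   </PhoneNumbers>
</Patient>
""".strip()

root = ET.fromstring(PATIENT_XML)

# Paired tags: the text between <First> and </First>.
first_name = root.findtext("Name/First")

# Empty tag: <Suffix/> has no text and no child elements.
suffix = root.find("Name/Suffix")
suffix_is_empty = suffix.text is None and len(suffix) == 0

# Both spellings of an empty tag parse to the same thing.
same_empty_form = (
    ET.fromstring("<Suffix/>").text == ET.fromstring("<Suffix></Suffix>").text
)

# Attributes: map each phone's "type" attribute to its number.
phones = {phone.get("type"): phone.text for phone in root.iter("Phone")}

print(root.tag, first_name, suffix_is_empty, phones)
```

Here the parser treats <Suffix/> and <Suffix></Suffix> identically, and the "type" attribute is read separately from the text inside each Phone tag.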

In addition, note that the tag names and attributes used are very descriptive of the data contained within, unlike HTML, where tags largely provide information on how the data should appear on a web page. This is because XML is meant to separate data from display: its markup conveys the nature of the data rather than how it should look.

Storage of medical information

XML can be a very useful method for the storage of medical information. While more traditional relational database management systems (RDBMSs) are typically used to store discrete data elements, it is possible to use XML documents or XML databases [2] instead. Not only does this allow for the collection and storage of specific medical data fields as you might in an RDBMS, it also provides additional flexibility to extend the document and add new types of data, even including digital signatures [3]. Using other technologies, such as schemas, it is also possible to validate XML documents to ensure that part or all of a document is structured in a certain way. If we imagine the example above represents a pre-defined standard for storing demographic data, it can be extended to include additional information that a specific institution may want to collect:

  <Patient>
     <ID>1234567</ID>
     <Name>
        <Prefix>Mrs</Prefix>
        <First>Jane</First>
        <Last>Doe</Last>
        <Suffix/>
        <PreferredName>Janey</PreferredName>
     </Name>
     <DOB>1/10/1970</DOB>
     <Gender>Female</Gender>
     <PhoneNumbers>
        <Phone type="Home" preferred="true">555-555-5555</Phone>
        <Phone type="Cell">555-555-5556</Phone>
     </PhoneNumbers>
  </Patient>

This document adheres to the fictional standard because all of the original tags and data fields exist, and it now collects personal preferences about how to contact this patient and how to address her.
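The idea that an extended document still adheres to the standard can be sketched as a simple check that every required element is present. This is only an illustration, not schema validation: Python's standard library does not validate against XML schemas, and the REQUIRED_PATHS list and the missing_required function are hypothetical names standing in for the fictional standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical element paths required by the fictional demographics standard.
REQUIRED_PATHS = [
    "ID", "Name/Prefix", "Name/First", "Name/Last", "Name/Suffix",
    "DOB", "Gender", "PhoneNumbers/Phone",
]

def missing_required(xml_text):
    """Return the required paths that are absent from a Patient document."""
    root = ET.fromstring(xml_text)
    if root.tag != "Patient":
        return ["Patient"]
    return [path for path in REQUIRED_PATHS if root.find(path) is None]

# The extended document from above: extra tags and attributes are allowed.
EXTENDED_XML = """
<Patient>
   <ID>1234567</ID>
   <Name>
      <Prefix>Mrs</Prefix>
      <First>Jane</First>
      <Last>Doe</Last>
      <Suffix/>
      <PreferredName>Janey</PreferredName>
   </Name>
   <DOB>1/10/1970</DOB>
   <Gender>Female</Gender>
   <PhoneNumbers>
      <Phone type="Home" preferred="true">555-555-5555</Phone>
      <Phone type="Cell">555-555-5556</Phone>
   </PhoneNumbers>
</Patient>
""".strip()

# A made-up counterexample that drops two required elements.
INCOMPLETE_XML = """
<Patient>
   <ID>1234567</ID>
   <Name>
      <Prefix>Mrs</Prefix>
      <First>Jane</First>
      <Last>Doe</Last>
      <Suffix/>
   </Name>
   <Gender>Female</Gender>
</Patient>
""".strip()

print(missing_required(EXTENDED_XML))
print(missing_required(INCOMPLETE_XML))
```

The check ignores anything it does not know about, which is why the added PreferredName tag and "preferred" attribute do not cause a failure. A real deployment would more likely express these rules in a schema language and use a validating parser.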

In addition to storage, XML can facilitate the exchange of medical information. While any standard for data exchange requires some agreed-upon rules for how the data is organized, some older binary formats relied on information being stored in a particular order and constrained each field to a certain length. XML allows some flexibility in the order in which data fields are organized, as well as the ability to accept data of any length. XML data can even be created and sent over a standard internet connection and read by a web browser, unlike proprietary binary formats, which more often required special server and client programs to be running to handle data transfers. The use of XML over web services is taking root in healthcare through systems being developed using Service-Oriented Architecture. More information about this can be found in the article Introduction to Service-Oriented Architecture in Healthcare.
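The contrast between position-based formats and XML can be illustrated with a small sketch. The fixed-width layout below (7 characters for the ID, 10 for the date of birth, 6 for gender) is a made-up stand-in for an older order- and length-dependent format, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fixed-width record: field meaning depends on position and length.
FIXED_RECORD = "1234567".ljust(7) + "1/10/1970".ljust(10) + "Female".ljust(6)

def parse_fixed_width(record):
    """Read fields by character offset; reordering or lengthening breaks this."""
    return {
        "ID": record[0:7].strip(),
        "DOB": record[7:17].strip(),
        "Gender": record[17:23].strip(),
    }

# The same data as XML, with the fields in two different orders.
ORDER_A = "<Patient><ID>1234567</ID><DOB>1/10/1970</DOB><Gender>Female</Gender></Patient>"
ORDER_B = "<Patient><Gender>Female</Gender><ID>1234567</ID><DOB>1/10/1970</DOB></Patient>"

def demographics(xml_text):
    """Read fields by tag name, so their order in the document does not matter."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}

print(parse_fixed_width(FIXED_RECORD))
print(demographics(ORDER_A) == demographics(ORDER_B))
```

Because the XML reader looks fields up by name, both orderings yield the same dictionary. Note that a particular XML standard may still require a specific element order through its schema; the flexibility shown here applies when the agreed-upon rules permit it.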

There are several XML-based standards in healthcare, such as the HL7 v3 messaging standard [4] for exchanging medical information, and the HL7 CDA [5] and ASTM CCR [6] standards for storing and exchanging clinical information, all of which benefit from the power and flexibility of XML. In addition, Natural Language Processing (NLP) systems such as MedLEE [7] and HITEx [8] extract information from free-text notes and can store the output in XML.


References

  1. W3C XML [1]
  2. XML databases [2]
  3. XML Digital Signature standard [3]
  4. HL7 v3 Messaging Standards [4]
  5. Clinical Document Architecture (CDA) [5]
  6. Continuity of Care Record (CCR) [6]
  7. Friedman C. A broad-coverage natural language processing system. Proc AMIA Symp. 2000:270-4. [7]
  8. Zeng QT, Goryachev S, Weiss S, Sordo M, Murphy SN, Lazarus R. Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system. BMC Med Inform Decis Mak. 2006; 6:30. [8]