Unified Medical Language System (UMLS)

From Clinfowiki
Revision as of 05:15, 16 April 2020 by Lvuv (Talk | contribs)

Unified Medical Language System (UMLS) is a major medical dictionary which links the major international terminologies into a common structure, allowing for efficient translation and interoperability.

Introduction

The Unified Medical Language System (UMLS) integrates different medical vocabularies, coding standards, and associated resources into a unified vocabulary and distributes them to enable interoperability.

The UMLS project started in 1986 at the National Library of Medicine to provide uniformity in clinical concepts and to enhance the retrieval and exchange of computerized health information between coding systems and applications. The current architecture of the UMLS includes three knowledge sources: the Metathesaurus, the Semantic Network, and the SPECIALIST Lexicon and Lexical Tools.

Over the years, many vocabularies have been developed to address the specific needs of groups within the health care system. Examples include the International Classification of Diseases (ICD-10) for diseases and related health problems, CPT-4 for procedure and supplies coding, MeSH for literature indexing, and the Systematized Nomenclature of Medicine (SNOMED), a comprehensive multilingual medical vocabulary, among many others.

The UMLS groups terms from the different vocabularies under common concepts and defines relationships between these concepts and between the terms within each concept. Vocabularies can be categorized in numerous ways and sometimes fall into more than one category; example categories include disease, diagnosis, nursing, drugs, genetics, anatomy, and procedures and supplies. The UMLS includes an ever-growing number of vocabularies from different sources and is updated twice a year.
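The grouping of terms under common concepts can be illustrated with a minimal Python sketch. It is not the Metathesaurus file format or any UMLS API: the `Atom` class, the `build_indexes` and `translate` helpers, the vocabulary names, and every identifier and code in `ATOMS` are invented for illustration. The sketch only shows how a shared concept identifier lets a code from one vocabulary be translated into another.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """One term from one source vocabulary, linked to a concept."""
    concept_id: str  # hypothetical concept identifier
    source: str      # hypothetical source vocabulary name
    code: str        # hypothetical code within that vocabulary
    term: str


# Invented sample data: three vocabularies naming the same concept.
ATOMS = [
    Atom("CONCEPT_1", "VOCAB_A", "A-001", "Hypertension"),
    Atom("CONCEPT_1", "VOCAB_B", "B-17", "High blood pressure"),
    Atom("CONCEPT_1", "VOCAB_C", "C-9.2", "Hypertensive disorder"),
    Atom("CONCEPT_2", "VOCAB_A", "A-002", "Asthma"),
]


def build_indexes(atoms):
    """Index atoms by (source, code) and by concept."""
    by_code = {}
    by_concept = defaultdict(list)
    for atom in atoms:
        by_code[(atom.source, atom.code)] = atom.concept_id
        by_concept[atom.concept_id].append(atom)
    return by_code, by_concept


def translate(source, code, target, by_code, by_concept):
    """Return the codes in `target` that share a concept with (source, code)."""
    concept_id = by_code.get((source, code))
    if concept_id is None:
        return []
    return [a.code for a in by_concept[concept_id] if a.source == target]


by_code, by_concept = build_indexes(ATOMS)
print(translate("VOCAB_A", "A-001", "VOCAB_B", by_code, by_concept))
```

A lookup with no counterpart in the target vocabulary, such as the invented "Asthma" code above, returns an empty list. A mapping between real vocabularies can likewise be incomplete.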

Knowledge sources

The UMLS has three knowledge sources, each compiled into machine-readable files. The Metathesaurus is the result of combining the different vocabularies into concepts.

The Semantic Network defines 133 broad categories (semantic types) and 54 relationships between these categories. Every concept in the Metathesaurus is assigned one or more semantic types, which helps to establish relationships between concepts in the Metathesaurus.

The third knowledge source is the SPECIALIST Lexicon, a lexicon of common English words and words occurring in the biomedical field. It is used by the SPECIALIST natural language processing (NLP) system. The lexicon also comes with a group of programs that help with full-text processing. These tools accept free text, HTML, and Medline abstracts as input and produce output at various levels: individual words, terms, multiword terms, phrases, sentences, and sections. There are also tools that normalize words and reorder phrases to help with indexing and matching, as well as tools that map text to UMLS concepts and concepts to text. These tools are available through a web-based interface, as command-line programs, and as Java APIs.
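The idea behind normalizing words and reordering phrases for matching can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the algorithm used by the SPECIALIST Lexical Tools. The function name `normalize` is hypothetical, and the sketch only lowercases the text, drops punctuation, and sorts the words; it makes no attempt at steps such as handling inflection.

```python
import re


def normalize(term: str) -> str:
    """Simplified normalization: lowercase, strip punctuation, sort words.

    Illustrative only; the real lexical tools perform additional steps.
    """
    words = re.findall(r"[a-z0-9]+", term.lower())
    return " ".join(sorted(words))


# Two surface forms of the same phrase normalize to the same key,
# so they can be matched through a single index entry.
print(normalize("Hypertension, Pulmonary"))
print(normalize("pulmonary hypertension"))
```

Because both spellings reduce to the same key, an index built on normalized strings can match a query to a stored term regardless of word order, case, or punctuation.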

The UMLS and its tools are available from the National Library of Medicine at no cost for developers and researchers after signing a license agreement.

List of controlled vocabularies