Unified Medical Language System (UMLS)

From Clinfowiki

The Unified Medical Language System (UMLS) integrates and distributes medical vocabularies, coding standards, and associated resources, linking the major international terminologies in a common structure so that terms can be translated between them and systems can interoperate.

The UMLS project started in 1986 at the National Library of Medicine (NLM) to bring uniformity to clinical concepts and to improve the retrieval and exchange of computerized health information between coding systems and applications. The current architecture of the UMLS includes three knowledge sources: the Metathesaurus, the Semantic Network, and the SPECIALIST Lexicon and Lexical Tools.

Over the years, many vocabularies have been developed to address the specific needs of groups within the health care system and the biomedical industry. Examples of the controlled vocabularies included in the UMLS are ICD-10 (International Classification of Diseases and Related Health Problems), CPT-4 for procedure and supplies coding, MeSH for literature indexing, and the Systematized Nomenclature of Medicine (SNOMED), a comprehensive multilingual medical vocabulary.

The UMLS groups the terms of the different vocabularies under common concepts and defines relationships between these concepts and the terms within them. Vocabularies can be categorized in numerous ways and sometimes fall into more than one category; examples of categories include disease, diagnosis, nursing, drugs, genetics, anatomy, and procedures and supplies. The UMLS includes an ever-growing number of vocabularies from different sources and is updated twice a year.
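The grouping of vocabulary terms under common concepts can be pictured as a small lookup structure. The following is a minimal sketch, not the UMLS data model itself: the row layout, the concept identifiers, the codes, and the function names `build_indexes` and `translate` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (concept_id, source_vocabulary, code, term).
# The identifiers and codes are illustrative placeholders, not values
# taken from a UMLS release.
ROWS = [
    ("C-0001", "ICD-10", "X01", "Headache"),
    ("C-0001", "MeSH", "M-100", "Headache"),
    ("C-0001", "SNOMED", "S-555", "Cephalalgia"),
    ("C-0002", "ICD-10", "X02", "Fever"),
    ("C-0002", "SNOMED", "S-777", "Pyrexia"),
]


def build_indexes(rows):
    """Index rows by (source, code) and by concept identifier."""
    concept_of = {}
    members = defaultdict(list)
    for concept_id, source, code, term in rows:
        concept_of[(source, code)] = concept_id
        members[concept_id].append((source, code, term))
    return concept_of, members


def translate(source, code, target, concept_of, members):
    """Return (code, term) pairs in `target` that share a concept with the input code."""
    concept_id = concept_of.get((source, code))
    if concept_id is None:
        return []  # unknown code: nothing to translate
    return [(c, t) for s, c, t in members[concept_id] if s == target]


concept_of, members = build_indexes(ROWS)
print(translate("ICD-10", "X01", "SNOMED", concept_of, members))
```

Because every term points at a concept rather than directly at terms in other vocabularies, adding a new vocabulary only requires linking its terms to concepts, not to every other vocabulary.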

UMLS knowledge sources

The UMLS has three knowledge sources, whose data are compiled into machine-readable files. The Metathesaurus is the result of combining the different vocabularies into concepts. The Semantic Network defines 133 broad categories (semantic types) and 54 relationships between these categories. Every concept in the Metathesaurus is assigned one or more semantic types, which helps to establish relationships between concepts in the Metathesaurus.

The third knowledge source is the SPECIALIST Lexicon, a lexicon of common English words and words occurring in the biomedical field. It is used for natural language processing by the SPECIALIST Natural Language Processing (NLP) System. The lexicon also comes with a group of programs that help with full-text processing. These tools accept free text, HTML, and MEDLINE abstracts as input and produce a variety of outputs, ranging from individual words, terms, and multiword terms to phrases, sentences, and sections. There are also tools that normalize words and reorder phrases to help with indexing and matching, and tools that map text to UMLS concepts and concepts to text. These tools are available through a web-based interface, as command-line programs, and as Java APIs.

The UMLS and its tools are available from the National Library of Medicine at no cost to developers and researchers after signing a license agreement.
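The idea of normalizing words and reordering phrases for matching can be shown with a much-simplified sketch. This is not the SPECIALIST lexical tools' algorithm: the function name `normalize` is hypothetical, and the sketch only lowercases, drops possessives and punctuation, and sorts the words, with no handling of inflection or spelling variants.

```python
import re


def normalize(text):
    """Lowercase, strip possessives and punctuation, and sort the words.

    Simplification: only ASCII letters and digits are kept, so accented
    characters would be dropped by this sketch.
    """
    lowered = text.lower()
    no_possessive = re.sub(r"'s\b", "", lowered)
    words = re.findall(r"[a-z0-9]+", no_possessive)
    return " ".join(sorted(words))


# Two surface forms of the same term reduce to the same key.
print(normalize("Disease, Hodgkin's"))
print(normalize("Hodgkin Disease"))
```

Strings that share a normalized key can then be indexed together, which is what makes word-order and punctuation differences irrelevant when matching text against vocabulary terms.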

  • UMLS Metathesaurus

The Metathesaurus is organized by concept, or meaning. It links terms that have similar meanings or express the same concept across multiple vocabularies, and it is the largest component of the UMLS. It can be installed locally on your computer with MetamorphoSys, a multi-platform tool available from the NLM.
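As a rough illustration of how a locally installed Metathesaurus might be read, the sketch below parses pipe-delimited rows and groups them by concept. It assumes the Rich Release Format and the MRCONSO.RRF column order described in the NLM documentation; both should be checked against the documentation shipped with the release in use. The sample rows, identifiers, and source abbreviations (`SRC_A`, `SRC_B`) are invented for the example.

```python
from collections import defaultdict

# Assumed column order of MRCONSO.RRF; verify against the release documentation.
COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]

# Invented sample rows; identifiers and source names are placeholders.
SAMPLE = """\
C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||SRC_A|PT|X01|Headache|0|N||
C0000001|ENG|S|L0000002|PF|S0000002|Y|A0000002||||SRC_B|SY|B-17|Cephalalgia|0|N||
C0000002|ENG|P|L0000003|PF|S0000003|Y|A0000003||||SRC_A|PT|X02|Fever|0|N||
"""


def parse_line(line):
    """Split one row into a dict keyed by column name."""
    fields = line.rstrip("\n").split("|")
    # Each row ends with a trailing "|", which yields one extra empty field.
    if len(fields) == len(COLUMNS) + 1 and fields[-1] == "":
        fields.pop()
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return dict(zip(COLUMNS, fields))


def group_by_concept(lines):
    """Map each concept identifier to its (source, code, string) entries."""
    concepts = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        row = parse_line(line)
        concepts[row["CUI"]].append((row["SAB"], row["CODE"], row["STR"]))
    return dict(concepts)


concepts = group_by_concept(SAMPLE.splitlines())
for cui, entries in concepts.items():
    print(cui, entries)
```

A real file is large, so in practice the rows would be streamed from disk (for example by iterating over `open(path, encoding="utf-8")`) rather than held in a string.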

  • UMLS Semantic Network

The Semantic Network provides a consistent categorization (Semantic Types) of all concepts represented in the UMLS Metathesaurus, together with a set of useful and important relationships (Semantic Relations) that exist between Semantic Types. Grouping concepts into these broad categories reduces the complexity of the Metathesaurus. Sample records that illustrate the structure and content of these files can be found at the UMLS Semantic Network.
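One way to picture Semantic Types and Semantic Relations is as a small type hierarchy plus a table of permitted relations, where a relation stated for a broad type also applies to its subtypes. The sketch below makes that assumption and ignores any exceptions to inheritance. The type names, the hierarchy, the single relation, and the function names are written for this example and should be checked against the Semantic Network files before being relied on.

```python
# Illustrative is-a hierarchy: child type -> parent type.
PARENT = {
    "Disease or Syndrome": "Pathologic Function",
    "Antibiotic": "Pharmacologic Substance",
}

# Illustrative relation table: (subject type, relation, object type).
RELATIONS = {
    ("Pharmacologic Substance", "treats", "Pathologic Function"),
}


def ancestors(semantic_type):
    """Yield the type itself, then each ancestor up the is-a chain."""
    current = semantic_type
    while current is not None:
        yield current
        current = PARENT.get(current)


def is_allowed(subject, relation, obj):
    """True if the relation is stated for the types or any of their ancestors."""
    return any(
        (s, relation, o) in RELATIONS
        for s in ancestors(subject)
        for o in ancestors(obj)
    )


print(is_allowed("Antibiotic", "treats", "Disease or Syndrome"))
print(is_allowed("Disease or Syndrome", "treats", "Antibiotic"))
```

Stating a relation once at a broad type and letting subtypes inherit it keeps the relation table small, which is in the spirit of using a limited set of categories and relationships to cover a very large number of concepts.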


  • SPECIALIST Lexicon

The SPECIALIST Lexicon is a general English lexicon that includes many biomedical terms and provides the lexical information needed by the SPECIALIST Natural Language Processing (NLP) System. Lexicon entries include the syntactic, morphological, and orthographic information that the SPECIALIST NLP System needs to normalize strings, generate lexical variants, and create indexes.

List of controlled vocabularies as of Nov 2019


Sinha P., Sunder G., Bendale P., Mantri M, & Dande A. Electronic Health Record: Standards, Coding Systems, Frameworks, and Infrastructures. The Institute of Electrical and Electronics Engineers, Inc. by John Wiley & Sons, Inc. Published 2013. CHAPTER 16 PAGES 161-168

National Library of Medicine (NIH). UMLS Metathesaurus Vocabulary Documentation. Last updated November 2019. Retrieved 1/15/2020 from https://www.nlm.nih.gov/research/umls/sourcereleasedocs/index.html

McCray AT, Burgun A, Bodenreider O. Aggregating UMLS semantic types for reducing conceptual complexity. Stud Health Technol Inform. 2001;84(Pt 1):216–220.

updates of page

Update of page submitted by (Lace Velk)