Data warehouse

From Clinfowiki

Health data warehouses collect data from clinical, financial, and ancillary data repositories and consolidate them in a central store. A clinical data warehouse can be defined as a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decision-making [1].

Meanings of key terms

  1. Subject-oriented: Data in a warehouse are organized around the key subjects (or high-level entities) of the enterprise, for instance patients, students, and products.
  2. Integrated: Data in the warehouse use consistent naming conventions, formats, encoding structures, and related characteristics, so that they can be shared and used across the organization.
  3. Time-variant: Data contain a time dimension so that they can be used for historical analysis.
  4. Nonvolatile: Data are loaded and refreshed from operational data, and cannot be updated by end users.
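
The time-variant and nonvolatile properties can be illustrated with a minimal sketch. It assumes a hypothetical `patient_snapshot` table in an in-memory SQLite database; the table, column names, and sample values are invented for illustration and do not come from any particular product. Each refresh appends a dated snapshot, and earlier snapshots are never modified.

```python
import sqlite3


def build_warehouse():
    """Create an in-memory table keyed by patient and load date (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE patient_snapshot (
            patient_id   INTEGER NOT NULL,
            load_date    TEXT    NOT NULL,
            attending_md TEXT    NOT NULL,
            PRIMARY KEY (patient_id, load_date)
        )
        """
    )
    return conn


def refresh(conn, load_date, operational_rows):
    """Append a dated snapshot of the operational data; existing rows are left untouched."""
    conn.executemany(
        "INSERT INTO patient_snapshot VALUES (?, ?, ?)",
        [(pid, load_date, md) for pid, md in operational_rows],
    )
    conn.commit()


def history(conn, patient_id):
    """Return every recorded state of one patient, oldest first."""
    return conn.execute(
        "SELECT load_date, attending_md FROM patient_snapshot "
        "WHERE patient_id = ? ORDER BY load_date",
        (patient_id,),
    ).fetchall()


conn = build_warehouse()
refresh(conn, "2006-01-01", [(1, "Dr. A"), (2, "Dr. B")])
refresh(conn, "2006-02-01", [(1, "Dr. C"), (2, "Dr. B")])
patient_history = history(conn, 1)
```

Because the load date is part of the key, a change in the operational system shows up as a new row rather than an overwrite, which is what makes historical queries possible.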

Considering the above key terms, data warehousing could be defined as the process by which an organization extracts meaningful information from historical data.


With the advent of electronic medical records (EMR) and other clinical data systems, most of the information about a patient’s hospital stay is recorded in electronic form. However, this information often sits in separate systems that offer few data querying or analysis tools for real-time reports. Where real-time reports are available, they are geared to only one aspect, such as total charges or days in accounts receivable.

In most healthcare systems, outpatient or inpatient, there is no accurate measure of physician performance or practice patterns. Billing and coding data are often grossly inaccurate for judging or improving physician or group performance.

Enterprise Healthcare data warehousing

The key point is that a data warehouse has a different internal indexing and storage structure from transaction-based, real-time databases. A warehouse is designed to allow complex searching of data using a variety of key indexes.

A key part of building a data warehouse is developing its extraction, translation (more commonly called transformation), and loading (ETL) process.

Data in clinical real-time systems must be periodically uploaded to the data warehouse. During this process, all data needed for analysis are extracted in their raw form. Next, the data are translated into a form that is appropriate for searching and indexing. Then the data are loaded into the data warehouse and indexed across many parameters for fast search and retrieval. Several structures can be used for the warehouse, ranging from the star schema to OLAP cubes.
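
The extract, translate, and load steps described above can be sketched as follows. This is a minimal illustration, not a production design: the raw encounter rows, the date and currency formats, and the `dim_physician` and `fact_encounter` tables (a very small star schema) are all assumptions made for the example, and SQLite stands in for a real warehouse platform.

```python
import sqlite3
from datetime import datetime

# Hypothetical raw rows as they might sit in an operational system:
# (encounter_id, patient_id, physician, admit, discharge, charges)
RAW_ENCOUNTERS = [
    (101, 1, "Dr. A", "01/03/2006", "01/07/2006", "$4,200.00"),
    (102, 2, "Dr. B", "01/05/2006", "01/06/2006", "$950.50"),
    (103, 3, "Dr. A", "02/10/2006", "02/12/2006", "$2,100.00"),
]


def extract(source_rows):
    """Extract: copy every needed row in its raw form."""
    return list(source_rows)


def translate(raw_rows):
    """Translate: normalize dates and charges, and derive length of stay in days."""
    translated = []
    for enc_id, pat_id, physician, admit, discharge, charges in raw_rows:
        admit_d = datetime.strptime(admit, "%m/%d/%Y").date()
        disch_d = datetime.strptime(discharge, "%m/%d/%Y").date()
        translated.append((
            enc_id,
            pat_id,
            physician,
            admit_d.isoformat(),
            (disch_d - admit_d).days,
            float(charges.replace("$", "").replace(",", "")),
        ))
    return translated


def load(conn, rows):
    """Load: write a dimension table and a fact table, then index the fact table."""
    conn.executescript(
        """
        CREATE TABLE dim_physician (
            physician_key INTEGER PRIMARY KEY,
            name          TEXT UNIQUE NOT NULL
        );
        CREATE TABLE fact_encounter (
            encounter_id   INTEGER PRIMARY KEY,
            patient_id     INTEGER NOT NULL,
            physician_key  INTEGER NOT NULL REFERENCES dim_physician (physician_key),
            admit_date     TEXT    NOT NULL,
            length_of_stay INTEGER NOT NULL,
            charges        REAL    NOT NULL
        );
        CREATE INDEX idx_fact_physician ON fact_encounter (physician_key);
        CREATE INDEX idx_fact_admit     ON fact_encounter (admit_date);
        """
    )
    for enc_id, pat_id, physician, admit_date, los, charges in rows:
        conn.execute(
            "INSERT OR IGNORE INTO dim_physician (name) VALUES (?)", (physician,)
        )
        (key,) = conn.execute(
            "SELECT physician_key FROM dim_physician WHERE name = ?", (physician,)
        ).fetchone()
        conn.execute(
            "INSERT INTO fact_encounter VALUES (?, ?, ?, ?, ?, ?)",
            (enc_id, pat_id, key, admit_date, los, charges),
        )
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, translate(extract(RAW_ENCOUNTERS)))
loaded = warehouse.execute(
    "SELECT length_of_stay, charges FROM fact_encounter ORDER BY encounter_id"
).fetchall()
```

Keeping the three steps as separate functions mirrors the description in the text: the raw extract stays untouched, and only the translated form is indexed for retrieval.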

The next step is building the query and reporting capabilities. The options for this are plentiful.

A dedicated program such as Business Objects can be used; it provides predefined fields and supports both routine reports and customizable, on-the-fly reports. This allows physicians to design performance reports that combine clinical and billing data to give a more complete picture of a group's and a physician’s practice patterns, supporting education and practice improvement projects. Other querying tools such as SAS and SPSS can build connections that allow for more complex statistical analysis.
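
A performance report of the kind described above, combining clinical and billing data per physician, might look like the following sketch. It uses plain SQL through Python's `sqlite3` module rather than any of the named reporting tools, and the `clinical_encounter` and `billing_charge` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE clinical_encounter (
        encounter_id   INTEGER PRIMARY KEY,
        physician      TEXT    NOT NULL,
        length_of_stay INTEGER NOT NULL
    );
    CREATE TABLE billing_charge (
        encounter_id INTEGER NOT NULL,
        amount       REAL    NOT NULL
    );
    INSERT INTO clinical_encounter VALUES
        (101, 'Dr. A', 4), (102, 'Dr. B', 1), (103, 'Dr. A', 2);
    INSERT INTO billing_charge VALUES
        (101, 3000.0), (101, 1200.0), (102, 950.5), (103, 2100.0);
    """
)


def physician_report(conn):
    """Return (physician, encounters, average length of stay, total charges)."""
    return conn.execute(
        """
        SELECT c.physician,
               COUNT(*)              AS encounters,
               AVG(c.length_of_stay) AS avg_length_of_stay,
               SUM(b.total)          AS total_charges
        FROM clinical_encounter AS c
        JOIN (
            -- Sum charge lines per encounter first so that encounters
            -- with several charges are not counted more than once.
            SELECT encounter_id, SUM(amount) AS total
            FROM billing_charge
            GROUP BY encounter_id
        ) AS b ON b.encounter_id = c.encounter_id
        GROUP BY c.physician
        ORDER BY c.physician
        """
    ).fetchall()


report = physician_report(conn)
```

Aggregating the billing lines before the join is a deliberate choice: joining the raw charge lines directly would inflate the encounter count and the average length of stay for any encounter billed in more than one line.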


Health data warehouses allow for patient-care-related research and for the development of disease and risk stratification models for disease states. These systems also open the possibility of multi-institutional data warehouses, which increase research capacity by drawing on a larger patient base. A mixture of academic and community hospitals could make the data more generalizable, with a better mixture of severity levels.


For many years, companies accumulated data across their systems without knowing how to mine them for anything meaningful. Businesses backed up and archived those data, keeping them merely for historical purposes. More recently, business intelligence (BI) tools, such as OLAP products from Business Objects, Oracle, SAS, and Microsoft, among others, have helped transform those raw data into useful information. However, the healthcare industry has not taken full advantage of BI tools.


McFadden et al (p. 531) noted that data warehousing emerged as a result of advances in information systems technology over several decades. Some of these advances are listed below:

  • Improvement in database management technology, especially in relational database management systems (RDBMS) and data modeling.
  • Improvement in computer hardware, especially with respect to mass storage and parallel computer architectures.
  • The emergence of intuitive computer interfaces and tools.
  • Advancement in middleware products that ensure database connectivity across various platforms.

The insight that led to the development of data warehousing was the recognition of the difference between operational (or production) systems and informational (or decision support) systems.

Purpose of data warehousing

Two major factors drive the need for data warehousing in most organizations today:

  1. A business requires an integrated, company-wide view of high-quality information.
  2. Informational (historical) data must be separated from operational data.

The two factors mentioned above will be subsequently considered.

Separation of data

For any clinical organization today, it is essential to separate operational data from informational data by creating a data warehouse.

  1. A data warehouse centralizes data (at least logically) that are scattered throughout disparate operational systems and makes them readily available for decision support applications.
  2. A properly designed data warehouse adds value to data by improving their data quality and consistency.
  3. A separate data warehouse eliminates much of the contention for resources that results when informational applications are mixed with operational processing.

A few decades ago, physicians knew nearly all there was to know about medicine, and most doctors could recall the names of their patients. Today, no doctor can keep up with the explosion of medical and health information. Although health care organizations have recognized how computers are used in other industries, their application in healthcare has been less encouraging. Among other factors, it often takes too long to get information, data are not easily accessible, and there is no uniform standard among vendors. However, according to McGee, some healthcare organizations, such as Partners HealthCare, a group of several Boston-area hospitals (including Massachusetts General and Brigham and Women’s), have for some time used iLog decision support and EMC Documentum content-management software to share clinical best practices.

Although the BI tools in use today still have limits, their future applications and the resulting breakthroughs in medicine are a foregone conclusion.

Submitted by: Gbenga Abimbola

Related Articles


  1. Himmelsbach, Vawn: “How Business Intelligence Is Making Healthcare Smarter”
  2. McFadden, Fred R. et al: Modern Database Management: Basic Concepts of Data Warehousing, Addison-Wesley, New York: 1993
  3. McGee, Marianne Kolbasuk: A Pill, A Scalpel, A Database. InformationWeek 2006, 1,076: 39-45
  4. “Data Warehouse”
  5. Whitten, Jeffrey L. et al: System Analysis And Design Methods: Database Design,