De-Identified Data

De-identified data is data that has been stripped of any personal identifiers (e.g., name) and quasi-identifiers (e.g., zip code) that could be used to determine the identity of the individual the data represents. The increasing adoption of health information technologies facilitates studies that draw on large data sets from multiple sources. De-identification mitigates the privacy risks to individuals, enabling these secondary uses of the data.

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) Privacy Rule protects most “individually identifiable health information” (known as protected health information or PHI) held or transmitted by covered entities (e.g., hospitals and health plans). The Privacy Rule sets the standard for de-identification, stating that health information is not individually identifiable if it does not identify an individual and if the covered entity has no reasonable basis to believe it can be used to identify an individual.

Furthermore, the Privacy Rule provides two methods by which health information can be designated as de-identified. The first is the “Expert Determination” method, in which a person with appropriate knowledge of and experience with statistical and scientific methods determines that the risk is very small that the information could be used, alone or in combination with other available information, to identify an individual who is the subject of the information. The second is the “Safe Harbor” method, which requires that 18 specific types of identifiers be removed from the information and, additionally, that the covered entity has no actual knowledge that the remaining data could be used, alone or in combination with other available information, to identify an individual who is the subject of the information.
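As a rough illustration of the Safe Harbor idea, the Python sketch below drops a handful of identifier fields from a record and coarsens two others. The field names, the function name strip_identifiers, and the sample record are all hypothetical, and the field list covers only a few of the 18 identifier types. The zip-code and date handling is also simplified: it omits the rule's population threshold for three-digit zip codes and its special treatment of ages over 89. It should not be read as a compliant implementation.

```python
# Hypothetical field names; a real data set needs its own mapping onto
# all 18 Safe Harbor identifier types.
IDENTIFIER_FIELDS = {
    "name", "street_address", "phone", "email",
    "ssn", "medical_record_number",
}


def strip_identifiers(record):
    """Return a copy of `record` with the listed identifier fields removed
    and the birth date and zip code coarsened."""
    cleaned = {k: v for k, v in record.items() if k not in IDENTIFIER_FIELDS}

    # Keep only the year of an ISO-formatted date (YYYY-MM-DD).
    if "birth_date" in cleaned:
        cleaned["birth_year"] = cleaned.pop("birth_date")[:4]

    # Keep only the first three digits of the zip code. The population
    # check that the rule attaches to this is deliberately omitted here.
    if "zip_code" in cleaned:
        cleaned["zip3"] = cleaned.pop("zip_code")[:3]

    return cleaned


# Made-up sample record for illustration only.
sample = {
    "name": "Jane Doe",
    "phone": "555-0100",
    "zip_code": "02139",
    "birth_date": "1980-04-12",
    "diagnosis": "asthma",
}
print(strip_identifiers(sample))
```

Removing fields is only half of the method: the covered entity must also have no actual knowledge that what remains could identify someone, which is a judgment that code like this cannot make.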

De-identified data may contain a re-identification code, which can be used to connect the information back to the individual.
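One way to picture a re-identification code is as a random token that replaces the original identifier, with the mapping back to the individual kept in a separate table by the data holder. The sketch below assumes records keyed by a hypothetical patient_id field; the function name assign_codes and the reid_code field are likewise illustrative. The code is generated randomly so that it is not derived from anything about the individual.

```python
import secrets


def assign_codes(records, id_field="patient_id"):
    """Replace `id_field` in each record with a random code.

    Returns (coded_records, key_table). The key table maps each code back
    to the original identifier and is meant to stay with the data holder,
    not to be released with the coded records.
    """
    code_for_id = {}
    coded_records = []
    for record in records:
        original_id = record[id_field]
        # Reuse the same code for repeat records of the same individual.
        if original_id not in code_for_id:
            code_for_id[original_id] = secrets.token_hex(8)
        released = {k: v for k, v in record.items() if k != id_field}
        released["reid_code"] = code_for_id[original_id]
        coded_records.append(released)

    key_table = {code: oid for oid, code in code_for_id.items()}
    return coded_records, key_table
```

Only someone holding the key table can connect a coded record back to the individual, which is why the table has to be protected at least as carefully as the original data.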
