De-identification of Protected Health Information: How to Anonymize PHI

Healthcare organizations and their business associates that want to share protected health information must do so in accordance with the HIPAA Privacy Rule, which limits the possible uses and disclosures of PHI, but de-identification of protected health information means HIPAA Privacy Rule restrictions no longer apply.

HIPAA Privacy Rule restrictions only covers individually identifiable protected health information. If you de-identify PHI so that the identity of individuals cannot be determined, and re-identification of individuals is not possible, PHI can be freely shared.

The de-identification of protected health information enables HIPAA covered entities to share health data for large-scale medical research studies, policy assessments, comparative effectiveness studies, and other studies and assessments without violating the privacy of patients or requiring authorizations to be obtained from each patient prior to data being disclosed.

HIPAA-Compliant De-identification of Protected Health Information

HIPAA-compliant de-identification of protected health information is possible using two methods: Safe Harbor and Expert Determination. Neither method of de-identification of protected health information will remove all risk of re-identification of patients, but both methods will reduce risk to a very low and acceptable level. Use either of the two methods below and PHI will no longer be considered ‘protected health information’ and will therefore not be subject to HIPAA Privacy Rule restrictions.

1.     Safe Harbor – The Removal of Specific Identifiers

The first HIPAA compliant way to de-identify protected health information is to remove specific identifiers from the data set. The identifiable data that must be removed are:

  • Names
  • Geographic subdivisions smaller than a state
  • All elements of dates (except year) related to an individual (including admission and discharge dates, birthdate, date of death, all ages over 89 years old, and elements of dates (including year) that are indicative of age)
  • Telephone, cellphone, and fax numbers
  • Email addresses
  • IP addresses
  • Social Security numbers
  • Medical record numbers
  • Health plan beneficiary numbers
  • Device identifiers and serial numbers
  • Certificate/license numbers
  • Account numbers
  • Vehicle identifiers and serial numbers including license plates
  • Website URLs
  • Full face photos and comparable images
  • Biometric identifiers (including finger and voice prints)
  • Any unique identifying numbers, characteristics or codes

In the case of zip codes, covered entities are permitted to use the first three digits provided the geographic unit formed by combining those first three digits contains more than 20,000 individuals. When that geographical unit contains fewer than 20,000 individuals it should be changed to 000. According to the Bureau of the Census, that means 17 zip codes must have the first three digits changed to zero:

036, 692, 878, 059, 790, 879, 063, 821, 884, 102, 823, 890, 203, 830, 893, 556, 831

Covered entities should not that the above list of zip codes may change after future censuses. The list is based on 5-digit zip codes from the 2000 census.

For further information on de-identification of protected health information using the safe harbor method see 45 CFR § 164.514(b)(2).

2. Expert Determination

The expert determination method carries a small risk that an individual could be identified, although the risk is so low that it meets HIPAA Privacy Rule requirements.

This method of de-identification of protected health information requires a HIPAA covered entity or business associate to obtain an opinion from a qualified statistical expert that the risk of re-identifying an individual from the data set is very small. In such cases, the methods used to make that determination and justification of the expert’s opinion must be documented and retained by the covered entity or business associate and made available to regulators in the event of an audit or investigation.

The expert must be a person with appropriate knowledge and experience of using generally accepted statistical and scientific principles and methods for removing or altering information to ensure that it is no longer individually identifiable.

When those methods and principles have been applied, the expert must determine that the risk of reidentification of an individual is very small. In such cases, the risk of reidentification must be very small when the information is used alone, and must remain very small should the data be combined with other reasonably available information by an anticipated recipient to identify an individual who is a subject of the information.

HIPAA does not define the level of risk of re-identification other than to say it should be ‘very small’. The expert should define ‘very small’ in relation to the context of the data set, the specific environment, and the ability of an anticipated recipient to be able to reidentify individuals.

Experts may come from a number of different fields and do not require any specific qualifications. What is important is experts have experience of deidentifying data. It is that experience that regulators will look at in the event of an audit, not specific qualifications or certifications.

For further information on de-identification of protected health information by expert determination see 45 CFR § 164.514(b)(1).

The U.S. Department of Health and Human Services’ Office for Civil Rights has issued guidance on de-identification of protected health information which can be viewed on this link.

Author: Steve Alder has many years of experience as a journalist, and comes from a background in market research. He is a specialist on legal and regulatory affairs, and has several years of experience writing about HIPAA. Steve holds a B.Sc. from the University of Liverpool.