
The HIPAA Journal is the leading provider of HIPAA training, news, regulatory updates, and independent compliance advice.

CMS Restricting Access to Healthcare Datasets Will Cause Long Term Damage to Public Health

Earlier this month, the HHS’ Centers for Medicare and Medicaid Services (CMS) announced two significant changes to how it handles data access requests. Physical copies of data will no longer be available, and data can only be accessed via the CMS’s own virtual data center. Those wishing to access these data will also face higher fees for the privilege.

Given that these decisions come in the wake of recent data breaches at CMS contractors, the changes may appear sensible. In 2022, up to 254,000 Medicare beneficiaries were put at risk when a subcontractor, Healthcare Management Solutions, LLC (HMS), was the victim of a ransomware attack. In 2023, another contractor was the target of a cyberattack, and more than 600,000 Medicare beneficiaries had their data exposed after a vulnerability was exploited in Progress Software’s MOVEit Transfer solution.

By hosting data in its own virtual environment, the CMS gains greater control over who can access the data and can prevent further unauthorized distribution. Yet the changes embody an issue that has only grown in magnitude over time: how do we balance the risks of health data breaches with the freedom of access needed to generate advances in public health?

Healthcare Data in the United States

Healthcare provision generates near-unparalleled volumes of data. Every interaction between a patient and their provider is documented and fed into a larger management system, culminating in millions of individual records covering hundreds of millions of test results, diagnoses, and outcomes. The quality of the information being collected has also improved. Once, we relied on medical or diagnostic proxies for conditions, but technological advances have cut out the medical middleman and can lead us more directly to the source of ill health.

Yet America’s federalized and privatized healthcare system gives rise to a fractured and complex health information landscape, with relatively few datasets spanning multiple care networks, let alone states. This greatly hinders public health research, which can only exist as a field if there is data from which to draw meaningful conclusions. Without sufficient data, researchers are unable to accurately assess trends, identify risk factors, develop effective interventions, or evaluate the impact of public health initiatives. Thus, the absence of data not only impedes progress in understanding and addressing health challenges but also undermines efforts to improve overall population health and well-being.

One dataset, the “Veterans Affairs” dataset, is used widely in epidemiological and medical research. Yet its use produces imperfect results. Women represent less than 20% of the patients, a significant deviation from what would be expected in the general population, which limits the generalizability of study results. Imbalances also exist in the data for a range of other social and demographic factors known to impact health, and even in the prevalence of certain conditions: more veterans are amputees, for example.

Therein lies the value of the data held by the CMS. The CMS is the United States federal agency that administers all public health insurance, which covers approximately one-third of the entire US population. This more representative sample has been used by researchers in several high-profile projects, such as supporting the development of the Affordable Care Act and documenting racial disparities experienced by Medicare recipients, as well as in innumerable academic projects. That this research has had a positive impact on public health is undeniable, but perhaps one of its biggest benefits cannot be measured directly in academic papers. Scores of researchers were first exposed to the complexities and intricacies of public health research via the CMS dataset. As 375 academic researchers pointed out in a letter to Chiquita Brooks-LaSure, Administrator at the CMS, the higher fees “will disproportionately dissuade research by and training of scholars from disadvantaged backgrounds and institutions.” Losing such voices will only serve to entrench an already inequitable academic landscape.

The CMS dataset is a unique resource on the state of Americans’ health. No other dataset can come close to its complexity, and restricting its access will have immeasurable consequences for the future of public health research and leadership.

Concerns Over Data Privacy

None of this is to say that the CMS’s decisions were completely unfounded. With such vast data reserves come great privacy concerns. In the United States, the use and disclosure of medical data are governed by the Health Insurance Portability and Accountability Act of 1996 (HIPAA). The Act has a broad remit, but one of its central focuses is ensuring that a patient’s protected health information remains private, with strict guidelines on the exact nature of the data that can be distributed, as well as its permitted use. HIPAA violations arising from both known and unknown mismanagement incur strict penalties, strongly incentivizing compliance. The Office for Civil Rights, which oversees HIPAA enforcement, has collected $129 million in fines for non-compliance to date. Though the average size of these fines is $11,000, they vary widely: the largest penalty ever paid for HIPAA non-compliance totaled $16 million, paid by Anthem Inc. in 2018. Though a breach would have to be very large to attract a penalty of that size, even a fine of several thousand dollars would be enough to deter researchers, whose resources are overseen by universities and funders alike.

Under HIPAA, any researchers in receipt of data from the CMS are responsible for its protection and must sign an agreement with the exact conditions of its secure management. Though the data handed over has been stripped of any personally identifiable information, it still contains sensitive information on diagnoses, progn

Author: Dr. Rachel Murray-Watson is a Research Fellow at the School of Public Health in the Faculty of Medicine at Imperial College London, researching the projected impact of climate change on public health. She was previously a public health researcher at the Yale School of Public Health, where her research focused on using large datasets to build machine learning models that predict antibiotic-resistant infections and to build clinical decision tools for healthcare practitioners. She completed her PhD at the Theoretical and Computation Epidemiology group at the University of Cambridge. She also holds a master's degree in Demography and Health from the London School of Hygiene and Tropical Medicine and an undergraduate degree from Imperial College London. You can contact Dr. Murray-Watson via LinkedIn.
