AI Company Exposed 2.5 Million Patient Records Over the Internet
The personal and health information of more than 2.5 million patients has been exposed online, according to technology and security consultant Jeremiah Fowler.
The records were discovered on July 7, 2020 in two folders that were publicly accessible over the Internet and required no passwords to access data. The folders were labeled as “staging data” and had been hosted by an artificial intelligence company called Cense AI, a company that provides SaaS-based intelligent process automation management solutions. The folders were hosted on the same IP address as the Cense website and could be accessed by removing the port from the IP address, which could be done by anyone with an Internet connection. The data could have been viewed, altered, or downloaded during the time it was accessible.
An analysis of the data suggests it was collected from insurance companies and relate to individuals who had been involved in automobile accidents and had been referred for treatment for neck and spinal injuries. The data was quite detailed and included patient names, addresses, dates of birth, policy numbers, claim numbers, diagnosis notes, payment records, date of accident, and other information. The majority of individuals in the data set appeared to come from New York. In total, there were 2,594,261 records exposed across the two folders.
Fowler identified extremely uncommon names and performed a Google search to verify those individuals were real, checking the name, region and demographic data. Fowler was satisfied that this was a real data set and not dummy data. Fowler made contact with Cense via email and while no response was received, the data was no longer accessible on July 8, 2020.
Get The Checklist
Free and Immediate Download
of HIPAA Compliance Checklist
Delivered via email so verify your email address is correct.
Your Privacy Respected
Fowler suspects that the data had been temporarily loaded into a storage repository prior to being loaded into Cense’s management or AI system. There was no way of determining how long the data had been exposed.
Currently, there is no breach notice on the Cense website and the incident has not appeared on the HHS’ Office for Civil Rights website. Fowler said he only accessed a limited amount of data for verification purposes and did not download any patient information; however, during the time the folders were exposed, it is possible that other individuals may have found and downloaded the data.
Data leaks such as this are all too common. Misconfigurations of cloud resources such as S3 buckets and Elasticsearch instances frequently leave sensitive data exposed. Cybercriminals are constantly searching for exposed data and it does not take long for data to be found. Once study conducted by Comparitech showed that it takes just a few hours for exposed Elasticsearch instances to be found.
Cloud services offer many advantages over on-premises solutions, but it is essential for protections to be put in place to secure any cloud data and for policies and procedures to be implemented to allow misconfigurations to be rapidly identified and corrected.