OCR Director Leon Rodriguez said the role of his agency is to take more of a "macro" look at how breaches occur and what kind of risks and vulnerabilities led to them, rather than crunch and analyze large amounts of data.
Who has the responsibility?
Big Data analytics, Rodriguez said, is the responsibility of medical providers and/or their business associates who store and handle Protected Health Information (PHI). They are required to use certain safeguards to protect that information, and also to report breaches of 500 or more records to HHS and the media.
In the past, Rodriguez said, the main sources of information about violations were patients. "But they only have a pinhole view of what's going on. What's changed is that we are now getting large-scale breach reports involving millions of records. We were never in that environment before. But it is good, because it comes at a time when more and more health data is being stored electronically and aggregated," he said.
Rodriguez said his agency needs the technical capability to understand what health providers and data custodians are doing, but, "we're really looking at your business process rather than what was in that data that was breached."
Still, even if some of the initial hype was overdone, Big Data has ever-expanding value.
What was considered Big two years ago would now be considered Medium, and in a few more years will be considered relatively insignificant. IBM notes that every day, "we create 2.5 quintillion bytes of data -- so much that 90 percent of the data in the world today has been created in the last two years alone."
Todd Marlin, writing on Ernst & Young's Forensic Brief blog, observed that, "Today, an hour's worth of business for a typical big-box retail chain can create millions of transactional records. The entirety of data from the private sector doubles every 14 months.
"Consider that when your organization leaves the league of petabytes in storage and moves to exabytes (that's about one thousand petabytes), you are then working at an organization that stores more data than the entirety of human civilization until about 20 years ago," he wrote.
Data where you didn't see it coming
Nor is it just a lot more of the same data that has been collected for generations. Much of it comes from sources that did not exist even a decade ago: sensors in everything from smart cars to smart appliances, TVs and weather stations; utility smart meters; health care biosensors that can monitor everything from heart rate to the effect of medications on the body; HVAC monitors; traffic sensors; ATM transactions; posts to social media sites; geotagged digital pictures and videos; purchase transaction records; cell phone GPS signals; clickstream data; log files and more.