Image credit: Flickr/purpleslog
Since it passed into law in 1996, the Health Insurance Portability and Accountability Act (or HIPAA) has been one of the most talked-about, least-enforced federal regulations around. Indeed, since the law went into force in 2003, the Department of Health and Human Services (HHS) has received 76,464 HIPAA complaints. But more than half - 42,366 - were deemed ineligible for enforcement and dismissed, according to HHS data. Another ten percent (8,847) were investigated and dismissed because no violation occurred. Fewer than one in four - 23 percent (or 18,328) - were investigated and resulted in an enforcement action against a covered entity. As for enforcement actions resulting in monetary penalties? Just 11 for the 16-year-old law: about 0.01 percent of all complaints sent to HHS led to some kind of fine against a hospital, insurer or other covered entity.
Furthermore, many of the cases that have resulted in investigations and civil penalties involved the kind of flagrant violations that are easy to spot. For example, two of the 11 cases involving civil penalties concerned health care staff accessing medical records of celebrities. One case involved a hospital employee leaving 192 records containing patient protected health information on a Boston subway car.
What's the reason for HIPAA's lopsided enforcement record? Dr. Anupam Datta, an Assistant Research Professor at Carnegie Mellon University's CyLab and Department of Electrical & Computer Engineering, argues that laws like HIPAA and the financial services industry's Gramm-Leach-Bliley Act are so complex that -- in all but the most flagrant cases -- it is difficult for either regulators or covered entities to know for sure whether violations of the laws have occurred. As an example, Datta notes that HIPAA contains about 85 clauses that explicitly list conditions under which personal health information (PHI) can be transmitted by a healthcare organization to external entities given patient consent - or even without it. But getting hospital staff to audit compliance with each of those 85 conditions - some of which take precedence over others - is a monumental task.
What's needed, Datta argues, is a way to automate compliance with these complex data privacy laws by harnessing IT to perform audits. To accomplish this, he and a team of researchers have developed a new language to formalize compliance rules, as well as algorithms that can determine whether covered organizations have disclosed personal information about their customers to third parties in compliance with, or in violation of, privacy regulations.
Datta spoke by phone with ITworld about his work, which he sees as a way to address the immense privacy implications of technologies like cloud computing and social networking.
ITworld: Tell us a little about the work you're doing at CyLab.
Anupam Datta: We have been thinking over the last four or five years about this problem where if you think about the way the world is going, we are giving away all our personal information to lots and lots of companies, healthcare organizations, financial institutions, companies like Google and Facebook and so forth.
In certain sectors we have seen privacy regulations emerge. For healthcare there is the HIPAA privacy rule. For financial institutions there is the Gramm-Leach-Bliley Act. Companies like Google and Facebook - the web services companies - are not regulated, but if they do make some promises in their privacy policies, then it's expected that they will respect those promises in how they subsequently use and share personal information. If they don't, then there have been several instances where the FTC has taken them to court, and there have been a number of recent incidents of that form.
So the question was: There are all these privacy laws and corporate privacy policies that these companies are promising to respect. How can we develop computer-based systems and tools that could potentially help these companies ensure that they are being compliant with the regulations and with their internal privacy policies? We looked at the HIPAA privacy rule and the Gramm-Leach-Bliley Act in complete detail, and we put these laws into a computer language so that we could develop algorithms that could take the computer-based representation of these laws as input, along with the audit log in a healthcare organization, say, and check if the actions that were done that involved accessing personal health information are in fact compliant with the law.
ITworld: You talk about formalizing these laws - by which you mean expressing them using a formal logic along the lines of what most computer coding or scripting languages follow, is that right?
Anupam Datta: Yes. It's a computer language in which we represent the laws. It's a little bit different from scripting languages. It's a more declarative specification. These particular languages are more -- they're a specification of what is permitted and what is not to be done, rather than an explicit description of exactly how to do it.
Let me give you one example. So, a good example from HIPAA would be that 'in general it's okay to share personal health information about a patient with the patient herself.' That's true, except for psychotherapy notes. So, 'for psychotherapy notes, in addition, we need the permission of the psychotherapist before the disclosure can be made to the patient.' So that's the policy.
Now we will put that into a computer language, and then when we check for compliance, suppose that the disclosure of psychotherapy notes was made from the hospital to a patient. Then we have to check if, prior to that disclosure, there is an authorization that was issued by the psychotherapist permitting that disclosure.
The exact form in which the interaction happened is not important. What's important is that before the disclosure from the hospital to the patient, the psychotherapist explicitly gave permission. For all other kinds of personal health information, sharing them with the patient is perfectly fine, but this special case has to be dealt with by checking that this additional authorization was obtained prior to disclosure. And you could check these kinds of things if the algorithm had access to audit logs that are recording things like patient consent, authorization from the psychotherapist for disclosure, things of that nature.
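The psychotherapy-notes rule described above can be sketched as a check over an audit log. This is only an illustrative sketch in Python, not the logic-based language or the algorithm the CyLab team built. The `Event` record, its field names, and the function `disclosure_to_patient_ok` are all hypothetical, and the log is assumed to already record authorizations alongside disclosures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One audit-log entry (hypothetical schema, for illustration only)."""
    time: int        # logical timestamp; a larger value means later
    kind: str        # "disclosure" or "authorization"
    actor: str       # role of whoever acted, e.g. "hospital", "psychotherapist"
    patient: str     # whose health information is involved
    recipient: str   # who receives the information
    info_type: str   # e.g. "lab_result" or "psychotherapy_notes"


def disclosure_to_patient_ok(disclosure, log):
    """Check one disclosure to the patient against the example rule."""
    if disclosure.info_type != "psychotherapy_notes":
        # General case: sharing a patient's information with the patient is fine.
        return True
    # Special case: a psychotherapist must have authorized it beforehand.
    return any(
        e.kind == "authorization"
        and e.actor == "psychotherapist"
        and e.patient == disclosure.patient
        and e.info_type == "psychotherapy_notes"
        and e.time < disclosure.time
        for e in log
    )


log = [
    Event(1, "disclosure", "hospital", "alice", "alice", "lab_result"),
    Event(2, "disclosure", "hospital", "alice", "alice", "psychotherapy_notes"),
    Event(3, "authorization", "psychotherapist", "bob", "bob", "psychotherapy_notes"),
    Event(4, "disclosure", "hospital", "bob", "bob", "psychotherapy_notes"),
]

for event in log:
    if event.kind == "disclosure":
        ok = disclosure_to_patient_ok(event, log)
        print(event.time, event.patient, event.info_type,
              "compliant" if ok else "VIOLATION")
```

In this toy log, only the second entry is flagged: the notes went to the patient with no earlier authorization on record. The rule itself says nothing about how the authorization was obtained, only that it must precede the disclosure.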
ITworld: Is this language akin to like an XML, a descriptive language where you're talking about just really tagging what is in essence textual data or policies just with particular types of reference or is it a unique language that you guys have developed just for this purpose?
Anupam Datta: At a technical level, it's a fragment of first order logic. This is the kind of language that's used quite a bit nowadays in the hardware industry. So when people build hardware systems they want to check that their hardware systems satisfy certain kinds of specifications about what the hardware should be doing. Then they use these kinds of logical descriptions for specifying what the hardware should be doing. So that's one area where these particular languages have been used.
Another area in which similar languages are increasingly being used is in checking software for compliance. So in companies like Microsoft there is a lot of work on specifying properties of software using various fragments of first order logic, and then on algorithms for checking that the programs behave as expected.
So this is very similar to that kind of work where now, instead of checking hardware or software, we are trying to check that the activities inside an organization, which involve people doing various types of actions over personal information, are compliant with these kinds of policies.
ITworld: Obviously, health care organizations already have applications that help them manage patient records, and keep track of treatment and compliance-related issues. How would the technology you're talking about differ from what already exists?
Anupam Datta: The way we envision this being used in practice is that it'll be used by auditors. Because of HIPAA and HITECH, hospitals have to maintain audit logs of who has accessed what information and also what information has been shared with external entities.
Now in many hospitals, they're beginning to use audit tools like FairWarning or P-to-P Sentinel ... Hospitals have these big audit logs where they record which employee has accessed what information at what time and what updates they have made to the medical record and things like that. These tools are largely used by some designated people inside the hospitals to ensure that the information is not being accessed inappropriately.
So at a particular hospital there might have been a celebrity being treated and there was a lot of access to the celebrity's health record. Then that will get flagged by the tools and then the auditor in the hospital will go and talk to the person or will tell the manager of the employee to go have a conversation with them about why they accessed this kind of information. But not much is being done to check whether the external disclosures from the hospital to external entities are in compliance with policies like HIPAA.
ITworld: Why is that?
Anupam Datta: It's challenging for two reasons. One is that the policies are very complex and existing languages do not have the expressivity to represent these policies formally. You asked about XML ... There is something called XACML and that's an industry standard for specifying access control policies. That would be the nearest thing to our language. But XACML doesn't have support for a bunch of concepts that arise in the policies. For example, temporal concepts. Policies might say 'if, in the past, the patient gave consent, then it is okay to share information.' Or policies might say things like 'if a data breach happens in an organization, then within the next 30 days patients should be notified about that.'
These are temporal constraints, which impose conditions on things that should have happened in the past for a disclosure to be permitted, or impose complex future obligations requiring the organization to do something. These kinds of temporal concepts cannot be expressed in a meaningful way in existing access control languages like XACML.
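The two kinds of temporal constraint described above, a condition on the past and an obligation on the future, can be sketched as follows. This is a hedged illustration in Python, not the researchers' logic or algorithm. The dictionary-based log entries, the `breach_id` field, and both function names are assumptions made for the example; the 30-day window comes from the example quoted above.

```python
from datetime import datetime, timedelta


def consent_in_past(log, patient, disclosure_time):
    """Past condition: did the patient consent before the disclosure?"""
    return any(
        e["kind"] == "consent"
        and e["patient"] == patient
        and e["time"] < disclosure_time
        for e in log
    )


def breach_obligation(log, breach_id, breach_time, now, window=timedelta(days=30)):
    """Future obligation: notify patients within `window` of a breach."""
    deadline = breach_time + window
    notified = any(
        e["kind"] == "notification"
        and e.get("breach_id") == breach_id
        and breach_time <= e["time"] <= deadline
        for e in log
    )
    if notified:
        return "satisfied"
    # Before the deadline the obligation is still open, not yet violated.
    return "pending" if now <= deadline else "violated"


log = [
    {"kind": "consent", "patient": "alice", "time": datetime(2012, 1, 5)},
    {"kind": "notification", "breach_id": "b1", "time": datetime(2012, 3, 20)},
]

print(consent_in_past(log, "alice", datetime(2012, 2, 1)))
print(breach_obligation(log, "b1", datetime(2012, 3, 1), datetime(2012, 4, 15)))
print(breach_obligation(log, "b2", datetime(2012, 3, 1), datetime(2012, 3, 10)))
print(breach_obligation(log, "b2", datetime(2012, 3, 1), datetime(2012, 4, 15)))
```

The "pending" result reflects a point specific to future obligations: until the deadline passes, an audit can only say the obligation is still open, not that it has been met or broken.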
ITworld: The challenge here is that at the end of the day these are representations of laws that are made by human beings and those laws and policies change all the time and obviously just within the last year we've seen the passage of the Affordable Care Act, which will radically shift the healthcare landscape including the policies and regulations by which all these healthcare entities will operate.
So presumably that would, at the end of the day, filter down to changes within the language itself to adjust to that. Are you trying to in some ways describe a moving target?
Anupam Datta: That's actually a very good question. That's part of the reason. The way I view the question [is that] there is the language and then there is the particular law or policy that we represent in the language. Now we have designed the language to be very, very general so that even when the laws change, the new law could still be represented in the language. And since the audit algorithm works for all policies in the language, the audit technology does not have to change.
The goal here was to develop the audit technology in a very general way for a very general language so that even if individual laws change, the new law will simply be represented in the language and, therefore, the technology will still be able to do what it was designed to do.
So, yes, if HIPAA changes or some other state law changes, then we will have to do additional work to represent the new law in the language, but we don't have to change the language itself and therefore, we don't have to change the audit algorithm and the supporting audit technology.
ITworld: Are you looking at commercial applications for this technology?
Anupam Datta: One domain on which we have focused over the last few years is the healthcare privacy domain. As I mentioned earlier, healthcare organizations are using various forms of audit technology already, but much of that audit technology is focused on detecting inappropriate access inside the hospital.
We don't know of commercial tools yet that can help determine violations of disclosure, violations of healthcare laws when information is released by the hospital to external third parties. That's where this kind of technology can be used. This kind of audit technology could help hospitals ensure that the disclosures they're making to third parties are in fact compliant with the law. So that's where I see one concrete commercial application.
Another application is to flag to employees, ahead of time, places where they might potentially violate the law. So suppose that someone in a hospital is trying to decide whether they should share this protected health information with a third party, like let's say the patient themselves. They don't know the details of HIPAA. HIPAA is a very complicated law that has many different clauses, about 85 different clauses. When they're trying to do that they could potentially query our system and ask, "I'm trying to disclose this type of information to this third party. Is this permitted by HIPAA?" The algorithm will come back and say, "Yes, it's permitted by this clause in HIPAA so long as these conditions hold." Like sharing it for the purpose of treatment or something like that.
Therefore, it can actively help with training. If you look around the web, you see that much of the training for compliance with laws like HIPAA is done through slide decks that summarize the law and things like that. So that form of training is not as useful as having an active tutor, an online tutor, if you will, whom you can query to ask, "Well I'm trying to do this. Is this permitted by the law?" Having a computer program that can tell you that can be hugely helpful.
In addition to audits, this would help with preventing violations -- many, many violations eventually could happen because people don't know what the law is saying, because the law tends to be very complicated. So that's another application of this kind of technology.
There was also a recent discussion led by Health and Human Services on disclosure accounting, where the basic idea was that a patient should be able to go to the hospital and ask for a list of all the disclosures that were made by the hospital pertaining to that particular patient's information. So the question is what kind of tool could help a hospital produce this kind of disclosure list.
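The kind of question-and-answer exchange described above can be sketched as a lookup against a machine-readable policy. This is a toy illustration in Python only. The `POLICY` table, the clause labels, and the conditions are made up for the example and are not actual HIPAA clause numbers or text; a real system would also have to handle exceptions and clauses that take precedence over others.

```python
# Hypothetical policy table. The clause labels and conditions are invented
# for illustration and are NOT actual HIPAA clause numbers or text.
POLICY = [
    {
        "clause": "example-clause-1",
        "info_type": "any",
        "recipient": "patient",
        "conditions": [],
    },
    {
        "clause": "example-clause-2",
        "info_type": "any",
        "recipient": "treating_provider",
        "conditions": ["the purpose of the disclosure is treatment"],
    },
]


def query(info_type, recipient):
    """Return (clause, conditions) pairs that could permit the disclosure."""
    return [
        (rule["clause"], rule["conditions"])
        for rule in POLICY
        if rule["recipient"] == recipient
        and rule["info_type"] in ("any", info_type)
    ]


def explain(info_type, recipient):
    """Phrase the first matching clause as an answer for an employee."""
    matches = query(info_type, recipient)
    if not matches:
        return "No clause in this toy policy permits the disclosure."
    clause, conditions = matches[0]
    if conditions:
        return f"Permitted by {clause} so long as: " + "; ".join(conditions) + "."
    return f"Permitted by {clause}."


print(explain("lab_result", "treating_provider"))
print(explain("lab_result", "marketing_firm"))
```

The answer names the permitting clause and the conditions attached to it, which is the part a slide deck cannot supply for a specific case.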
Again, the underlying technology that we have could help with disclosure accounting in a much more meaningful way. Not only can the hospital produce a list of all the disclosures that were made pertaining to a specific patient's health information, but it can also say something about which clause of HIPAA actually permitted that disclosure or under what assumption this disclosure was permitted under the law.
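A disclosure accounting of this kind can be sketched as a per-patient report over the audit log. The Python below is an assumption-laden illustration: the log format and the `permitted_by` field (imagined here as an annotation left by an earlier audit pass) are hypothetical, and the clause labels are placeholders, not real HIPAA citations.

```python
def disclosure_accounting(log, patient):
    """List a patient's disclosures with the clause recorded as permitting each."""
    return [
        (e["time"], e["recipient"], e["info_type"], e.get("permitted_by", "unknown"))
        for e in sorted(log, key=lambda e: e["time"])
        if e["kind"] == "disclosure" and e["patient"] == patient
    ]


# Toy log; "permitted_by" values are placeholder labels, not HIPAA clauses.
log = [
    {"time": 3, "kind": "disclosure", "patient": "alice",
     "recipient": "insurer", "info_type": "billing",
     "permitted_by": "example-clause-3"},
    {"time": 1, "kind": "disclosure", "patient": "alice",
     "recipient": "specialist", "info_type": "lab_result",
     "permitted_by": "example-clause-2"},
    {"time": 2, "kind": "disclosure", "patient": "bob",
     "recipient": "insurer", "info_type": "billing"},
]

for row in disclosure_accounting(log, "alice"):
    print(row)
```

A disclosure with no recorded justification shows up as "unknown", which is itself useful to an auditor.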
So I would say that there are three potential applications of this technology. One is the audit of information disclosures to third parties by hospitals. Two is online tutoring, online training for employees in helping them understand what disclosures are permitted by the law and what are not. Three is providing disclosure accounting to patients.
ITworld: How do you get a technology like this used universally as it would need to be, to really have a big impact?
Anupam Datta: Very good question. Number one, the first impediment that'd have to be overcome is making the tools more usable and more accessible to administrators inside hospitals. Right now I think the language - being based on first order logic - is very general. However, having a front end for it that might be similar to languages like XACML, which are accepted as industry standards, would be one key step that we will have to take.
Number two is the question of interoperability that you raised, which is also huge and very, very important. Let's say the hospital had two systems, one for inpatient and one for outpatient electronic health records. Then the corresponding audit logs might be in different formats. So they will have to somehow convert those two different formats of the audit log into some common format, which the audit tool can then use as input. So that's another challenge.
I agree. It's a big impediment, but it could have been much worse had it not been an audit technology. So those are two big challenges, but for both of those challenges there is hope, and there is some evidence from prior work that there is a clear path forward.
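The log-format problem raised in this answer can be sketched as a normalization step that maps each system's records into one common shape before any audit runs. Everything specific in the Python below is an assumption: the two sample formats (a CSV export and a JSON export), their field names, and the common schema are invented for illustration and do not describe any real hospital system.

```python
import csv
import io
import json

# Hypothetical exports from two record systems with different layouts.
INPATIENT_CSV = (
    "ts,user,patient_id,action\n"
    "2012-03-01T09:00:00,nurse7,p42,view\n"
)
OUTPATIENT_JSON = (
    '[{"when": "2012-03-01T08:30:00", "who": "dr3", '
    '"record": "p42", "op": "update"}]'
)


def from_inpatient(text):
    """Map inpatient CSV rows onto the common schema."""
    return [
        {"time": r["ts"], "employee": r["user"],
         "patient": r["patient_id"], "action": r["action"]}
        for r in csv.DictReader(io.StringIO(text))
    ]


def from_outpatient(text):
    """Map outpatient JSON entries onto the common schema."""
    return [
        {"time": e["when"], "employee": e["who"],
         "patient": e["record"], "action": e["op"]}
        for e in json.loads(text)
    ]


# ISO 8601 timestamps in the same format sort correctly as plain strings.
merged = sorted(
    from_inpatient(INPATIENT_CSV) + from_outpatient(OUTPATIENT_JSON),
    key=lambda e: e["time"],
)

for entry in merged:
    print(entry)
```

Because the audit only reads the logs after the fact, the conversion can happen offline in a step like this, without changing how either record system operates.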
ITworld: Thank you very much.