In-Depth: Consumer health and data privacy issues beyond HIPAA

By Brian Dolan

"The issue of consumer generated health data is on that is near and dear to my heart," Federal Trade Commission Commissioner Julie Brill told attendees at an event focused on the protection of such health data earlier this month. "...Big picture, consumer generated health information is proliferating, not just on the web but also through connected devices and the internet of things." As Brill noted this kind of health data are "health data flows that are occurring outside of HIPAA and outside of any medical context, and therefore outside of any regulatory regime that focuses specifically on health information." That's why it falls to the FTC to oversee privacy-related concerns for consumer generated health data. 

"I was at the Consumer Electronics Show in January and was really wowed by much that I saw," Brill said. "Some of the devices that I saw were particularly focused on health and the quantified life. One in particular that struck me was [Rest Device's] Mimo, a onesie developed to measure the heart beats, respiration rate, and other vital signs of an infant or newborn. It could send information to an app, to the parent's mobile device. Think about the benefits to any parent who is worried about SIDS or might want to get their baby to sleep better or get themselves to sleep better."

Brill also noted the rise of step counting devices, the trend of some physicians Googling their patients, and what she called an ongoing ethical debate about physicians "friending" their patients on Facebook. "There is also the now infamous example of companies that are generating their own health data about their customers with respect to their purchases, like Target did with its pregnancy predictor score," she recalled.

Nicholas Terry, Professor of Law & Co-Director of the Hall Center for Law and Health at Indiana University Robert H. McKinney School of Law, has written about patients curating their own health data.

"At root such patient curation of health data bespeaks autonomy and is symbolic of patient ownership of the data," Terry writes. "However, it fails to take into account one practical limitation—the canonical version of the record will remain in the provider’s control–and one legal limitation—that only the provider-‐curated copy is protected by HIPAA-HITECH. In contrast, the patient‐curated 'copy' attracts little meaningful privacy protection. Well‐meaning privacy advocates should think carefully before promoting this autonomy‐friendly 'control' model until data protection laws (not to mention patient education as to good data practices) catch up with patient curated data."

Brill said that legislation like HIPAA and HITECH shows that the US believes health data is sensitive and deserving of special protection, but "then the question becomes, though, if we do have a law that protects health information but only in certain contexts, and then the same type of information or something very close to it is flowing outside of those silos that were created a long time ago, what does that mean? Are we comfortable with it? And should we be breaking down the legal silos to better protect that same health information when it is generated elsewhere?"

During the FTC event, the agency seemed to take pains not to point to specific examples of wrongdoing, but the commissioner and other participants in the forum did raise relevant examples of products for the sake of discussion.

"We recently read about one insurance company, Aetna, that has developed an app for its beneficiaries to use," Brill said during her opening remarks. "It will allow their users to set goals and track their progress on all sorts of health indicators: weight, exercise, things like that. I think it’s wonderful that Aetna set this up, it’s great, but I don’t know precisely what they are doing with this information. We’ve looked at the terms of service. It just raises interesting questions: To what extent could this information be used for rating purposes? We all know under the FCRA It ought not to be, but what are the rules of the road here?" CarePass City To City Tablet

Brill also pointed to a company called Blue Chip Marketing, which she said mines social media feeds and other databases to help recruit patients to clinical trials: It doesn't "work with doctors or hospitals, instead it surfed social media, searched cable TV subscriptions, and got lots of information that allowed it to infer whether consumers were obese, potentially had diabetes, potentially had other conditions, and then offer to them to join a clinical trial. Some consumers would think that’s great – yes, I’d like to be part of a clinical trial, but others were really shocked when they were contacted by the company or others working with them. They asked: what makes you think I’m obese? Or how did you know I was a diabetic? [This raises] really interesting issues."

In Terry's commentary to the FTC, he points to a recent paper by McKinsey called The ‘Big Data’ Revolution in Healthcare, which identified four primary data pools that were driving big data in healthcare: They are activity/claims and cost data, clinical data, pharmaceutical R&D data, and patient behavior and sentiment data. Terry also described three types of health data that big data companies use to create proxies for HIPAA-protected data: "'laundered' HIPAA data, patient-curated information and medically inflected data."

Terry also rightly recognizes the rise of quantified self tools as an important consideration in the discussion. "A similarly dichotomous result is likely as the medically quantified self develops," Terry writes. "The quantified-self movement concentrates on personal collection and curation of inputs and performance. Obviously, health, wellness and medically inflected data will likely comprise a large proportion of such data. A similar, if less formal, scenario is emerging around health and wellness apps on smartphones and connected domestic appliances such as scales and blood pressure cuffs. Smartphones are crammed with sensors for location, orientation, sound and pictures that add richness to data collection. And there is ongoing and explosive growth in the medical apps space that seeks to leverage such sensors... These processes will in most cases lead to medically inflected data that exists outside of the HIPAA-HITECH protected zone." 

During a panel discussion that followed FTC Commissioner Brill's talk, ONC's Chief Privacy Officer Joy Pritts explained some basic misconceptions "laypeople" have about HIPAA.

"Many people... laypeople think HIPAA covers all health information," Pritts said. "They are familiar signing the notice in their doctor’s office and are also familiar with getting notices from people who aren’t covered by HIPAA saying ‘we follow HIPAA’. HIPAA actually is pretty sector specific. In this country we regulate information based on who holds the information or who generates the information. In this case HIPAA generally applies to health plans, most healthcare providers, and these things called health clearing houses, which are focused on claims data. One of the interesting things about HIPAA that a lot of people don’t realize, is that it really generated from a movement to standardized claims data. It wasn’t really about privacy originally, privacy was included as a protection, but the focus was on simplifying the administration of health claims and how they were processed. Once you know that, a lot of what happens under HIPAA makes a whole lot more sense."

It's no secret that the US federal government is actively encouraging patients to move their sensitive health data out of the HIPAA-protected zone. As healthcare becomes more patient-centric, it's crucial for patients to have access to their own health records. The HITECH legislation clarified that patients are entitled not just to a copy of their records but to an electronic record where available.

"We are encouraging people to move their information potentially out of the HIPAA-covered bubble and out into the hands of others who may not be subjected to HIPAA," Pritts explained. "There are circumstances when you have a personal health record that is offered on behalf of a health plan or provider, because they are so tied to a plan or provider, that information would remain protected within HIPAA. If you transfer information to your own chosen personal health record website, then it wouldn’t be protected. So, it is a little complicated."

While it didn't start out as a part of a medical record, consumer-generated health information can still be sensitive information to some patients in some cases.

During the FTC event Joseph Lorenzo Hall, Chief Technologist at the Center for Democracy & Technology, mentioned some findings from a colleague's recent study about Fitbit users. The study focused on the top concerns Fitbit users had about the privacy of their Fitbit data.

"To a certain extent this is pretty benign information: How many steps you’ve walked and actual motions translated into how far you walked," Hall said. "The things that people cared about, the top three were embarrassment, physical safety, and then implications for employment and insurability. In terms of embarrassment, Fitbit has a great case study itself where they were actually sharing individual sexual activity publicly online without them knowing. You don’t typically wear your Fitbit for that activity but you can self report that activity, and if you are sharing everything, you are sharing that as well. That was very embarrassing for some users, and Fitbit recognized, to their credit, that some categories of physical exertion may be a little more sensitive than others. Physical safety is another thing. If you talk about routes – running routes – you can begin to predict when someone might be alone or when they might not be at home. Finally, when it comes to employment and insurability, there are concerns about insurance rating. [However,] these kinds of consumer generated health data are increasingly being used in wellness programs to reward people and encourage them to be healthy for the bottomline, if not just for your health insurance premiums, but other things as well in terms of making it a better working environment."

Pritts said that many consumers appear willing to trade their personal information and perhaps some health data for free access to health apps and services, but she worries many don't realize what some third party companies might end up doing with that data.

"One of the areas that I think is kind of interesting is that people might say, 'I’m willing to give you my information to get this product for free'," Pritts said. "They might not realize what some people or some organizations do with the information after they receive it. There is a certain amount of a lack of transparency around what happens with the information after it is connected by the first third party. Many of those third parties go ahead and resell the data to other entities... There are data aggregators who are in this business of collecting information not about your health, or organizations covered by HIPAA, but these outside players where people probably don’t have a good idea of what is happening with their information." Fitbit

Hall pointed to a handful of codes of conduct that industry groups and government entities have drafted for app developers, which include some voluntary best practices for health apps, too. Hall was part of the coalition that wrote the NTIA's voluntary code of conduct to promote transparency in mobile apps after prompting from the Department of Commerce. That document says mobile app developers should disclose any collection of biometrics, health, medical or therapy information, Hall said. Some of the leading digital advertising industry groups have also created voluntary guidelines for ad-supported apps that are looking to use sensitive health information for behavioral advertising.

"One of the double-edged swords of this is that people can do some really irresponsible stuff with their data now, but that’s part of this sort of national negotiation process we are having with increased custodies on the patient’s side of being able to use and do things with this data," Hall said. "...There is a larger trend of everyone needs to bone up on their digital hygiene and understand … these tools. There are a whole slew of [new practices] that as a society we are going to have to learn to incorporate into the fabric of how we do things."

ONC's Pritts noted that with the rise of social media and prolific sharing, it is often said that, from a consumer perspective anyway, privacy is dead and no one cares about it anymore. Recent studies, she said, have questioned that notion.

"People who have had something happen to them or know somebody that has... have a renewed respect for their own privacy and how their information may be used," Pritts said. "There is a segment of people who care a lot of about their privacy and there are people who would share anything with anybody. Again, sometimes those perspectives change when they learn what the consequences of that sharing may be. People flow into and out of [caring about privacy]. It is a very dynamic issue." 

Few consumer-generated health data privacy incidents so far

By Aditi Pai

Outside of people stealing digital devices with sensitive information, there aren't too many instances in which patient data was compromised or breached. The few times that a company has accidentally leaked information, or that information was found through other means, have made it clear that patient data, though deidentified, could still be traced back to the consumers who share this information with third parties.

In July of 2013, Privacy Rights Clearinghouse released a study funded by the California Consumer Protection Foundation addressing the privacy risks of mobile health and fitness apps. The study analyzed privacy policies from December 2012 to June 2013 for 43 apps from the top 200 lists of Apple and Android app stores; 23 were free and 20 were paid. The health and fitness apps in the report included mood apps, diabetes management apps, prescription medication shopping apps and pregnancy apps. After studying where the data within apps went, the Clearinghouse found 39 percent of free apps and 30 percent of paid apps sent information to someone not disclosed by the developer in the app or the app’s privacy policy. Further, only 13 percent of free apps and 10 percent of paid apps encrypted all data connections between the app and developer’s website.

Then, in September 2013, NoomWeight

Most recently, the FTC also did a small study of the relationship between third parties and these health and fitness apps. For this study, the questions the FTC asked were 'who are these third parties?', 'what kind of information are these third parties receiving about our bodies?', and 'does the picture actually look different if we include wearables?' "We looked at 12 health and fitness apps on one operating system," Jared Ho, a lawyer in FTC's mobile technology unit, said in an FTC seminar this month. "We tried to take a broad range of apps that gathered a variety of metrics about our bodies. This project was meant to be a small snapshot in time."

Ho looked at two daily activity apps connected to wearables, two exercise apps, two dietary and meal apps, three symptom checker apps, one pregnancy app, one diabetes app and one smoking cessation app. All apps were free. If an app asked him for permission to access a certain feature or sync with another app, he always opted in. Then he mapped out the data sets to see the type of information being transmitted from each app and to see where this information was going.

The 12 apps that he looked at transmitted information to 76 third parties. Data that was transmitted included device information, device specific identifiers, third party specific identifiers, for example a cookie string, and consumer data such as dietary and workout habits.
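
The kind of mapping Ho describes can be pictured as a simple tally over observed data flows. What follows is only a minimal sketch, not the FTC's actual method: the app names, hostnames and data categories are all hypothetical placeholders, and it assumes each captured transmission has already been reduced to an app, a receiving host and the kinds of data sent.

```python
from collections import defaultdict

# Hypothetical flow records: (app, receiving host, kinds of data sent).
# None of these names come from the FTC study; they are placeholders.
FLOWS = [
    ("step_tracker", "ads.example.net", {"device_id"}),
    ("step_tracker", "analytics.example.org", {"device_id", "workout_habits"}),
    ("meal_log", "ads.example.net", {"cookie_id", "dietary_habits"}),
    ("meal_log", "api.meal-log.example", {"dietary_habits"}),
]

# Hosts run by each app's own developer (assumed to be known in advance).
FIRST_PARTY = {"meal_log": {"api.meal-log.example"}}


def map_third_parties(flows, first_party):
    """Return {third-party host: {"apps": set of apps, "data": set of data kinds}}."""
    seen = defaultdict(lambda: {"apps": set(), "data": set()})
    for app, host, kinds in flows:
        if host in first_party.get(app, set()):
            continue  # traffic to the developer's own servers is not third-party
        seen[host]["apps"].add(app)
        seen[host]["data"] |= kinds
    return dict(seen)


third_parties = map_third_parties(FLOWS, FIRST_PARTY)
for host, info in sorted(third_parties.items()):
    print(host, sorted(info["apps"]), sorted(info["data"]))
```

In this toy data, one advertising host receives identifiers from both apps, which is the sort of overlap such a map is meant to surface.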

There have been few instances in which this information was exposed to more than just third parties, although in July 2011, Fitbit users realized that a simple Google search could surface health metrics that they had added to their Fitbit apps, including their sexual activity.

Fitbit CEO James Park wrote a post on the company's blog a few days after this breach was discovered. In it he explained that "all activity records on Fitbit.com were hidden from view from both other users and search engines, no matter what the user’s current privacy setting". The company also changed the default setting from public to private.

"We are dedicated to making this the best fitness platform possible with users in full control of their data," Park said in the post. "For many people, sharing information is an important motivator for them to achieve their fitness goals. We will be in touch with our users about new choices they will have when they want to share information."

Even when patient data is deidentified, though, someone might still be able to trace the data back to the patients.

US Federal Trade Commission (FTC) CTO Latanya Sweeney explored this potential risk in two experiments she conducted in which she looked to see if there were ways to re-identify de-identified health data.

In the first, she looked at discharge data, which she explained are "statewide collections of patient health information collected in almost every state, usually under state mandates. Hospitals must forward information about diagnoses, treatments and payments for each hospital visit, and in some states, physicians report this information for each office visit. The state, in turn, may share or sell versions of the data."

This data can be helpful for states to compare metrics on physician performance, outcomes, and hospitalizations, which in turn could help move legislation on safety or health issues. A majority of states, 33, sell or share de-identified data, and although this data doesn't need to comply with HIPAA regulations, only three states de-identify it in a way that would satisfy HIPAA requirements.

"So one of the interesting loops we found was this loop to financial companies, so the data goes from you to the physician, and from the physician to the discharge data, and then to a bank," Sweeney said in an FTC seminar this month. "So we looked at the literature and many years ago there was this article in the New England Journal of Medicine that described a banker who cross de-identified health data about cancer patients in an attempt to figure out if any of them had mortgages or loans at their bank and then began tweaking people's credit worthiness. Now I have no idea if that's true, but if we could show that it's possible by asking the question how de-identified is this data? Is it sufficiently de-identified?" Personal Genome Venn Diagram

Considering those questions, Sweeney began the experiment. After purchasing hospital discharge data from a state for $50, Sweeney found the data included almost all hospitalizations occurring in that state that year, patient demographics, diagnoses, procedures, attending physician details, hospital, a summary of charges, drug and alcohol use, sexually transmitted diseases, and how the bill was paid. While the names and addresses of patients were taken out of the data, Sweeney conducted a search of local newspaper stories printed in the same year for the word “hospitalized”, and the stories she found contained the patient's sex, hospital, admit month and diagnoses. That information combined with public records yielded not only the patient’s name, but also residential information and the reason for the hospitalization.

Using just these three sources, Sweeney could match medical records in the state database for 35 of the 81 sample cases (or 43 percent) found in 2011. Sweeney added that some matches included high profile cases, such as politicians, professional athletes, and successful businesspeople.

"This experiment generalizes beyond news stories," Sweeney said in a blog post. "The kind of information appearing in the newspaper articles is the same kind of information an employer may know about employees who are absent from work for medical reasons and a banker may know about debtors who give medical reasons as a basis for late payments."

Since Sweeney conducted this experiment, some state agencies, including California's Office of Statewide Health Planning and Development, have worked to better protect patient data. In a more revealing test and with less source material, Sweeney managed to link names and contact information to publicly available profiles in the Personal Genome Project, which is a long term study that aims to sequence and publicize the complete genomes and medical records of 100,000 volunteers to facilitate research into personal genomics.

Profiles of volunteers in the Personal Genome Project include DNA information, behavioral traits, medical conditions, physical characteristics, environmental factors, medications, and demographic information, such as date of birth, gender, and postal code.

Sweeney took 1,130 public profiles from the website in September 2011, and close to half of the profiles, 579 of 1,130, had date of birth, gender and 5-digit ZIP code. Sweeney then matched these profiles with public records and voter registrations, counting only unique matches, in which a profile's details pointed to exactly one person. Voter information yielded 130 unique matches, 22 percent of the 579 profiles. Public records yielded 156 unique matches, or 27 percent. Together, Sweeney found 241 unique matches, 42 percent of the 579 profiles. When Sweeney submitted the matches to the PGP, it found that 84 percent were completely accurate, and 97 percent were accurate if nicknames were considered.
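
The matching step can be illustrated with a short sketch. This is not Sweeney's code. It is a minimal example under stated assumptions: the records, names and field names (`dob`, `gender`, `zip`) are invented toy data, and a "unique match" is taken to mean that exactly one person in the outside list shares a profile's date of birth, gender and ZIP code.

```python
from collections import defaultdict

# Hypothetical toy records; no real people and no actual PGP data.
PROFILES = [
    {"profile_id": "p1", "dob": "1970-01-02", "gender": "F", "zip": "02138"},
    {"profile_id": "p2", "dob": "1985-06-30", "gender": "M", "zip": "94110"},
    {"profile_id": "p3", "dob": "1990-12-12", "gender": "F", "zip": "10001"},
]
VOTER_ROLL = [
    {"name": "Alice Example", "dob": "1970-01-02", "gender": "F", "zip": "02138"},
    {"name": "Bob Sample", "dob": "1985-06-30", "gender": "M", "zip": "94110"},
    {"name": "Bill Sample", "dob": "1985-06-30", "gender": "M", "zip": "94110"},
]

# The three demographic fields used as a linking key (an assumption of this sketch).
KEY_FIELDS = ("dob", "gender", "zip")


def unique_matches(profiles, roll):
    """Link each profile to a name only when exactly one person shares its key."""
    index = defaultdict(list)
    for person in roll:
        index[tuple(person[f] for f in KEY_FIELDS)].append(person["name"])
    matches = {}
    for profile in profiles:
        names = index.get(tuple(profile[f] for f in KEY_FIELDS), [])
        if len(names) == 1:  # skip profiles with no candidate or several
            matches[profile["profile_id"]] = names[0]
    return matches


matches = unique_matches(PROFILES, VOTER_ROLL)
match_rate = len(matches) / len(PROFILES)
print(matches, f"{match_rate:.0%}")
```

Here only the first profile is linked: the second has two candidates sharing the same key and the third has none, so neither counts as a unique match.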

"These experiments demonstrate how PGP profiles are vulnerable to re-identification. What’s the potential harm? Many participants reveal more than DNA, including seemingly sensitive conditions -- abortions, sexual abuse, illegal drug use, alcoholism, clinical depression and more," Sweeney and her team write in the report. "Perhaps more alarming are potential economic harms a participant may face. Here is an example. Suppose a hypothetical participant named Bob has a predisposition to a gene-based disease related to his genetic profile online. He applies for life insurance. If Bob is aware of the predisposition and discloses the information, he may be denied coverage or asked to pay a much higher premium. If he does not disclose knowledge of the predisposition or if he is not aware of the predisposition, the insurance company may fail to pay the claim upon his death." 

How can health app developers support patient privacy?

By Jonah Comstock

Deborah Peel, chair of the board of directors at Patient Privacy Rights, doesn't see too much of a difference between HIPAA-protected patient information and the consumer-generated health information that comes from fitness and wellness apps and devices. That's because HIPAA protection doesn't "follow the data" -- it governs covered entities and their business associates. Still, there are plenty of ways in the law for EHR vendors, pharmacies and others to sell aggregated data.

"Virtually all our health data is not only inside the healthcare system, it’s outside the healthcare system," Peel told MobiHealthNews. "Because there are so many loopholes in HIPAA and such a lack of clarity of understanding of patients’ rights, all of our data is virtually everywhere. Everyone believes — most doctors, most health professionals — that HIPAA protects privacy, but it doesn’t. Even inside the healthcare system we don’t have much control over our health information at all."

When it comes to fitness and wellness apps, the situation is very similar, she said. Most apps are making their money selling health data to advertisers, because aggregated health data is very valuable.

"Patient Privacy Rights has been working on this for 10 years, but in the meantime what’s happened is protected health information has become the most valuable information in the digital age," she said. "It’s like oil. It’s a very, very valuable commodity. It’s far more valuable than your credit card, social security number, any of that."

Peel doesn't think the data is being put toward any productive uses at all, but rather toward targeted advertising, and new healthcare treatments that are more about making money than improving patient care. "The research being done with our data is primarily corporate analytic research to improve the bottom line, and that’s not the same thing as helping sick people," she said.

"I already pay the pharmacy to fill the prescription," she said, referring to the practice of pharmacies selling their prescription records to groups like IMS Health. "I’m already paying them, they’re already making a profit off of me. Why do they get to make another hidden profit by selling my prescription records? If anyone’s going to sell their prescription data, shouldn’t it be Grandma? Then maybe she’d be able to afford her drugs."

From Peel's perspective, a lack of patient privacy is a harm in and of itself, especially because numerous polls suggest that patients overwhelmingly want to restrict access to their own data, even for research purposes. But there are other, real harms beyond that, including that a lack of trust in the healthcare system leads to worse care.

"The worst case scenario isn’t just being marketed expensive products," she said. "This is the worst case scenario. People hate to have their privacy violated. So today 1 in 8 people lie and omit what’s really going on with their doctor. The outcome of that is bad data, which actually jeopardizes their health ... If the goal is better health, we’ve got to have control of our information for people to go and be well. Privacy is essential for healing. This is what Hippocrates figured out [thousands] of years ago."

That's less of a concern in the realm of fitness and wellness apps, of course. With those, Peel sees discrimination based on private information as one of the biggest risks -- for jobs or for insurance, for example. 

Jules Polonetsky, the Executive Director of the Future of Privacy Forum, an industry-supported Washington think tank on privacy, thinks the actual harms are less than the hype would indicate. However, he does agree that, even in health and fitness apps, trust is a big problem -- even when information gets out by accident rather than being sold.

"There are risks of people being particularly embarrassed or particularly harmed by inadvertent disclosure," he said. "From the early example of Fitbit and people’s sexual activity, to much more intimate data that increasingly is going to be available. There’s obviously true harms, and then there’s a whole range of areas that may not be financially harmful but need to be treated with extreme scrutiny, so we don’t freak people out, we don’t make people feel strained or constrained. I need to trust the technology that interacts with me intimately."

Polonetsky doesn't see patient data being bought and sold as a problem in and of itself, but thinks the industry should have better standards on what information is fair game.

"During my years in the private sector I was the chief privacy officer at [online advertising company] Doubleclick for a number of years. I struggled mightily to tell the ad business 'No, we’re not going to create a clickstream protocol of a guy with erectile disfunction issues.' They said 'Why not? Is it against the law?' Well, no. 'Is there some list of things that are sensitive?' Well, hmm, let’s see. Is it things that need prescriptions? Well, that’s not a logical place to draw the line. Is it things that some people would be embarrassed about and not others? Well, who’s job is that? In my early days at Doubleclick, nobody did specific health when it came to advertising and marketing profiles. Google today still has limits on whether you can buy a profile on someone who’s been browsing a health issue, but pretty much everyone else does it, and it’s not clear where the line is."

He says the lack of clarity has created an industry where any information is fair game.

"When there isn’t clarity about exactly what’s OK and what’s not, you have a rush to the bottom," he said. "And today you can buy the most sensitive information And it’s fair to say there’s people whose hair would stand up if I said to them 'You know what? I can target you by device IDs linked to your phone and no, no one knows your name, but we’ll know that this phone belongs to someone who is looking for erectile disfunction health products.”

The other problem, he said, is that today's apps and fitness trackers share so much information with each other, through APIs, that the consumer has to sift through a huge number of privacy policies to understand what they're getting into. And if even one is misusing their data, it's all out there.

When data is sold, it's aggregated or de-identified, but Peel says a growing body of evidence supports the assertion that deidentification is a small comfort if any -- it's too easy to re-identify.

"It’s virtually impossible to keep the information private just by taking the name off," she said. "It’s very easy to re-identify people. In fact, what [some data aggregators do] is they aggregate more and more data about you every day. For them to keep putting more infomation in their file about me, it’s really not deidentified or anonymized. Not so long ago, everyone had high hopes that deidentified data would really be a way to protect information. But there’s too many data fields that match with public data fields and things that can easily be re-identified."

Polonetsky says his group is working on improving deidentification outside of HIPAA, but it's difficult, because you risk losing the very information advertisers and health researchers are most interested in.

"Where do you draw a line and say 'This is good enough'?" he said. "'It may not be rocket science, this is reasonably good given the legal promises and contractual promises and the way I’ve minimized the risk. I can release this data confidentially to a researcher, I can give it to an advertiser or a marketer because I’ve done enough to deidentify it.' Do I have to use anonymity and differential privacy and all layers of sophisticated aggregation? I might lose the utility if I do that. I might need to know if people from different races or different regions are the ones who benefitted from additional exercise and therefore it reduced this kind of disease."

Both Peel and Polonetsky think the solution needs to come from the app makers themselves, although Polonetsky thinks pressure from the government will motivate them, while Peel thinks it will be an outcry from the public.

"Are you going to fool the public?" she said. "Maybe you’ll make a lot of money for a while. But I’m a total optimist, and I don’t think you’re going to fool all of the people all of the time. I think the public is going to discover this massive hidden industry and they’re going to say 'That’s my health data, I want to control it.' I think it’s going to stop, and that’s what we’re working for."

She thinks app makers will come around and change their business models to one where the patient is the customer.

"I might use an app to improve my own health. I don’t want to use an app until I know that’s the case and [my data's] not being sold," she said. "If it’s being shared or disclosed, it’s because I want it to be shared or disclosed -- that’s not the business model of the app. And I think there are a lot of do-gooder apps out there but again, profit model doesn’t have to coincide with hurting patients. Most businesses they sell products and they get us to buy them because it’s a good product."

On the other hand, Polonetsky thinks the solution (in addition to making certain sensitive data off-limits) is for apps to be more forthright in their data collection and the uses of it. And not just by having an easier to read privacy policy, but by moving that information out of the privacy policy altogether, and into the light.

"What app providers and what companies need to do is recognize that your privacy policy is not where you talk to people about the data," he said. "That’s your legal disclosure, you gotta have it, you gotta get it right. But if data is a feature, which it is for these devices, you need to have a user interface that is descriptive and that helps people understand what’s going on. You need to featurize the data use. Don’t look at it as a notice or a disclosure. It’s a core part of your product, and that’s good. That’s not an embarrassing thing. ... We’re collecting your health data, because that’s what we’re doing. Here's how I use this data to improve the product, and here's how I use it to make money because, hey, that’s the deal."
