A July 13, 2018 article in The Guardian, "Data is a fingerprint: why you aren't as anonymous as you think online," explains why "so-called ‘anonymous’ data can be easily used to identify everything from our medical records to purchase histories." An excerpt:
In August 2016, the Australian government released an “anonymized” data set comprising the medical billing records, including every prescription and surgery, of 2.9 million people.
Names and other identifying features were removed from the records in an effort to protect individuals’ privacy, but a research team from the University of Melbourne soon discovered that it was simple to re-identify people, and learn about their entire medical history without their consent, by comparing the dataset to other publicly available information, such as reports of celebrities having babies or athletes having surgeries.
The government pulled the data from its website, but not before it had been downloaded 1,500 times.
This privacy nightmare is one of many examples of seemingly innocuous, “de-identified” pieces of information being reverse-engineered to expose people’s identities. And it’s only getting worse as people spend more of their lives online, sprinkling digital breadcrumbs that can be traced back to them to violate their privacy in ways they never expected.
--
Data sets about people are seldom truly anonymous: each of us does things that are uniquely identifiable within the data set, especially once it is combined with information from external data sources.
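To make the idea concrete, here is a minimal sketch of this kind of linkage in Python. Everything in it is invented for illustration: the records, the field names, and the `reidentify` helper are hypothetical and are not taken from the Australian data set or the Melbourne team's method. It only shows how a handful of ordinary attributes, once they are unique within a data set, can be joined to public information.

```python
from collections import Counter

# Hypothetical "de-identified" records: names removed, other attributes kept.
deidentified = [
    {"id": "r1", "birth_year": 1985, "postcode": "3000",
     "procedure": "knee surgery", "month": "2015-03"},
    {"id": "r2", "birth_year": 1985, "postcode": "3000",
     "procedure": "childbirth", "month": "2015-07"},
    {"id": "r3", "birth_year": 1990, "postcode": "3052",
     "procedure": "childbirth", "month": "2015-07"},
    {"id": "r4", "birth_year": 1985, "postcode": "3000",
     "procedure": "childbirth", "month": "2015-07"},
]

# Hypothetical public information, e.g. gathered from news reports.
public = [
    {"name": "Athlete A", "birth_year": 1985, "postcode": "3000",
     "procedure": "knee surgery", "month": "2015-03"},
    {"name": "Celebrity B", "birth_year": 1990, "postcode": "3052",
     "procedure": "childbirth", "month": "2015-07"},
    {"name": "Person C", "birth_year": 1985, "postcode": "3000",
     "procedure": "childbirth", "month": "2015-07"},
]

# Attributes assumed to appear in both sources (the "quasi-identifiers").
QUASI_IDENTIFIERS = ("birth_year", "postcode", "procedure", "month")


def key(record):
    """Project a record onto its quasi-identifier values."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(deidentified, public):
    """Map each public name to a record id when the match is unambiguous."""
    counts = Counter(key(r) for r in deidentified)
    # Keep only records whose quasi-identifier combination is unique.
    unique = {key(r): r["id"] for r in deidentified if counts[key(r)] == 1}
    return {p["name"]: unique[key(p)] for p in public if key(p) in unique}


matches = reidentify(deidentified, public)
print(matches)
```

In this toy data, records r2 and r4 share the same attribute values, so "Person C" cannot be pinned to a single record, while the two people with a unique combination are linked to their records even though no names were ever published.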