Data Anonymization is always more complicated than expected

If you think that data anonymization and GDPR compliance are just a matter of wiping out names and other obvious information, unfortunately it is not that easy.

Many features partially describe an individual (where s/he studies, her/his skills and preferences, the time of day s/he starts working, a commonly used IP address, the OS of her/his computer, the apps s/he has installed, etc.). For a data scientist or an “AI”, it is fairly easy to uniquely identify an individual from a subset of such features, even if the data “looks” anonymized to humans.
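
To make this concrete, here is a minimal Python sketch of how one might check whether a combination of seemingly harmless features singles people out. The records, the field names (`school`, `os`, `start_hour`) and the helper functions are all made up for illustration and do not come from any particular tool. The size of the smallest group sharing the same values is the "k" in what is commonly called k-anonymity.

```python
from collections import Counter

# Hypothetical records with names already removed; all values are invented.
records = [
    {"school": "School A", "os": "Linux", "start_hour": 7},
    {"school": "School A", "os": "Windows", "start_hour": 9},
    {"school": "School A", "os": "Windows", "start_hour": 9},
    {"school": "School B", "os": "macOS", "start_hour": 9},
    {"school": "School B", "os": "Windows", "start_hour": 8},
]


def group_sizes(rows, features):
    """Count how many rows share each combination of the given features."""
    return Counter(tuple(row[f] for f in features) for row in rows)


def smallest_group(rows, features):
    """Return the size of the smallest group of rows that look identical."""
    return min(group_sizes(rows, features).values())


def unique_rows(rows, features):
    """Return the rows whose combination of feature values appears only once."""
    sizes = group_sizes(rows, features)
    return [row for row in rows if sizes[tuple(row[f] for f in features)] == 1]


# One feature alone hides everyone in a group of at least 2 people.
print(smallest_group(records, ["school"]))

# Three features together single out some individuals.
print(len(unique_rows(records, ["school", "os", "start_hour"])))
```

In this toy data set, no single column identifies anyone, yet the combination of the three columns isolates several records. Real data sets have far more columns, so the effect is usually stronger.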

At a time when privacy is gaining attention, and Microsoft has apologized and said it would anonymize employee performance indicators in Office 365 (https://lnkd.in/dNiJUAv), tools and services such as “Amnesia” may be of use: https://lnkd.in/eQmM7XP