Netflix’s database susceptible to deanonymization attacks

Netflix uses algorithmic processing to build user profiles from collected metadata and to predict which content each user is most likely to watch. Every time one of its 300 million users selects a series or a movie, the system gathers a host of data about that viewing session, such as clicks, pauses, and indicators that the user stopped watching (Narayanan et al. 2008). Netflix aggregates this data into profiles from which it infers information about the user.

Even though the company has established several security mechanisms and constructs user profiles from proxies of personal data, the system is quite vulnerable to deanonymization attacks. Narayanan et al. (2008) were able to break the anonymization of Netflix’s database by analyzing proxy information provided by users. Their result demonstrated a “new class of statistical deanonymization attacks against high-dimensional micro-data, such as individual preferences, recommendations, transaction records, and so on” (ibid.). Using these methods, the researchers showed how sensitive information such as political preferences or ethnic origin could be inferred from an anonymized database.
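To make the idea of a statistical deanonymization attack concrete, the following is a minimal sketch of a linkage attack on invented toy data. It is not the researchers' implementation. It assumes an attacker who knows a few approximate ratings belonging to a target and scores every pseudonymous record by how well it agrees with that knowledge, giving more weight to rarely rated titles. The names `anonymized`, `auxiliary`, `score`, `best_match`, and `min_gap` are all hypothetical, and the rarity weight and the simple gap threshold are simplifying assumptions chosen for illustration.

```python
import math
from collections import Counter

# Toy "anonymized" release: pseudonymous ID -> {title: rating}. All data is invented.
anonymized = {
    "u1": {"Popular Hit": 5, "Obscure Doc": 4, "Indie Drama": 2},
    "u2": {"Popular Hit": 5, "Blockbuster": 4},
    "u3": {"Popular Hit": 4, "Blockbuster": 3, "Indie Drama": 5},
}

# Auxiliary knowledge the attacker is assumed to hold about one target,
# for example ratings the target posted publicly under a real name.
auxiliary = {"Obscure Doc": 4, "Indie Drama": 2, "Popular Hit": 5}


def title_support(records):
    """Count how many records rated each title."""
    counts = Counter()
    for ratings in records.values():
        counts.update(ratings.keys())
    return counts


def score(aux, record, support, tolerance=1):
    """Sum rarity weights over titles where the record agrees with aux.

    Agreement means the ratings differ by at most `tolerance`.
    The weight 1 / log(1 + support) is an assumed form that makes
    rarely rated titles count more than popular ones.
    """
    total = 0.0
    for title, rating in aux.items():
        if title in record and abs(record[title] - rating) <= tolerance:
            total += 1.0 / math.log(1 + support[title])
    return total


def best_match(aux, records, min_gap=0.5):
    """Return the top-scoring ID, or None if it does not clearly stand out."""
    support = title_support(records)
    ranked = sorted(
        ((score(aux, rec, support), uid) for uid, rec in records.items()),
        reverse=True,
    )
    (top_score, top_id), (second_score, _) = ranked[0], ranked[1]
    # Refuse to name anyone when the best and second-best scores are close.
    return top_id if top_score - second_score >= min_gap else None


print(best_match(auxiliary, anonymized))
```

In this toy data only one record agrees with the auxiliary ratings on the rarely rated title, so it stands well clear of the others. If the attacker knew only a rating for the most popular title, every record would score the same and the function would return `None` rather than guess, which illustrates why sparse, high-dimensional data with many rarely rated items is easier to link.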