What happens when data are inaccurate, incomplete or outdated?
In other words, what happens when Big Data becomes Bad Data?
Share your story!
#BadData wants to collect personal stories and reports about the social, ethical or economic consequences of bad and misused data.
Send your Bad Data experience.
Data & Algorithms
Thanks to artificial intelligence, algorithms can be trained on and learn from data. But what an algorithm does depends heavily on how good that data is. Data can be corrupted, out of date, useless or illegal. In this way, bad data plays an important part in all kinds of decision-making processes and outcomes.
From banking to health to social services or education, bad data can have an important impact on our most fundamental rights. At the bottom of this page you’ll find a reading list filled with concrete examples of how bad data is already impacting society.
Prominent Bad Data cases
New Coke (incomplete data)
Facing competition from the sweeter-tasting Pepsi Cola in the mid-1980s, Coca-Cola tested a new formula on 200,000 subjects.
Cuts to health care based on algorithmic assessment (corrupted)
Similar to the case of disability payments in the UK, in the United States there have been a number of cases where radical readjustments were made to home care received by people with a broad range of illnesses and disabilities, after algorithmic assessment was introduced.
Disability payments in the UK (corrupted/biased data)
Starting in 2016, the number of appeals against decisions made by the Department for Work and Pensions on the basis of assessments carried out by the private, profit-driven contractors working on its behalf began to increase dramatically.
Facebook and Cambridge Analytica (illegal/leaked)
Cambridge Analytica, a private company, was able to harvest 50 million Facebook profiles and use them to build a powerful software program to predict and influence election choices.
Incomplete or inaccurate personal data (corrupted, out-of-date, useless)
Deloitte Analytics conducted a survey testing how accurate commercial data used for marketing, research and product management is likely to be.
CDC used bad data to judge DC water safety (incomplete)
In 2000, a problem with Washington, DC’s drinking water began when officials switched the disinfectant they used to purify the water. The switch was supposed to make the water cleaner. But the change also increased corrosion from the city’s lead pipes, upping the amount of lead in the water.
Predictive policing (biased, incomplete)
Police are increasingly using predictive software. This is particularly challenging because it is actually quite difficult to identify bias in criminal justice prediction models. This is partly because police data aren’t collected uniformly, and partly because the data police track reflect longstanding institutional biases along income, race, and gender lines.
Google Flu Trends (incompatible)
Launched in 2008 in the hope of using information about people’s online searches to spot disease outbreaks, Google Flu Trends monitored users’ searches and identified locations where many people were researching flu symptoms. In those places, the program would alert public health authorities that more people were about to come down with the flu.
In 2018, Eticas launched the #BadDataChallenge, and the response was overwhelming: simply by asking themselves the question, people realized that they had run into bad data at least once in their lives. Some ended up paying for another person’s health insurance; others had their parents’ names recorded incorrectly on their ID, which repeatedly complicated their identification process.