The Takeaway from Google's Big Data Flu Failure

In 2008, researchers at Google explored whether search data could be used for prediction, claiming that they could “nowcast” the flu based on people’s searches. Through a program known as Google Flu Trends (GFT), Google said it could produce accurate estimates of flu prevalence two weeks earlier than the CDC’s data. However, GFT missed the peak of the 2013 flu season by 140%, a spectacular failure.

Part of the reason it failed was that Google did not take into account changes in search behavior over time. In addition, Google introduced its suggested search feature as well as a number of new health-based add-ons to help people find the information they needed. While these are great features for Google users, they also make some search terms appear more often than they otherwise would, ultimately skewing key data for GFT.

Google’s successor to GFT could present an ideal model for collaboration around big data to produce meaningful insights for the public good, if done correctly. Future versions of GFT should continually recalibrate how the search data relates to flu prevalence; otherwise, the value of the data stream would rapidly decay. Apart from GFT, there are small efforts across the globe attempting to combat disasters using big data.

The UN set up the Global Pulse initiative, which developed collaborative data repositories around the world. Flowminder is a nonprofit based in Sweden that is dedicated to gathering mobile phone data that could aid in disaster response. The question moving forward is how to strengthen these efforts while still protecting the privacy of individuals as well as the interests of big data custodians. 

To read more from Alison click here
