Understanding Well-being Data

Chapter 5: Getting a sense of Big Data and well-being

Linking Big Datasets – for well-being?

On New Year’s Day, 2020, a Canadian health monitoring company alerted its customers to the COVID-19 outbreak, some days before the US Centers for Disease Control and Prevention (CDC) or the World Health Organization (WHO) alerted anyone[1]. Of course, the disease was not yet called COVID-19, and it was not yet known that it would become a global pandemic. At this point, only a cluster of unusual pneumonia cases had been detected. One of the companies said to have beaten the WHO to this discovery is BlueDot, which uses AI-driven algorithms to search datasets, much like Google Flu Trends (GFT).

Unlike Google Flu Trends, however, BlueDot’s algorithms consolidate and analyse data from numerous sources. BlueDot’s owner, Dr. Kamran Khan, explains:

We can pick up news of possible outbreaks, little murmurs or forums or blogs of indications of some kind of unusual events going on.

(Khan, in Niiler 2020)

Other data sources are more official, such as statements from health organisations, livestock reports and news reports in 65 languages. BlueDot also uses ‘anonymous mobile phone data’[2], flight sales and other records. Together, these data points enable the prediction of a possible new serious disease. Importantly, the logic is that this approach also offers insight into how that disease becomes mobile, through the people who carry it and the planes that carry those people.

What we have done is use natural language processing and machine learning to train this engine to recognize whether this is an outbreak of anthrax in Mongolia versus a reunion of the heavy metal band Anthrax.

(Niiler 2020)

Also, crucially, ‘epidemiologists check that the conclusions make sense from a scientific standpoint’[3]. The company website states that ‘BlueDot protects people around the world from infectious diseases with human and artificial intelligence’[4]. Therefore, despite claims of its sophistication, the automated data-sifting still requires human analysis to make sense of what has been found.
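To make the division of labour between automated sifting and human checking more concrete, here is a deliberately small sketch in Python. It is not BlueDot's method, which is not public in this detail. The cue-word lists and the function names `classify_mention` and `triage` are hypothetical and invented purely for illustration. Where Khan describes machine learning, the sketch counts hand-picked words, and it passes anything that is not clearly about the band to a human reviewer.

```python
import re

# Hypothetical cue words, chosen by hand for illustration only.
# A real system would learn such signals from labelled examples.
OUTBREAK_CUES = {"outbreak", "cases", "infected", "livestock", "hospital", "quarantine"}
MUSIC_CUES = {"band", "tour", "album", "concert", "reunion", "metal"}


def tokenize(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def classify_mention(text):
    """Label a text that mentions 'anthrax' by counting cue words."""
    words = set(tokenize(text))
    outbreak_score = len(words & OUTBREAK_CUES)
    music_score = len(words & MUSIC_CUES)
    if outbreak_score > music_score:
        return "outbreak"
    if music_score > outbreak_score:
        return "music"
    return "needs_review"  # no clear signal either way


def triage(reports):
    """Set aside clear music stories; queue everything else for a human expert."""
    for_review, set_aside = [], []
    for report in reports:
        label = classify_mention(report)
        if label == "music":
            set_aside.append(report)
        else:
            for_review.append((label, report))
    return for_review, set_aside


reports = [
    "Anthrax outbreak reported among livestock in Mongolia",
    "Heavy metal band Anthrax announces reunion tour",
    "Anthrax mentioned on a local forum",
]
for_review, set_aside = triage(reports)
```

Even in this toy version, the automated step only narrows down what a person has to read. The items labelled `outbreak` and `needs_review` still end up in front of a human, which mirrors the point that the data-sifting alone does not produce usable evidence.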

Khan’s company used the technological developments at its disposal to synthesise many different types of data from multiple datasets to construct evidence. Only when the data were pieced together was the information useful, and only after human experts had checked it were these insights deemed useful enough to share and use. BlueDot is a commercial company. Its human and artificial intelligence are synthesised as an enterprise, and Khan is often presented as both an entrepreneur and a professor of medicine and public health at the University of Toronto. Khan has also worked in hospitals, so he understands how they work. Khan explains in one interview:

Disease doesn’t wait for the reviewers, so we need a more agile system. My motivation for creating a company—here to start supporting an entrepreneurial spirit—using business as the vehicle to do that.

(Khan, on Charrington 20 February 2020)

There are two things to note here. First, Khan suggests that the old structures of peer review and scientific expertise are too slow in their use of data and evidence to tackle a global pandemic. Second, he suggests that his business successfully links together ‘human and artificial intelligence’ to provide what traditional science cannot: the analysis of data with veracity and variability, speed, resolution, relationality and so on. The value of BlueDot lies in its claim to harness the qualities of Big Data.

To return to Mayer-Schönberger and Cukier, ‘Google’s method’ may not have involved distributing mouth swabs, or been built on old infrastructures, but instead, they explain:

[I]t is built on “big data”—the ability of society to harness information in novel ways to produce useful insights or goods and services of significant value.

(Mayer-Schönberger and Cukier, 2)

So, there we have those familiar terms of insights (a marketing term) and valuation (which we encountered in economics in Chap. 2), alongside clear communications and the presentation of novelty (Chap. 4), goods and services. Mayer-Schönberger and Cukier hint at the complex politics at play in the value of data, and in the values of data more broadly than we have already encountered.

Crucially, in a book about well-being and data, we have to note that BlueDot’s business is entrepreneurial because it is profitable. In other words, the insights have to be sold to clients and customers. BlueDot was also not the only innovator (as acknowledged by The Lancet and MIT Technology Review[5]). Here, we must return to the economic value of data because of the possibilities of well-being insights and the ideological project of the well-being agenda.

If the well-being agenda is about improving the redistribution of resources as an issue of social justice, we might want to think about what position we are coming from: rather than asking, ‘what are the data limits of these well-being projects?’, we might ask, ‘what are the well-being limits of data projects like these?’ After all, despite the clear sophistication of BlueDot’s project, it did not prevent COVID-19’s spread. This criticism has been noted in MIT Technology Review:

The hype outstrips the reality. In fact, the narrative that has appeared in many news reports and breathless press releases—that AI is a powerful new weapon against diseases—is only partly true and risks becoming counterproductive.

(Heaven 2020)

The point of this article is that over-reaching claims for AI could damage its future progress, in the same way that GFT overstretched its claims.

Data and the distribution of resources are very much part of the COVID-19 story, and not just of private companies profiteering, either.

Such competition is also echoed by national politicians who mislead the public about ‘world-beating’ systems of data[6]. In the same way that the social indicators movement was halted because it was not quite measuring what it thought it was measuring (Chap. 2), the ‘promise’ of Big Data has had to adjust. The limits of Google’s approach lie in a lack of context: what people actually search for is different from what was predicted. The limits on data are social, cultural, political and economic and, by extension, these limit the possibilities for a good society. We will explore social media and mobile communications data in the final few sections to better appreciate this relationship.

  1. Niiler 2020
  2. Whitaker 2020
  3. Niiler 2020
  4. BlueDot n.d.
  5. McCall 2020; Heaven 2020
  6. BBC 2020