Chapter 5: Getting a sense of Big Data and well-being
Understanding where people are and how they feel using Twitter data
Of course, it is not only what people say that can be mined, but also where they are. One research project attempted to gauge community well-being using Twitter data from between 27 September and 10 December 2010 (Quercia et al. 2012). As an aside, this coincided with the UK’s Measuring National Well-being debate, which launched in November of that year. The researchers wanted to understand more than individuals: they wanted to measure the well-being of communities. They state their intention as moving forward the recent developments in subjective well-being measures that we discovered in the last chapter. Rather than administering questionnaires on an individual basis, or in a national-level survey, they wanted to explore the recent possibilities of sentiment analysis to understand community well-being.
Social media data can significantly cut the time-consuming processes that make large-scale surveys and qualitative work resource-heavy. Once these data have been ‘scraped’ and saved into a database, they can be analysed in many ways. Quercia and their co-authors were interested in whether sentiment analysis could be used to interpret community well-being. They adopted a sentiment metric that was originally derived from studying Facebook status updates (Kramer 2010). This metric standardised the difference between the percentage of positive and negative words in a Facebook user’s posts in one day. Kramer used the metric to make arguments at a national level, aiming to develop, as he suggests in the title of his paper, ‘An Unobtrusive Behavioral Model of “Gross National Happiness”’.
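To make the idea of a standardised difference concrete, here is a minimal sketch of one way such a metric could be computed. It is an illustration under stated assumptions, not Kramer’s implementation: the word lists are toy placeholders rather than the dictionary used in the study, the function names are hypothetical, and ‘standardised’ is read here as z-scoring the daily percentages before subtracting.

```python
from statistics import mean, pstdev

# Toy word lists: placeholders for illustration, not the study's dictionary.
POSITIVE = {"happy", "great", "love", "good"}
NEGATIVE = {"sad", "awful", "hate", "bad"}


def word_percentages(text):
    """Return (% positive words, % negative words) for one day's text."""
    words = text.lower().split()
    if not words:
        return 0.0, 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return 100 * pos / len(words), 100 * neg / len(words)


def standardise(values):
    """Z-score a list of values; return zeros if there is no variation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def daily_sentiment(texts_by_day):
    """Standardised positivity minus standardised negativity, per day."""
    days = sorted(texts_by_day)
    pcts = [word_percentages(texts_by_day[d]) for d in days]
    z_pos = standardise([p for p, _ in pcts])
    z_neg = standardise([n for _, n in pcts])
    return {d: zp - zn for d, zp, zn in zip(days, z_pos, z_neg)}


# Invented posts, one string per day, for illustration only.
example = {
    "2010-12-24": "love this great day",
    "2010-12-25": "happy happy good day",
    "2010-12-27": "awful train bad delays",
}
scores = daily_sentiment(example)
```

Note how little the sketch knows about context: a day full of ‘happy’ greetings scores highly whatever the people writing them actually feel.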
Kramer’s standardised metric was found to correlate with self-reported life satisfaction. Looking at the US specifically, peaks were found in life satisfaction that correlated with national and cultural holidays. This is fine in and of itself, but what does that tell us about well-being? Christmas is good for well-being? Other research indicates otherwise (Holmes and Rahe 1967; Mutz 2016), suggesting it can cause feelings of stress for various reasons: financial, family, and so on. What about the days either side, when people are travelling huge distances (with everyone else) using transport infrastructure which is not fit for purpose? Or the excesses of consumption that holidays like Christmas involve, as well as their impact on the planet? What about all those who do not celebrate Christmas, as they are not of a Christian denomination? In his limitations, the author acknowledges the possibility that the tendency to wish people ‘Happy Christmas’ could have affected these results. However, he decided not to control for this, as wishing someone happy holidays is a positive sentiment. We might wonder, then, whether this study was really interested in the possibilities for understanding the human experience using the details of the Facebook posts, or whether it was interested in deriving a metric that was comparable with more established methods.
Returning to the study on community well-being, the authors state, ‘it is not clear whether the correspondence between sentiment of self-reported text and well-being would hold at community level, that is, whether sentiment expressed by community residents on social media reflects community socio-economic well-being’ (Quercia et al. 2012, 965). They do, therefore, note some of the limitations of using this approach to answer their research question. Notably, however, they do not acknowledge some of the limitations of the metric itself.
London was chosen as the site for studying communities, socio-economics and well-being. Let’s break down what they did and how. The study involved four data-gathering steps. It:
- ‘Crawled’ Twitter accounts whose user-specified locations report London neighbourhoods.
- Geo-referenced the Twitter accounts by converting their locations into longitude-latitude pairs.
- Measured socio-economic prosperity, using the UK’s IMD (the government’s Index of Multiple Deprivation).
- Conducted sentiment analysis on tweets between particular dates from their sample.
How did these processes work?
1. How the crawl worked: the researchers chose three popular London-based profiles of news outlets: the free newspaper The Metro, which was available in London on the Tube at the time (it has since expanded); the right-wing tabloid The Sun; and the centre-left newspaper The Independent. These media were chosen because they are thought to capture different demographics of class and politics. Using these three accounts as ‘seeds’, they used a ‘crawler’ to trace linked accounts. Crawlers are software that gathers various kinds of available data based on who interacts with a particular website or Twitter account. In this instance, every user following these accounts was ‘crawled’.
2. How the geo-referencing worked: some Twitter users state where they live in their profiles. The crawl found that 157k of 250k profiles listed a location, and 1,323 of these specified London neighbourhoods. The researchers then filtered out likely bots using another metric for each profile, the PeerIndex realness score. This brought the sample down to 573 profiles. Locations were then converted into longitude-latitude pairs, translating these data into geographical co-ordinates, which are easier to work with.
3. How prosperity was measured: the IMD is broken into 32,482 areas; 78 of these fall within the boundaries of London used by the authors (these are not necessarily fixed). The IMD offered a score for each of London’s 78 census areas. The authors use a census area to represent ‘a community’. We shall return to this key point shortly, so hold that thought. The data come from the ONS’ Census and form an objective list of sorts: income, employment, education, health, crime, housing, and environmental quality. It is worth noting that in the IMD, the ONS talk about ‘Lower Layer Super Output Areas’ (LSOAs), rather than communities.
4. How the sentiment analysis worked: sentiment analysis was undertaken on the tweets using two algorithms: (1) Kramer’s metric, described above, and (2) something called a ‘Maximum Entropy classifier’, which uses machine learning. The algorithm in Kramer’s metric has a limited dictionary, so this second machine learning package was used to improve on the first, by using a training dataset of tweets with smiley and frowny faces. The authors argue that the results across the two algorithms correlate and are accurate. They then measured the sentiment expressed by each profile’s tweets and computed, for each region, an aggregate sentiment measure across all the profiles in that region.
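The filtering and aggregation steps above can be sketched in a few lines. This is a rough illustration, not the authors’ code: the `Profile` fields and `area_sentiment` function are hypothetical, the data are invented, the cut-off of 50 is borrowed from PeerIndex’s own description of its score rather than from the study, and taking the mean is only one possible reading of ‘an aggregate sentiment measure’.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class Profile:
    handle: str       # hypothetical account name
    area: str         # census area the stated location falls in
    realness: int     # bot-filter score, 0-100
    sentiment: float  # sentiment score of the profile's tweets


def area_sentiment(profiles, realness_threshold=50):
    """Drop likely bots, then average profile sentiment within each area."""
    by_area = defaultdict(list)
    for p in profiles:
        # Assumption: scores above the threshold are treated as real people.
        if p.realness > realness_threshold:
            by_area[p.area].append(p.sentiment)
    return {area: mean(vals) for area, vals in by_area.items()}


# Invented profiles, for illustration only.
profiles = [
    Profile("a", "Area 1", 80, 0.6),
    Profile("b", "Area 1", 70, 0.2),
    Profile("c", "Area 1", 10, -0.9),  # filtered out as a likely bot
    Profile("d", "Area 2", 90, -0.1),
]
result = area_sentiment(profiles)
```

Even in this toy version, one design choice is visible: an area’s score is only ever the average of whoever happened to name that area in their profile and pass the filter.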
Findings: So what did they find? Through studying the relationship between sentiment and socio-economic well-being they found that ‘the higher the normalised sentiment score of a community’s tweets, the higher the community’s socio-economic well-being’. In other words, the sentiment metric accounted for positive and negative sentiments, giving each area an average score from its aggregated data. This score tended to correlate with the IMD, the scale they used to indicate poverty and prosperity in that locale.
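A finding of this kind rests on a correlation between two lists of numbers, one value per area. The sketch below shows the arithmetic under loud assumptions: the values are invented, `prosperity` is a hypothetical score where higher means better off (with a deprivation score the sign would flip), and Pearson’s coefficient is used here only as a familiar example of a correlation measure, not as a claim about the authors’ exact statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented values for four areas, for illustration only.
sentiment = [0.10, 0.25, 0.40, 0.55]
prosperity = [20.0, 35.0, 30.0, 60.0]

r = pearson(sentiment, prosperity)
```

A positive `r` says only that the two lists rise together across areas; it says nothing about why, which is where the limitations below come in.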
Limitations: What did the authors identify as limitations?
Demographic bias: Twitter users are certain types of people; therefore, these findings will over-represent the happiness of Twitter users and miss out non-users.
Causality: our old friend. Though the causal direction is difficult to determine from observational data, one could repeatedly crawl Twitter over multiple time intervals, and use a cross-lag analysis to observe potentially causal relationships.
Sentiment: They tracked sentiment but not ‘what actually makes communities happy’ (Quercia et al. 2012, 968). The intention was to compare topics across communities. Their example:
given two communities, one talking about yoga and organic food, and the other talking about gangs and junk food, what can be said about their levels of social deprivation? The hope is that topical analysis will answer this kind of question and, in so doing, assist policy makers in making informed choices regarding, for example, urban planning. (Quercia et al. 2012, 968)
As evidenced by the possibilities for making an argument using the crude analysis of the Mass Observation tweets, and as suggested by the citation directly above, there is bias in the ways that Big Data can be used to inform social and cultural policy. However, this is not necessarily any more the case in these examples than in those using more traditional data sources explored earlier in the book. The ways our social worlds are ordered do not reside in the algorithms, but in the preconceptions, laziness and judgements which become reproduced through researchers and their research and through policy-makers and their decisions. While the Quercia et al. examples were presented as a binary of opposites for narrative effect, the ridiculousness of the proposition may not stop it coming into effect as a deductive study in future. That gangs are unlikely to tweet about gangs is one thing. The idea that these gangs remain within their ONS-allocated geographical boundaries, called LSOAs, is also a nonsense.
This brings me to another point: LSOAs are not communities, at least not in the way that we think of community well-being as built on social relations and inter-related lives. People are not only active citizens where they live and, in a city like London especially, may actually be more likely to be active citizens where they work. Without the context of understanding London, what it is to live in London, and the complex, overlaid communities and social groups that comprise a postcode, this idea of community well-being is a misnomer. Instead, the study matches sentiment against a single index built on census data, which, while valuable, can be out of date and is well known for its various limitations as a metric of socio-economic deprivation or advantage.
Perhaps another way to approach the question of community well-being might be to look at people interacting in public space. Plunz et al. (2019) also used sentiment analysis with geo-located Twitter data. They were interested in finding well-being indicators associated with urban park space. Their goal was to assess whether tweets generated in parks express a more positive sentiment than tweets generated in other places in New York City. Their results suggest that tweets in Manhattan are different from those in other NYC boroughs. In Manhattan, people’s tweets were more positive outside of parks than inside, whereas the opposite was true outside of Manhattan. They concluded that Twitter data could still be useful for aspects of social policy, including urban design and planning. They also note that one of the limitations of geo-located Twitter data is that GPS is less accurate than sometimes accounted for. It also does not account for elevation, so you could be on the metro underneath Central Park or, indeed, stuck in traffic alongside it. It is also hard to establish whether people have gone for a walk to let off steam or are commuting to work, for example.
The relationships between where we are standing or where we live and our well-being are not new, but a feature of much philosophy on the nature of subjective experience, especially since the Enlightenment (which we shall come to in the next chapter). Big Data offer new ways to test what we know about place. However, these data and devices also make assumptions about place and experience (Wilmott 2016). The expectations and suppositions of what happens where, for whom and how drive these analyses with the same bias as other Big Data technologies, and we must be aware of the limitations of these data, technologies and the ideas of well-being they claim to measure. We also need to be vigilant about who holds the data and why they are analysing them.
- Quercia et al. 2012
- Kramer 2010
- Holmes and Rahe 1967; Mutz 2016
- Quercia et al. 2012, 965
- IMD is the UK government’s Index of Multiple Deprivation.
- This is called the PeerIndex realness score. This score is generated using information such as whether the profile has been self-certified on the PeerIndex site and/or has been linked to Facebook or LinkedIn. ‘PeerIndex realness score is a metric that indicates the likelihood that the profile is of a real person, rather than a spambot or Twitter feed. A score above 50 means this account is of a real person, a score below 50 means it is less likely to be a real person’ (http://www.peerindex.net/help/scores).
- Wilmott 2016