The temporal dynamics of search engine queries can help uncover what worries a society, in near real-time.
It has been 52 years since Robert F. Kennedy’s famous speech warned us about how we mismeasure the success of a nation by focusing only on the production of goods, rather than also paying attention to social progress in health, education and the environment (watch the video below). Said differently, GDP, the value of the economic output within a country, cannot single-handedly capture the success or well-being of a nation.
Why is this research important? Because our knowledge of the world affects the decisions we make. If we measure the wrong thing, we end up with a biased worldview and risk designing the wrong policies. Exploring online search queries extends much further than merely describing the changes that follow what is often called a technology or communication revolution. Here, we aim higher and apply epidemiological thinking to measure the dynamics of the zeitgeist, that is, how the spirit of the moment changes over time. This is important as we seek clarity and try to identify causal mechanisms against the noise produced by constantly evolving social and economic conditions. Understanding society requires studying the changes (and the speed of those changes) in the multiple stories (or narratives) that develop, evolve and interact with one another.
Why is our approach different? We strongly believe that understanding well-being at the societal level does not require studying self-descriptive individual posts and social media feeds. We think there is enough evidence that social media distort human behaviour, so that the resulting data do not accurately reflect people’s true thoughts and feelings. In contrast, internet search queries are more likely to be immune to such biases, as they are carried out in private and without any need for self-presentation. In other words, internet searches are one of the rare instances of private behaviour that is available to scientists. Moreover, these keyword search volumes, as they are known in the jargon, are collected over well-defined time periods and geographical areas, which makes them well suited to further analysis using the statistical and mathematical tools common in today's Machine Learning / AI landscape.
What do we know about the subject? There is now a growing acknowledgement that Subjective Well-being has to be examined when assessing social and national progress. Government agencies and major transnational organisations (UN, OECD) have devoted tremendous efforts to measuring and tracking the evolution of Subjective Well-being. This is mainly done through periodic surveys asking people to self-report their well-being. Although this is a valid and reliable way to capture Subjective Well-being, the method has significant limitations that constrain how fine-grained the information can be. It is difficult or impossible (1) to collect representative information from a specific geographical area, (2) to repeat the measurement too often on the same people, (3) to track minute changes in Subjective Well-being across short periods of time (hours, days), and (4) to avoid socio-cognitive response biases. To overcome these limitations, we decided to focus on a promising proxy measurement of Subjective Well-being: the billions of data points generated by specific, affect-related keyword queries on Google’s search engine.
However, like all scientists, before shifting to an alternative and promising measurement, we need to ascertain its validity and its stability (what we call its psychometric characteristics).
Search Engine Queries, Issue Salience and Well-being: Research on the factors that shape public opinion has shown that individuals seek information about issues that are salient to them, and has found strong associations between trends in issue awareness measured through surveys and internet search volumes. Thus, search term frequency can be considered a proxy measurement of how aware a population is of an issue. This predictive power has already been established for a range of behaviours: from mortgage defaults and car sales to unemployment claims and tourist visits. The role of information-seeking in motivating search-engine queries also carries significant implications for the potential to represent Subjective Well-being. Recently, geographical differences across US counties in several Google queries for affect-laden terms (such as “anxiety”, “fear”, etc.) were found to be strongly associated with variations in aggregate measures of physical and mental health (e.g., depression rate, cardiovascular morbidity). However, while this captures geographical variability, it does not capture changes occurring over time or, more importantly for us, changes taking place in the aftermath of significant events, shocks and news.
This is where we exploit the potential of affect-laden search engine keyword queries to reveal how people react to complex situations in near real-time. For Google searches, the popularity (the volume) of a keyword can be examined and averaged over various periods: from almost instantaneous 7.95-minute periods to hourly, daily, weekly or monthly aggregates. To the best of our knowledge, we are the first team to systematically examine search query data at such a high sampling frequency. In the following example, we used AI to mine UK searches containing the keyword “anxiety” over 3 years.
Interestingly, it was by looking at hourly averages for the search term "anxiety" that we found that queries were not equally distributed during the day and peaked around 1 PM.
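To make the idea of an hourly average concrete, here is a minimal Python sketch of how timestamped search volumes could be folded into an average profile across the 24 hours of the day. It is only an illustration and not our actual pipeline: the (timestamp, volume) layout, the function names `hourly_profile` and `peak_hour`, and the synthetic numbers are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def hourly_profile(samples):
    """Average search volume for each hour of the day (0-23).

    `samples` is an iterable of (timestamp, volume) pairs. This layout is
    an assumption made for the sketch, not the format of any real export.
    """
    by_hour = defaultdict(list)
    for timestamp, volume in samples:
        by_hour[timestamp.hour].append(volume)
    return {hour: mean(values) for hour, values in sorted(by_hour.items())}


def peak_hour(profile):
    """Hour of the day with the highest average volume."""
    return max(profile, key=profile.get)


# Synthetic data (not real search volumes): two days of hourly samples
# with an artificial bump at 13:00.
start = datetime(2020, 1, 1)
samples = [
    (start + timedelta(hours=h), 50 + (30 if h % 24 == 13 else 0))
    for h in range(48)
]
profile = hourly_profile(samples)
print(peak_hour(profile))
```

With real data, the same grouping would be applied to several years of samples, so that each hourly average rests on many days rather than two.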
We are using Big Data, AI, Psychology, Cultural Economics and Linguistics to explain this phenomenon, and we believe it shows how rapidly humans have adapted to the availability of connected technology to seek help and information about well-being. Here we are observing how technology that appeared in the last two decades has shaped new human behaviour.
This method is also powerful enough to capture a change in interest in reaction to external events, what economists call a ‘natural experiment’. For example, the daily online interest in the keyword “gambling” decreased dramatically about 10 days before the Covid-19 lockdown. This was not a fluke: the decrease also held when compared against the same periods in 2019, 2018 and 2017. We are working with major gambling-help charities to explain this complex dynamic.
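One simple way to check that such a drop is not a seasonal pattern is to compare the window of interest with the same calendar window in earlier years. The Python sketch below shows that comparison under stated assumptions: the daily series is a plain dictionary keyed by date, the function names are hypothetical, and both the dates and the volumes are synthetic rather than taken from our study.

```python
from datetime import date, timedelta
from statistics import mean


def window_mean(daily, start, days):
    """Mean daily volume over `days` consecutive days starting at `start`."""
    return mean(daily[start + timedelta(days=i)] for i in range(days))


def change_vs_previous_years(daily, start, days, reference_years):
    """Relative change of a window against the same calendar window in
    earlier years (negative values mean a decrease).

    Note: `start.replace(year=...)` fails for 29 February; this sketch
    does not handle that case.
    """
    current = window_mean(daily, start, days)
    baseline = mean(
        window_mean(daily, start.replace(year=year), days)
        for year in reference_years
    )
    return (current - baseline) / baseline


# Synthetic data (not real search volumes): a flat volume of 100 per day
# in March of each year, dropping to 60 from 13 March 2020 onwards.
daily = {}
for year in (2017, 2018, 2019, 2020):
    for i in range(31):
        day = date(year, 3, 1) + timedelta(days=i)
        dropped = year == 2020 and day >= date(2020, 3, 13)
        daily[day] = 60 if dropped else 100

# Illustrative 10-day window; the dates are not taken from the study.
change = change_vs_previous_years(daily, date(2020, 3, 13), 10, (2017, 2018, 2019))
print(f"{change:.0%}")
```

A comparison of this kind only shows that the window differs from previous years; explaining why it differs still requires the domain knowledge we are gathering with our partners.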
Watch us presenting our research at a recent webinar at Swansea University (25 June 2020)