Why big data is all the buzz for statisticians


Irina Bernal, Tanja Sejersen, Ronald Jansen and Niels Ploug   | Published: April 27, 2021 21:13:04 | Updated: May 02, 2021 21:30:19



Big data permeates many aspects of people's lives, from daily communication and interactions to shopping, consumption, and the medical treatments they receive. Big data is also transforming the way people and businesses make decisions and measure things. With an invaluable continuous flow of digital information about activities and their impact on society, the economy, and the environment, big data holds tremendous potential for official statistics. The statistical community is increasingly turning its attention to new data sources despite numerous legal, technological, and financial challenges.

As it is constantly being collected and generated, big data provides timely, frequent, and granular insights - crucial attributes in critical situations. Big data was used immediately after the earthquakes in several countries in Asia and the Pacific, such as Nepal and Papua New Guinea, to understand post-disaster population dispersion and inform humanitarian efforts. Most recently, during the Covid-19 pandemic, some countries relied on internet companies' data, such as Google's Community Mobility Index or Facebook's Population Density Maps. Other countries, such as New Zealand, turned to mobile network operators' data to understand mobility patterns following the lockdowns and other restrictive measures. These timely and detailed data informed policy and helped assess its immediate impact on society and the economy.

Big data is continuously produced, generated, and collected electronically with a considerably lower burden on respondents. It can complement traditional statistics with more granularity or, in some cases, even replace traditional data collection methods. For example, using scanner data to produce price statistics during Covid-19 reduced the exposure risks of price data collectors. Moreover, as lockdowns pushed shoppers towards e-commerce platforms, online price data provided timely insights about changes in prices and consumption patterns. In addition, accessing some big data sources can cost less than conducting traditional surveys.

Big data can also help to monitor the Sustainable Development Goals (SDG) indicators, especially where traditional data are missing. Whereas half of the SDG indicators have no data or insufficient data to measure progress in the Asia and the Pacific region, big data can address some important gaps. For example, 39 SDG indicators, mostly environment-related, could benefit from geospatial information. The Chinese Academy of Sciences has been researching the potential of Earth Observation data for 19 SDG indicators. BPS Statistics Indonesia is exploring the use of mobile phone data for four SDG indicators. And the Philippine Statistics Authority has been collaborating with civil society and has identified 79 SDG indicators that could benefit from citizen-science data.

While big data mostly remains at an experimental phase in Asia and the Pacific, some statistical offices are integrating certain new data sources into the production of official statistics. Earth Observation data are among the most researched and integrated. For example, the Ministry of Statistics and Programme Implementation of India produces Environment Accounts, and Geoscience Australia developed the Dynamic Land Cover Dataset, both using remote sensing. Other areas of statistics, such as price statistics or mobility and migration statistics, have a high potential for integrating scanner data and mobile phone data, even though the continuity of access to these data sources is more uncertain than for remote sensing data.

Big data will also alter the way we produce official statistics. Statisticians cannot necessarily replicate existing methods and measurements with big data. Unlike traditional data, which are extrapolated from precise samples collected at certain time intervals, big data is constantly generated in very large quantities. It requires new analytical methods and tools, so integrating it means rethinking statistical business processes. But as big data grows and more private actors can generate reliable and timely statistics on social, economic or environmental issues, national statistical offices should embrace big data to remain relevant and expand their contribution to data as a public good.

The UN Committee of Experts on Big Data and Data Science for Official Statistics has been actively supporting statistical communities around the world in addressing the challenges of big data. Recent ESCAP analysis shows how national statistical offices in Asia and the Pacific can use big data for official statistics, including economic statistics, population and social statistics, environment and agriculture statistics, and for the SDGs.

Several Asia-Pacific Stats Cafes have also showcased country experiences of using big data for statistics. Additionally, the Task Team on Big Data and the SDGs organised a side event at the 52nd session of the UN Statistical Commission to highlight country experiences of using big data for the SDGs.

Irina Bernal is Consultant, ESCAP Statistics Division; Tanja Sejersen is Statistician; Ronald Jansen is Chief, Data Innovation and Capacity Branch, UN Statistics Division; and Niels Ploug is Director, Social Statistics, Statistics Denmark. The piece is excerpted from www.unescap.org/blog
