Big Data

One feature of the post-2015 agenda is that future poverty reduction will take place in a new context, that of what has been called Big Data. Driven by digital technologies such as the internet, mobile phones, video surveillance and geospatial mapping, this new Data Revolution has made it possible to produce and process vast amounts of data very quickly, on a scale never witnessed before. Global private companies such as Amazon, Google, Facebook and Twitter lead this high-speed, large-scale accumulation of data. Google processes more than 24 petabytes of data per day, a volume thousands of times the quantity of all printed material in the U.S. Library of Congress. Facebook, a relatively young company, already receives more than 10 million new photo uploads every hour and about 3 billion clicks on the “like” button per day. Amazon patented “item-to-item” collaborative filtering, which uses correlations among products to anticipate customers’ tastes. Google can estimate global trends in contagious diseases such as flu or Ebola from search queries. Facebook provides a digital trail of users’ preferences and consumer profiles, and Twitter tells us what is on our minds. The amount of stored information grows four times faster than the world economy, while the processing power of computers grows nine times faster. This fast and vast flow of data undoubtedly represents a precious opportunity to monitor poverty in the future, with great potential for partnerships between public entities and private companies. One of the challenges ahead is therefore to promote efficient public-private strategic collaborations that make it possible to use private cellphones, GPS, sensors and even web clicks as tools for monitoring poverty globally.

Big Data is a matter not only of scale and speed but also of using the entire dataset rather than a random sample. Whereas samples used to be taken for granted, it is now possible to use all the data, which gives a granular view able to identify subcategories, submarkets and details with greater exactitude and fewer sampling errors. In the past, treating samples as representative of a large population was a consequence of data scarcity, a device for getting around informational and technological constraints. Whereas hypotheses used to be defined even before data collection, now we let the data speak for themselves. This new approach favours the what over the why: rather than focusing on causal models, we now look at correlations and connections that reveal patterns we never thought existed. This new mindset, based on adapting dynamically to real life, together with the unprecedented scale and velocity of data gathering, represents a precious opportunity to obtain real-time data on poverty. Examples include estimating real-time food expenditure from pay-as-you-go mobile top-ups, mapping real-time trends in diseases such as Ebola, and instantly assessing the risks and vulnerabilities the poor face daily.

But this infatuation with Big Data has some shortfalls, and it needs to be implemented with caution. A larger sample undoubtedly reduces sampling error, but it does not eliminate bias. A large but biased sample will produce “precisely wrong statistics”: the sampling error is extremely small, but the estimates still reflect the bias. For example, mobile phone surveys offer the possibility of much faster, cheaper and more frequent data collection, but it is widely known that the sample of mobile phone users in the developing world is likely to be biased towards wealthier, more educated and younger households, and towards men rather than women (Blumenstock and Eagle, 2012). One potential pitfall of these tools is that we could miss the very people we most seek to reach, those without access to these new technologies, since the extremely poor are very likely to be excluded from such samples automatically. Big Data may thus lead us to search for answers where looking is easiest, a type of observational bias that statisticians describe as the “Drunkard’s search”.

“A drunkard is looking for his lost key under a streetlight. A policeman asks, ‘What did you lose?’ The man answers, ‘A key, but I can’t find it.’ The policeman asks him, ‘Do you remember where you lost the key?’ He replies, ‘Yes, over there.’ The policeman, who appears confused, asks, ‘Then why don’t you look for it over there?’ The drunkard answers, ‘Because there is no light!’”
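The idea of “precisely wrong statistics” can be illustrated with a small simulation. The sketch below is purely hypothetical: the population, the poverty line and the rule linking phone ownership to consumption are all invented for illustration and are not taken from any survey. It compares a very large sample made up only of phone owners with a small but truly random sample.

```python
import math
import random

random.seed(0)

# Hypothetical population of household consumption levels (assumed lognormal).
N = 200_000
population = [random.lognormvariate(0.0, 0.8) for _ in range(N)]
POVERTY_LINE = 0.5  # assumed threshold, same units as consumption


def poverty_rate(values):
    """Share of households below the poverty line."""
    return sum(v < POVERTY_LINE for v in values) / len(values)


def standard_error(rate, n):
    """Sampling error of a proportion; it says nothing about bias."""
    return math.sqrt(rate * (1 - rate) / n)


def owns_phone(consumption):
    # Assumed rule: richer households are more likely to own a phone.
    return random.random() < min(1.0, 0.15 + 0.5 * consumption)


true_rate = poverty_rate(population)

# "Big Data": every phone owner, a huge sample that is biased.
phone_owners = [c for c in population if owns_phone(c)]
big_rate = poverty_rate(phone_owners)
big_se = standard_error(big_rate, len(phone_owners))

# "Small Data": a random sample of 2,000 households.
small_sample = random.sample(population, 2000)
small_rate = poverty_rate(small_sample)
small_se = standard_error(small_rate, len(small_sample))

print(f"true poverty rate:      {true_rate:.3f}")
print(f"phone owners (n={len(phone_owners)}): {big_rate:.3f} +/- {big_se:.4f}")
print(f"random sample (n=2000): {small_rate:.3f} +/- {small_se:.4f}")
```

Under these assumptions the phone-owner estimate has a far smaller sampling error yet lies further from the true rate, because poor households are under-represented among phone owners.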

To address these issues, there are proposals to blend the convenience of “Big Data” approaches with the statistical rigor of “Small Data” approaches in what has been called the All Data Revolution (Lazer et al., 2014). The World Bank Group has promoted various initiatives that blend “Big Data” with “Small Data” in precisely this way for poverty estimation. SWIFT (Survey of Well-being via Instant and Frequent Tracking) is one such initiative. Like typical “Small Data” efforts, SWIFT collects data from samples that are representative of the underlying populations of interest. Like typical “Big Data” approaches, SWIFT applies a series of formulas/algorithms, as well as the latest ITS technology, to cut the time and cost of data collection and poverty estimation. For example, SWIFT does not estimate poverty from consumption or income data, which are time-consuming to collect, but uses formulas to estimate poverty from poverty correlates, which can be collected easily. Furthermore, because the formulas are embedded in the SWIFT data management system, the correlates are converted into poverty statistics instantly. To further cut the time needed for data collection and processing, SWIFT uses Computer Assisted Personal Interviewing (CAPI) linked to data clouds and, where possible, adopts a cell phone data collection approach. “Big Data” science is still in its early stages and innovations in this field are rolling out at the speed of light, but such innovations might yield entirely new solutions for poverty monitoring in the near future.

For the full article, please see the Research Essays page.

