This week, at the World Bank’s Big Stats meeting,
participants discussed the need for governments to look closely at how big data can be
applied to official statistics. Big data allows for quick results that provide
interesting answers to questions asked of unstructured
data. Official statistics, on the other hand, sometimes take years to compile and
are often out of date, but are usually trustworthy.
But what happens when
you look at frontier markets? Does big data still have the same appeal, since
so much of it is dependent on tracking through social networks accessible by
only some of the population?
Being stakeholder-obsessed requires that you know your stakeholders. I strongly advocate for
robust voice of customer (VOC) programs, which can give both qualitative and
quantitative insights into what is truly important. Many customer experience
leaders today rely on increasingly complex forms of VOC feedback, and social
media has been a game changer in this arena.
But in frontier markets,
social media isn't always accessible. If you are making $400 a month, and web access
through your mobile costs $40/month (a tenth of your income), your participation in social networks is
likely to be lower, and big data begins to overrepresent stakeholders with higher
incomes. People in frontier markets who use social media are usually wealthier
than the population as a whole, and big data skews accordingly. But that
data is still useful. In 2008, internet penetration in Kenya was only 7.9% of the population, but Ushahidi was still a valuable resource for crowdsourcing the tracking of attacks
following the 2007 election. Results can be informative even if you don’t have
the most representative sample.
On the other hand, why should
government statistics from frontier markets be trusted any more than big
data? When I was taking my first quantitative methods course in college, North Korea was easily identified as an unreliable outlier, because its statistics were
always far too rosy to use in any calculations.
But the North Korea
example shows why we need to trust official government statistics: we know North
Korea’s stats are off because they don’t line up with more trustworthy results. Well-calculated
official government stats are still extremely important. They serve
as a baseline. Just as we can use them to identify North Korea as an outlier
data point that should probably not be considered, we can also use them to see
how our big data searches compare to traditional statistics.
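To make that comparison concrete, here is a minimal Python sketch. Every number in it is made up for illustration: the income groups, the census shares, and the social media counts are assumptions, not figures from the meeting or from any real dataset. It simply divides each group's share of a big data sample by its share in the official baseline, so a ratio above 1 means the group is overrepresented.

```python
# Hypothetical figures, invented purely for illustration.
# Share of the population in each income group, per official statistics.
census_share = {"low": 0.70, "middle": 0.25, "high": 0.05}
# Number of social media users observed in each income group.
sample_counts = {"low": 100, "middle": 400, "high": 500}


def sample_shares(counts):
    """Convert raw counts per group into shares of the whole sample."""
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def representation_ratio(counts, baseline):
    """Ratio of each group's sample share to its baseline share.

    A ratio above 1 means the group is overrepresented in the sample;
    below 1 means it is underrepresented.
    """
    shares = sample_shares(counts)
    return {group: shares[group] / baseline[group] for group in baseline}


ratios = representation_ratio(sample_counts, census_share)
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}x its share of the population")
```

With these invented numbers, the high-income group shows up far more often in the sample than in the population, which is the kind of skew a trusted baseline lets you spot.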
At the event, Paul Cheung
made a great point: government statistics have depth of information attached to every
data point (census data on a household), whereas big data has breadth: lots of
data points with comparatively shallow background information (where the 23-year-old
user was when he liked McDonald’s on Facebook).
For what it’s worth, the Wisdom Network’s 624 Facebook users born in North Korea have the highest affinity with the Dynamo Dresden team. So they like German football.
You need two approaches
to information gathering: official statistics need to serve as a baseline to
verify big data, and big data needs to be used to create informative results
quickly for agile responses to issues.
Organizations like the
World Bank can use big data to address one of the primary criticisms they face: they
move too slowly.
As I sat in the meeting
the other day, I kept thinking of the problems that newspapers now face as a
result of the web. Readers have now shifted to blogs and sites that promise
faster information, sometimes sacrificing quality, ethics, and truthfulness. Sources
of official statistics have to speed up their publication times and accept big
data as a complementary tool they can use, or they might go the way of the
Seattle Post-Intelligencer.
Photo Credit: http://www.sportfive.de/index.php?id=579