The answer, I’m sure, is innovation.
Practically, the first thing to do is figure out the questions to ask. Don’t stick to questions that are already hanging out there waiting to be answered; create new questions that you couldn’t answer before you had your Big Data. Don’t forget that the data you have isn’t limited to what’s in-house: you can find and mash up “tons” of public, government, and licensed data sets.
Data mining, just like data visualization, is as much art as science…
When You Have a Traditional Question, All Data Looks Traditional
Old Mine - Image by OSU Special Collections & Archives via Flickr
Is the challenge simply to map and reduce the Big Data into smaller data so we can look at it the same way we always have? So we can support the same business processes, the same decision-making? Answer the same questions but at larger scale perhaps?
The real challenge is to think differently: to ask different questions that can only be answered by unlocking the Big Information spread over the Big Data. The whole process, from data gathering through mining, analysis, visualization, and presentation, needs to be designed to help create and answer these new and different questions.
Despite bigger and bigger data, the world is a small place and it is full of people. Increasingly networked people. I like Clay Shirky’s thinking in Here Comes Everybody about new ways people online can gather and form loose communities whose effectiveness is multiplied by newfound freedoms and capabilities for distributed but coordinated group action. (Twitter doesn’t topple governments; people linked by Twitter do.)
In Cognitive Surplus he writes about the ability to harness huge untapped human potential. For example, the hours the average Westernized society spends tuned out in front of the TV represent a significant amount of lost “cognition”. If it were possible to recover just a small percentage of that wasted human capital in the pursuit of just about anything, tremendous things could happen. Given the emerging abilities of internet societies to both encourage and allow everyone to contribute, we might be at the start of a tremendous acceleration in human achievement (e.g. see how online gamers solved an AIDS protein puzzle).
Image by bass_nroll via Flickr
It is no longer news that companies can (and must) look for competitive advantage and innovative, even disruptive, opportunities in their “big data”. We are flooded daily with press releases about new big data technology, much of it designed to make the analysis and visualization of big data easier – even for the non-data scientist. You might even call 2011 the start of a renaissance for data visualization gurus and infographic artists. (And we are seeing data mining history being rewritten to cast any past complex analysis victory as a win for “big data”.)
But not that much is being said about the human psychology around big data analysis. Maybe a few cautionary stories about ensuring good design and not intentionally lying with big data stats (the bigger the data, the bigger the potential lie…). And some advice that the career of the future is “data scientist,” conflicting with emerging technology marketing hype indicating we won’t really need them.
The world is changing for the people who live here, but we talk mostly about gadgetry.