It never ceases to amaze me how, while everyone is looking for the next big thing, it quietly sneaks up on us unnoticed. I remember when, after the cellphone revolution, everyone was looking for the next killer app and predicting amazing things, and then it turned out to be teenager-driven SMS texting. And who really saw the social networking revolution coming until we were immersed in it? Not me. And now it is happening again with Big Data.
“Big Data” refers to the mining of the truly massive data sets that now exist in the cloud to come up with new insights. A frequently used example is Google’s Flu Trends map. By mining internet searches, and where and when they occur, the search provider is able to map out flu prevalence even before doctors report the cases and the CDC analyzes the resulting data. But this is just a foreshadowing of what will be possible in the future.
We consumers and businesses have been uploading our information to the “cloud” for years now. Most of us, if not all, understand that data we put in the cloud may be less private, but in our minds we think “well, if someone wants to see my Great-Grandma blowing out her 100th birthday candles, have fun.” But we are not just storing our data in the cloud; we are giving up our rights to it, including the habits and relationships that go with it. What most do not realize is that storage providers are not simply storing our content; they are also mapping and storing the relationships between the data. While in our heads we continue to picture the cloud as a big, free hard drive in the sky, a closer analogy would be a big social network in the sky.
The leading cloud services today run on graph-processing systems like Giraph that map the relationships between all the data stored. For example, the capital of a state is related to the capitals of all the other states, as well as to that state’s bird, flower, etc. A good analogy might be LinkedIn relationships, with degrees of separation listed and groupings under common interests. Who uploads what content, and when, yields a lot of additional information, and when multiplied by the millions of people doing it every minute of every day of every year, you have a wealth of new information just waiting to be mined.
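To make the degrees-of-separation idea a little more concrete, here is a minimal sketch in Python. It is only an illustration, not how any cloud provider actually does it: the tiny list of state facts, and the names EDGES, build_graph, and degrees_of_separation, are all my own assumptions. It treats every item as a node, every relationship as a link, and counts the fewest links between two items with a breadth-first search.

```python
from collections import deque

# Hypothetical toy data: each pair says "these two items are directly related".
EDGES = [
    ("Ohio", "Columbus"),            # state -> capital
    ("Ohio", "cardinal"),            # state -> state bird
    ("Ohio", "scarlet carnation"),   # state -> state flower
    ("Kentucky", "Frankfort"),
    ("Kentucky", "cardinal"),
    ("Kentucky", "goldenrod"),
]


def build_graph(edges):
    """Turn a list of pairs into an undirected adjacency map."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degrees_of_separation(graph, start, goal):
    """Return the fewest links between start and goal, or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


graph = build_graph(EDGES)
print(degrees_of_separation(graph, "Columbus", "Frankfort"))  # prints 4
```

The two capitals never appear together in the data, yet they end up four links apart because both states share the same state bird. That kind of indirect connection is exactly what falls out once relationships are stored alongside the content.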
When all of the world’s past, present, and future information, historical context, data relationships, and internet search history are overlaid, truly mind-boggling capabilities become possible. A somewhat scary prospect that has recently been discussed is that of entire life mapping. Big Data miners can now piece together what someone can expect throughout their life, similar to the board game “The Game of Life.” For example, the odds of someone having cancer could be estimated from family history, uploaded genetic testing for markers, and their search history on the term, perhaps cross-referenced with where they live and the lifestyle choices they have made. As the number of years since data collection began grows, and the granularity of data points increases, it is not unreasonable to expect the accuracy of life maps to grow in tandem. Similar to how guidance counselors today give aptitude tests to high school students and then present them with career options, someday parents might be given potential life maps for their children at birth. While alarmists like to caution about the coming Orwellian society, I am more fearful of a Gattaca future than Big Brother.
It is a bit disheartening to think that our entire lives can be distilled into a collection of digital ones and zeroes. We have free choice after all, or at least the illusion of it for now. It is not unrealistic to contemplate that as computing and storage power continue to grow exponentially, and lifelogging becomes the norm rather than the exception, life itself may someday become like a big game of The Sims.