humans are terrible and wonderful Big Data systems

Recently I’ve been deep diving on the topics of big data and analytics. For the benefit of non-technical family members who read this blog, let me give two quick layman’s definitions:

Big data simply refers to massive data sets and the techniques used to make sense of them. A good early example of harnessing big data was when Google realized that there was value in being able to crawl and index the entire web’s contents [1].

Analytics is simply the art and science of harnessing data to learn things, identify problems, identify opportunities, and predict things in a way that is hard to do with simple querying. It’s a bit of a fuzzy term. For instance, I wouldn’t consider it analytics that Amazon.com shows me my recent orders, but I would consider it analytics that Amazon.com makes scarily accurate recommendations based on what I’ve ordered and viewed in the past vs. what every other customer has ordered and viewed in the past [2].

I am actually quite a novice when it comes to both of these topics, but when I was thinking about leaving the Jazz team last fall, my friend and mentor Rod Smith suggested that I work in an area where I’m a total novice as a way to stretch myself. This turned out to be a fantastic bit of advice: I have learned more in the past four months than I learned in the last several years on Jazz, simply because I had already become an expert in my focus areas there.

Anyhow, that’s not the topic of this journal entry, but it’s relevant. Because I am most assuredly not an expert on these topics, I have been trying hard to learn from people who understand them much better than I do. One person I found [3] who has been a great help is Jeff Jonas of IBM’s Information Management division. Jeff has an amazing set of blog entries, mostly on analytics but also on big data [4].

Today we were chatting on the phone and he told me I should explore a particular topic. I told him I had and offered to forward him an email with more details on the topic. He immediately said “NO, PLEASE DON’T”. My initial assumption was that he was worried about IP contamination, but it turned out that he simply gets too much email and as long as I was aware of the topic, he was happy to leave it at that.

This made me smile, partly because one of my current understandings about big data and analytics is that more data is always a good thing, even when the data itself is bad. Jeff even makes this point in one of his blog entries. It also made me smile because I have the same habit. Like everyone, I struggle with information overload and do what I can to limit my consumption to interesting new ideas or new analysis that helps refine or connect known ideas. So from this point of view, humans are poor big data systems because we simply can’t handle large volumes of incoming data.

But this mode of thinking is obviously shortsighted and does a big disservice to our biological and cultural achievements. We are the ones who create things, and we can only do this because of our brains’ wonderful ability to synthesize the (I have no idea how much) knowledge and sensory input that we acquire through a lifetime of living.

It’s interesting to think about the ways in which our brains are better, and the ways in which the big data systems of the Amazons and Googles of the world are better, at making sense of it all and forming new ideas [5].

Footnotes

[1] Re: Google indexing the entire web, I believe that because Google has become such a ubiquitous part of first-world culture, we rarely think about what a massive technical achievement that was. Recently I’ve been reading Steven Levy’s excellent new book “In the Plex”, which helped remind me of the magnitude of this achievement.

[2] I’m sure Amazon’s recommendation system is actually quite a bit more involved than this.

[3] The story of how I “found” Jeff is actually quite ironic considering the topic of this blog entry. I didn’t find him through any sort of enterprise knowledge management analytics system; I found him because my friend James Governor of RedMonk tweeted about having beers with Jeff in Vegas. Big data and analytics are great, but there is still a place for personal relationships and serendipity.

[4] You may actually also know Jeff from this “Smarter Planet” ad.

[5] This also reminds me of a chat with a friend who went to Google in the mid-2000s. I asked him what he thought and he said, “I think we’re building Skynet.” I laughed. He then said, “No, I’m serious, I think we’re building Skynet.” Hopefully it concurs with the “Don’t be evil” bit 🙂
