"Big Data" has all of the characteristics of your typical business buzzword: fairly new, widely used, and almost universally misunderstood. So what is big data and how is it useful? To find out, I read the book that seems to be the definitive guide for business people, Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schönberger and Kenneth Cukier.
Unfortunately, even this book admits that there is no formal definition of big data. There are, however, some guidelines. The idea is that data collection has become so ubiquitous that the sheer volume is more than traditional database management systems can handle. From a technical standpoint, new tools are needed just to extract basic information from this data.
The main focus of the book is the three shifts of mindset that must occur in this new world of data. The first is the ability to analyze vast amounts of data about a topic, rather than small subsets, prior to making decisions (seems intuitive enough, right?). The second is a willingness to settle for less precision in data collection, as small mistakes presumably will not matter as much at this scale as they did in the past. And the third is an understanding that correlations, rather than causality, might be the best conclusions we can draw.
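To make the second shift a little more concrete, here is a minimal Python sketch of one reason sloppier data might matter less at scale. It is my own illustration, not something from the book: the "true" value, the sample sizes, and the noise levels are all made up, and it assumes the measurement errors are random and unbiased. A systematic bias would not average out no matter how much data you collect.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical "true" value we are trying to measure (e.g. an average order size).
TRUE_MEAN = 50.0


def noisy_readings(n, noise_sd):
    """Return n readings of TRUE_MEAN, each off by random Gaussian error."""
    return [random.gauss(TRUE_MEAN, noise_sd) for _ in range(n)]


# Small, carefully collected sample: fairly precise readings, but few of them.
small_precise = noisy_readings(n=30, noise_sd=5.0)

# Large, messy collection: much sloppier readings, but far more of them.
large_messy = noisy_readings(n=100_000, noise_sd=15.0)

# How far each estimate of the average lands from the true value.
small_error = abs(statistics.mean(small_precise) - TRUE_MEAN)
large_error = abs(statistics.mean(large_messy) - TRUE_MEAN)

print(f"small, precise sample: off by {small_error:.3f}")
print(f"large, messy data set: off by {large_error:.3f}")
```

The individual readings in the large set are three times noisier, yet the expected error of its average is far smaller, because random error shrinks roughly with the square root of the number of readings.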
Overall, the book is a fairly short read with plenty of examples. It does not come across as evangelizing big data. The simple fact is that big data is available and smart companies are already using it, so being familiar with it is imperative for data-driven decision making. As a former database administrator, I found the sections describing "NoSQL" very interesting. Essentially, these databases do not require a preset structure in order to work, unlike the fixed tables of a relational database (the name is usually read as "non-SQL" or "not only SQL", SQL being Structured Query Language). There are also some serious privacy concerns that are raised. Keep in mind, this was written before the news of NSA e-mail surveillance was made public. Though the focus of big data is drawing overarching conclusions about groups of people rather than individuals, the point is taken that almost everything we do is now being recorded and stored. The book even hints at a future similar to Minority Report, where big data is used to predict crimes before they happen. Somewhat scary to think about; at least, I'm assuming that was the thrust of Minority Report. I saw it in theaters a decade ago and can barely remember.
I definitely recommend the book for anyone working in business analytics. In some ways, big data does make me very excited (even though it may render my SQL skills useless). I love doing things more efficiently and intelligently, and big data gives marketers and business people that ability. However, the risks are real: the privacy concerns associated with data collection, and the potential for misuse by governments, are serious. Let's hope we see more of the former than the latter.