Happy New Year, everyone! In this blog post, I’ll be reviewing Big Data by Viktor Mayer-Schönberger and Kenneth Cukier.
Rather than focusing on one specific company or industry, this book revolves around “big data” and its effects on society as a whole. Before going any further, the authors explain what “big data” actually means: the exponential growth of stored information and the accompanying quantification of our world. As storage capacity increases and the cost of storing data decreases, companies find themselves with more information than ever before. Simultaneously, more and more aspects of our daily lives are being broken down into numbers, such as the places we go and the media we consume. In turn, Mayer-Schönberger and Cukier argue that we are gradually transitioning to a new era altogether, one in which data will completely alter how we understand our surroundings.
The book then segues into how big data will create such monumental changes. First, more information means stronger predictions and greater insights. Rather than using a sample to represent a population, as we used to do in a “small data” world, we now have the means to measure entire populations. Technology has expanded to the point where n = all is no longer unrealistic, and we can do away with the intrinsic biases and uncertainties of sampling. Second, the sheer quantity of information will inherently make data much messier. Spreadsheets may not be perfectly organized, and surveys may include hundreds of different answers, but the chaos is something we must come to accept. In learning to abandon our desire for precision and exactitude, we will allow ourselves to make full use of all the new information. Lastly, we will begin to prioritize correlation over causation; in other words, we will care more about what is happening than why. When Billy Beane’s “Moneyball” team wanted to win more baseball games, they simply looked for players who got on base the most, and never concerned themselves with how those players actually managed to do so. A hit, a walk, and an error all led to the same favorable outcome, and big data will lead us to think about our own decisions in the same way. With the proper mindset and approach, the authors argue, these changes can significantly improve the overall quality of our society.
Placed in the wrong hands, however, big data can be equally problematic. Consequently, the authors spend the remainder of the book addressing these potential issues and offering solutions to combat them. Starting on an individual level, the growing amount of available information will increasingly threaten our privacy. Facebook already knows most of our social lives, Google Earth has our homes digitized online, and similar advancements will continue to emerge. As we use more services on the Internet and more of our actions become quantifiable, less and less of our personal experience is protected. To address that threat, Mayer-Schönberger and Cukier advocate for an entirely new system of privacy standards: instead of requiring consumers to provide their consent, regulations should focus on the companies using the data and hold them accountable for treating it properly. This shift would free companies to explore big data while still maintaining users’ safety and confidentiality. Next, on a societal level, they warn of a future where data-based predictions are so strong that individuals receive unfair treatment because of what they may end up doing. People could be rejected from jobs for which they are statistically unfit, refused insurance because they are expected to be poor drivers, or even prosecuted for crimes that they are highly likely to commit. Such treatment punishes people for actions they have not taken and borders on denying humans their free will. As a result, the authors suggest creating a new governmental agency that monitors information use and keeps these events from unjustly taking place. Altogether, these solutions would allow us to benefit from big data’s possibilities without falling prey to its potentially harmful aspects.
I found Chapter Six, which delves into the future economic value of big data, particularly interesting and relevant to our course. In this section, the book argues that a company’s data will soon become just as much of an asset as its equipment or inventory. For some, that process has already begun: companies like Farecast, which uses data on millions of previous flights to help customers find low airline prices, and Inrix, which analyzes car movements to provide real-time traffic reports, could not operate without their data. This holds true for larger organizations as well, as Facebook, Google, Salesforce, and Zynga (among others) derive much of their profitability from the information that they collect. Whether by selling it to other entities or by using it to improve their own products, such organizations have come to view their data as a vital resource. These are the exact companies that we will be visiting on our field study, and not only have they embraced the future of big data, but they have also clearly learned how to monetize it.
In acknowledging this new facet of business, the authors illustrate its likely effects, the first of which is a new marketplace for the exchange of data. Individuals and companies will begin to pay each other for certain information, and new platforms and regulations will emerge to facilitate that. Equally expected is the rise of startups dedicated to analytics, which can fill the data gaps in our current society and profit from doing so (like Farecast and Inrix). Lastly, as we have recently begun to experience, even the most old-fashioned of industries will be disrupted by information technology. Big data will gradually seep into the core of areas such as agriculture, alcohol, and entertainment. Overall, I felt that Chapter Six offered many key insights into the future of big data in our global economy as well as in the companies that we will be studying.
I would personally recommend reading this book, but only with a very specific mindset. Unfortunately, in terms of the writing, its 200 pages are executed relatively poorly: there are misspelled words, grammatical errors, and syntactical issues that more editing could have removed. Moreover, the book is extremely repetitive, often hammering home its points with a seemingly endless number of analogies. I think that the authors could easily have cut its length in half and still conveyed the exact same messages. These flaws, in my opinion, make the reader’s experience less enjoyable overall.
Despite those stylistic issues, the book is still extremely informative. It covers all the bases, never shying away from complex topics and using concrete evidence to support its claims. Although “big data” is an ambiguous and far-reaching term, the authors make it easy to comprehend and articulate. They also go a step further than simply laying out its pros and cons, demonstrating how to maximize the benefits and minimize the drawbacks. For these reasons, I imagine that this book is the best current option for anyone looking to learn about big data’s foreseeable impacts.
Therefore, I would not advise reading this book for pleasure: its writing will bog you down and drag you through an onslaught of typos and redundant examples. However, if you are able to look past that and open it specifically with the goal of learning about big data, Mayer-Schönberger and Cukier will certainly not disappoint.