Hi Everyone! I selected the book “Big Data: A Revolution That Will Transform How We Live, Work, and Think” by Viktor Mayer-Schönberger and Kenneth Cukier to read. I chose this book because I had heard the term “big data” used very often, but struggled to understand exactly what it meant. Fortunately, this book provided a lot more clarity: it breaks down each aspect of big data, explains it, and gives many examples that let me see how it is applied in practice.
The book opens with Google Flu Trends, in which Google uses search queries to predict flu outbreaks, as an example of big data in action. The predictions are not always exact, which is characteristic of big data, but they are “good enough.”
Big data is a term for very large amounts of data, such as Google’s 3.5 billion search queries a day. Size matters because the more data there is, the more information can be extracted from it. Big data can also be used in more ways than it was initially collected for, and its secondary purpose frequently turns out to be more useful than the original one. Additionally, the more data there is, the less precise, or more “messy,” that data can be. If you have a small set of data, each piece needs to be very precise, but if you have a large set of data, the general trend will prevail either way. Having more data is more important than its exactitude because the tools used to measure, record, and analyze the data are also imperfect, making “messiness a practical reality we must deal with” (41). Lastly, the goal of big data is to discover correlation instead of causation. Causation answers “why,” while correlation answers “what,” which is “good enough” and serves its purpose.
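To make the point about messy data concrete, here is a minimal Python sketch. It is an illustration only and does not come from the book: the "true" value, the noise level, and the sample sizes are all made-up assumptions. It simulates noisy readings of a single quantity and compares the average of a small messy sample with the average of a very large messy one.

```python
import random
import statistics

def estimate_mean(true_value, noise_sd, n, rng):
    """Average n simulated readings of true_value, each with random noise."""
    readings = [rng.gauss(true_value, noise_sd) for _ in range(n)]
    return statistics.fmean(readings)

# All numbers below are hypothetical, chosen only for illustration.
TRUE_VALUE = 20.0   # the quantity we are trying to measure
NOISE_SD = 5.0      # every single reading is quite imprecise

rng = random.Random(42)  # fixed seed so the run is repeatable

small_messy = estimate_mean(TRUE_VALUE, NOISE_SD, 10, rng)
large_messy = estimate_mean(TRUE_VALUE, NOISE_SD, 100_000, rng)

print(f"10 messy readings:      error = {abs(small_messy - TRUE_VALUE):.3f}")
print(f"100,000 messy readings: error = {abs(large_messy - TRUE_VALUE):.3f}")
```

With this many readings, the average of the large sample typically lands very close to the true value even though each individual reading is off by several units. One caveat worth keeping in mind: this only works when the errors are random. If every reading is skewed in the same direction, collecting more of them does not average the bias away.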
Later in the book, the implications of big data are discussed. Three things are needed to put big data to use: collecting the data, having the skills to analyze it, and having the mindset and ideas to apply it. Most big-data companies embody one of these, but the most powerful companies, such as Amazon and Google, have all three. Google, for example, collects its search-query typos, has the idea to create a spell checker, and has the in-house skills to execute it (132).
As with anything, there are risks to using big data, and they threaten users’ privacy and free will. The movie Minority Report is used as an example of an extreme possibility: as big data makes increasingly accurate predictions, people could be prosecuted based on the likelihood that they will commit a crime in the future, even if they have not actually done anything yet.
The book concludes by discussing ways to control the risks of big data and what the future of big data looks like. One suggestion is for companies to hire individuals who advocate on behalf of users, so that people will want to keep using the company’s services.
Interesting (or scary?) Topic
I found the risks of big data to be the most interesting part of the book. The most obvious risk is the threat to one’s personal privacy, a risk I have heard debated by many. However, the depth of the tracked data was very shocking to me. Aside from Google searches and Facebook likes, the amount of heat you use in your house and how much money you spend at the gas station are recorded. After reading this book I understand how that data is useful to companies, but it does make me feel uneasy knowing that most aspects of my life are being recorded. Although companies will make this type of information anonymous by replacing names with a unique identification code, attaining true anonymity in big data is almost impossible. With all the data that surrounds an individual, even if only from their Google searches, it is not hard to work out who it is.
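A toy Python sketch can show why swapping names for ID codes does not guarantee anonymity. Everything in it is made up for illustration: the records, the field names, and the helper function `unique_ids` are hypothetical and are not taken from the book. The idea is simply that a single detail may be shared by several people, while a combination of details often points to exactly one person.

```python
from collections import Counter

# Hypothetical "anonymized" records: names are already replaced by ID codes.
records = [
    {"id": "A1", "zip": "10001", "birth_year": 1985, "gender": "F"},
    {"id": "B2", "zip": "10001", "birth_year": 1985, "gender": "M"},
    {"id": "C3", "zip": "10001", "birth_year": 1990, "gender": "F"},
    {"id": "D4", "zip": "10002", "birth_year": 1985, "gender": "F"},
    {"id": "E5", "zip": "10002", "birth_year": 1985, "gender": "F"},
]

def unique_ids(records, fields):
    """Return the IDs whose combination of `fields` appears exactly once."""
    keys = [tuple(r[f] for f in fields) for r in records]
    counts = Counter(keys)
    return [r["id"] for r, key in zip(records, keys) if counts[key] == 1]

# Looking at zip code alone, nobody stands out.
by_zip = unique_ids(records, ["zip"])

# Combining three ordinary details singles out most of the records.
by_all = unique_ids(records, ["zip", "birth_year", "gender"])

print("Unique by zip only:", by_zip)
print("Unique by zip + birth year + gender:", by_all)
```

In this toy data set, zip code alone identifies no one, but three of the five records become unique once zip code, birth year, and gender are combined. Anyone who knows those three details about a person from another source could then match them to the "anonymous" code.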
Currently, small data allows for profiling groups of people. The goal for big data, however, is to profile individuals rather than groups, allowing for more accuracy. This is concerning because this type of profiling is already being used in various ways, including algorithms that estimate how likely someone is to commit a crime. I have seen the movie Minority Report, and the thought of prosecuting someone for something they have not done is scary. Aren’t people supposed to be innocent until they actually do something? Yes, it would be better for society to prevent bad things before they happen, but that also puts free will at risk.
Would I Recommend?
Overall, I would recommend this book. As mentioned above, I selected the book with the intention of learning more about big data, and I can say that I am now confident in my understanding of it. Mayer-Schönberger and Cukier focus each chapter on one aspect of big data, and their intriguing examples and explanations helped me develop a strong understanding of each subject. They strategically reference subjects from other chapters, but do it in a way that helped me further understand the topic at hand. I also found it interesting to learn how Google used big data to develop its spell checker and how reCAPTCHA (the system that asks you to type the squiggly letters in the box to make sure you’re not a robot) uses its platform to improve the digitization of books. I had no idea how much big data influences our daily lives, nor did I know about the countless ways the information is used.
If you do not feel like you fully understand the concept of big data, this is the book for you. You are not expected to have any prior knowledge of big data before reading it. Everything is clearly explained and laid out for you.