What Do You Mean, Big Data?
Big Data is an inescapable buzzword for anyone even remotely involved in the world of business or technology in 2016. Because the phrase is so ubiquitous, and so generic without proper context, it can be hard to understand exactly what this movement is all about. Big Data as it exists today is a growing and emergent field. New technology is constantly being introduced that either creates new data streams or offers a new way to make sense of the data already being produced. What, exactly, should we do with all this information? That question is both the central problem and the proposed solution of Big Data: to capture, analyze, and make sense of the wealth of data at our disposal.
The first recorded attempts to quantify and explain large and exponentially growing sets of data came in the 1940s, when Fremont Rider forecast exponential growth for libraries. By his estimate, American university libraries had been doubling in size every sixteen years, and at that rate he predicted that libraries would eventually require an outrageous number of employees to maintain their volumes. In 1961 Derek Price made a similar observation about the growth of recorded scientific knowledge, stating that it grows exponentially because each scientific discovery spawns another series of discoveries. Placing an emphasis on capturing and understanding data, and on overcoming the complexities involved in that process, is evidently not a new idea.
In 1990 Peter Denning asked the question “What machines can we build that will monitor the data stream of an instrument, or sift through a database of recordings, and propose for us a statistical summary of what’s there? … it is possible to build machines that can recognize or predict patterns in data without understanding the meaning of the patterns. Such machines may eventually be fast enough to deal with the large data streams in real time … With these machines, we can significantly reduce the number of bits that must be saved, and we can reduce the hazard of losing latent discoveries from burial in an immense database.” Making sense of Big Data in a modern way, then, isn’t a brand new idea either. The blueprint for the modern notion and science of Big Data has been around for a few decades.
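To make Denning's idea a little more concrete, here is a minimal sketch of a program that watches a stream of readings and keeps only a statistical summary of it, never the readings themselves. It is an illustration, not anything Denning proposed: the names `RunningSummary` and `summarize` are hypothetical, and the running mean and variance use Welford's online method, which is simply one common way to do this.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningSummary:
    """Summary of a stream that never stores the individual readings."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean
    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, value: float) -> None:
        # Welford's online update: adjust the mean and m2 one reading at a time.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def variance(self) -> float:
        # Sample variance; defined as 0.0 here when fewer than two readings exist.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def summarize(stream):
    """Consume any iterable of numbers and return its RunningSummary."""
    summary = RunningSummary()
    for reading in stream:
        summary.update(reading)
    return summary


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    result = summarize([2, 4, 4, 4, 5, 5, 7, 9])
    print(result.count, result.mean, result.variance)
```

However long the stream runs, the summary stays the same small size, which is the sense in which such a machine could "reduce the number of bits that must be saved."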
Today, data originates from an increasing number of places. It can come from our internet browsers, PCs, mobile devices, fitness trackers such as Fitbits, cameras, microphones, and wireless networks, and new devices that create and transmit data are being introduced to the market all the time. Looking at all the places data could come from, the supply seems endless. Individuals and companies want to harness and make sense of all this data in order to observe patterns, analyze trends, and hopefully predict outcomes based on past interactions. Companies in every industry can see value in being able to capture and quantify the increasingly available streams of data produced by our electronic devices.
In 2015, 90% of organizations reported investments in Big Data initiatives, and two-thirds of those organizations claimed that these initiatives had a measurable impact on revenue. In the current climate, organizations see Big Data and analytics as a way to gain a competitive advantage. The companies capitalizing most on Big Data initiatives are looking beyond transactional data and putting the data they collect to other uses, such as creating new business models or selling data to external companies. The most commonly used types of data are location data and text data. This is Big Data as we see it today: leveraging technology to analyze new data streams and fine-tuning business models based on the findings. As we have already seen, this isn’t a new concept, just the latest iteration.
A concern with Big Data moving forward is privacy, and how much of it we are willing to compromise in the pursuit of gathering more and more data. Analyzing data that is captured passively or without the consent of the individual raises privacy concerns, even though it could provide some very useful insights into trends and behaviors. The Organisation for Economic Co-operation and Development’s privacy guidelines also dictate that data should be discarded once its original purpose is achieved. The idea of discarding this data goes against a central tenet of Big Data. Besides the primary problems facing Big Data, which are capturing all the data and having a means of making sense of it, privacy concerns for individuals and companies are at the forefront of the issues to be addressed.
Other questions for Big Data moving forward are where data sets will come from and how we will capture and analyze them. With technology changing at an ever-increasing rate, it’s almost impossible to forecast the what and how of Big Data initiatives. What is likely to remain true is that the amount of data we produce will continue to increase, and we will continue to place a premium on gathering and understanding that data. The philosophy of Big Data is an old one: gather data and analyze it in a way that presents some sort of competitive advantage. The execution of these initiatives is what remains fluid, dynamic, and challenging.