The Big Picture of Big Data

Is it hype or something more?

29 September 2014

As someone who gives a lot of presentations about the future of technology, I routinely ask my audiences, largely managers and business professionals, whether they have heard about big data. Most raise their hands. But when I ask how many of them know exactly what “big data” means, hardly any hands go up. Such notoriety without understanding is a common characteristic of newly voiced ideas.


In an address to the U.S. National Research Council (NRC) on 11 January 2007, Jim Gray, Microsoft’s senior scientist, argued that, “With an exaflood of unexamined data and teraflops of cheap computing power, we should be able to make many valuable discoveries simply by searching all that information for unexpected patterns.” He called such pattern searches “data-intensive scientific discovery,” and he proposed that the methodology be formally acknowledged as a fourth paradigm of scientific research, in addition to observation, experimentation, and computer simulation.

Jim Gray disappeared while sailing off the California coast three weeks after his NRC presentation and has never been found. As a spontaneous tribute, his colleagues created a website on which dozens of scientists and scholars posted papers supporting Gray’s concept, which they dubbed “big data.” Those papers were published in 2009 in the book The Fourth Paradigm: Data-Intensive Scientific Discovery. A year later, the NRC formally designated the big-data concept as the fourth paradigm of scientific research.


Since then, the term has gained traction. During the 2012 World Economic Forum, a gathering of international leaders in academia, business, and politics, one keynote speaker described big data as “a new class of asset, like currency or gold.”

In 2012, Gartner, an IT research firm, forecast that by 2015 big data would directly generate 1.9 million new jobs and indirectly generate 5.7 million additional positions. And, more recently, the director of Harvard University’s Institute for Quantitative Social Sciences has written that “the march of quantification, made possible by cloud computing and enormous new sources of data, will sweep through academia, business, and government; no area will be left untouched.”

From the outset, the behavioral and social sciences have been particularly energized by the research possibilities offered by big-data pattern searches. So have the folks in marketing, who are confident that such searches will enable them to discover many useful insights into consumer behavior and motivation.

The 24 July 2014 issue of InfoWorld Tech Watch cited a recent survey by Gartner, which found that 64 percent of large enterprises are investing in so-called big data, but that 60 percent of them admitted they don’t have a clue as to what to do with it yet. Because of this indiscriminate corporate enthusiasm, the term “big data” now sits at the top of what Gartner refers to as the “hype cycle.”

But some have figured out what to do with data. For example, one U.S. national retailer is reported to have discovered that a sudden increase in the purchase of cotton balls is a remarkably accurate predictor of impending pregnancy among women of a certain age, and it has begun marketing appropriate products to them. Such transparent, sometimes shameless use of data-mining discoveries in consumer marketing is already eliciting increased concern from privacy advocates.


The use of data-intensive discovery by the social and behavioral sciences, however, has already begun to validate long-standing theories and reveal unexpected truths. For example, family counselors have long reported anecdotal information that disagreements about money are the most common source of friction between spouses. Now, a big-data pattern search has shown that couples who have widely differing credit scores when they marry have much higher divorce rates than do couples with equally high credit scores.

Evolv, a San Francisco company that helps companies improve their workforce using analytics, used pattern searches to discover that a number of commonly used job-screening criteria—such as long periods of unemployment, frequent job-hopping, felony convictions, and low intelligence test scores—are not, contrary to conventional wisdom, accurate predictors of workplace performance.

These are just two examples of how big data can disprove theories and shed new light on matters we thought we understood. Policymakers, economists, and futurists like me derive our models of reality from scattered points of focused scholarly information, while filling in the unknown with our (more or less) informed intuition.


Jim Gray’s data-intensive scientific discovery, on the other hand, has the potential to produce many insights from a single pattern search of well-chosen data sets.

In fact, because data-intensive scientific discovery will inevitably be applied across all domains of private and public enterprise, big-data pattern searches will almost certainly call conventional wisdom into question in every field of endeavor.

In education, big-data analyses of the outcomes of schooling will ultimately tell us what mix of curricular content and methods of delivery produces the best results. As the new regimen for medical services becomes better informed by big-data assessments of patient outcomes, the quality of health care will rise as costs fall.

In every institution, practitioners and decision-makers will increasingly be awash in the discoveries produced by big data, some of which will confirm traditional predispositions, and some of which will not.

Of course, all these data outputs won’t magically appear on decision-makers’ desks. They will be mined and refined by armies of data scientists, and interpreted and applied by cadres of math modelers and quantitative analysts. In the next five years, there will be cultural conflict between the increasingly powerful “quants”—quantitative analysts—and traditional managers.

Erik Brynjolfsson, director of the Center for Digital Business at MIT’s Sloan School of Management, has spent decades assessing the impact of IT on economic performance. He has recently demonstrated that companies that adopt data-directed decision-making enjoy a 5 to 6 percent boost in productivity. To maintain its legitimacy, 21st-century management will have to embrace the newly established paradigm of data-intensive scientific discovery.

David Pearce Snyder is a consulting futurist and has been a contributing editor for The Futurist magazine since 1979. He will be speaking at MiniTrends 2014, a futures symposium in Austin, Texas, on 26 September.
