IEEE Member Mark Davis has devoted the last two decades to advancing technologies for information retrieval, machine learning, and natural-language processing—key elements needed to process and prioritize today’s mountains of data. So it only makes sense that he would also be a driving force in the big-data movement.
Davis is the leader of the IEEE Cloud Computing Initiative’s big-data track. The initiative is part of the Future Directions Committee, IEEE’s R&D arm. In this role, he advances technologies such as storage and databases to keep up with the growing amount of data collected from sensors, smart devices, and other sources. He also works to implement standards and is looking into how to prevent security breaches and other unintended consequences that may arise as this field continues to grow.
Davis is also helping to launch IEEE Transactions on Big Data, a journal that will highlight related research within IEEE. It’s expected to debut early next year.
Outside of his volunteer work with the organization, he is a distinguished engineer and leader of big-data technologies for Dell Software, in Santa Clara, Calif.
“Traditional databases and data warehousing technologies are not able to cope with the scale, variety, and complexity of data in the modern world,” he says. “My entire career has been dedicated to the creation and expansion of intelligent systems.”
Technologies used for massaging big data can more intelligently piece together kernels of information in what’s collected, which can greatly improve smart systems such as smart grids and make ads—like them or not—more personal.
DEFINING BIG DATA
For Davis, the term “big data” has become hyped without a clear meaning. “The broader definition of big data is data that is too large and complex to be handled by traditional databases,” he explains. “Those of us who work in this field cope with scale by processing chunks of data independently and then merging the results. We cope with variety by using new techniques to enrich data, such as natural-language processing and machine learning methods that produce new metadata based on an intelligent analysis of content.”
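As a rough illustration of the chunk-and-merge approach Davis describes, the Python sketch below counts search terms by processing small chunks of records independently and then merging the partial results. The function names and sample queries are hypothetical, chosen only for illustration; they are not drawn from any system Davis has worked on, and a production system would spread the chunks across many machines rather than loop over them in one process.

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, chunk_size):
    """Yield successive slices of `records`, each at most `chunk_size` long."""
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


def count_terms(chunk):
    """Process one chunk on its own: count the terms it contains."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Merge two partial results by adding their per-term counts."""
    return left + right


def count_terms_in_chunks(records, chunk_size=2):
    """Count terms chunk by chunk, then merge the partial counts."""
    # Each chunk is handled independently, so this step could run in parallel.
    partials = [count_terms(chunk) for chunk in split_into_chunks(records, chunk_size)]
    return reduce(merge_counts, partials, Counter())


# Hypothetical sample data standing in for a much larger collection.
queries = [
    "smart grid sensors",
    "smart devices",
    "grid storage",
    "sensors and storage",
    "smart grid",
]
totals = count_terms_in_chunks(queries)
```

Because each chunk is counted without reference to the others, the merged totals match what a single pass over all the records would produce, which is what lets this style of processing scale out.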
Traditional databases, for example, are used to complete bank transactions, which in their simplest form tally the balance of an account. By comparison, big-data tools do much more. Search engines like Google and Yahoo collect data virtually every time a pin drops on the Internet (say, when an e-mail is written or a particular term is searched). The engines then transform that data into meaningful information.
HERE COMES THE FUTURE
For Davis, an interest in engineering began at home. His father was an electrical engineer and owned an HP 3000 computer. “When I was 10, he programmed a game that guessed the animal you were thinking of. It was a simple intelligent system, but it demonstrated to me the potential of technology,” Davis says.
He went on to earn a bachelor’s in electrical engineering from New Mexico State University, in Las Cruces, in 1988. After graduating, he went to Fiji for a two-year stint in the Peace Corps, teaching high school physics. He returned to his alma mater in 1994 to pursue a master’s in EE, after which he spent five years in computer R&D for the university’s Computing Research Laboratory, with funding from the U.S. intelligence community. (The lab is no longer in operation.)
“I started working in information retrieval and natural-language processing systems in the ’90s, when the only people interested in those topics were librarians and spies,” he says with a laugh.
In 1999, he joined Microsoft, in Redmond, Wash., as a program manager working on SharePoint, a portal technology with a search engine. He left a year later to join InXight—a spin-off of Xerox PARC, in Palo Alto, Calif.—as principal engineer. It was at this point that he began focusing on emerging technologies for big data.
“Big data, as we see it today, can be traced back to when Google was revamping its indexing of the Web using a new style of computing,” he says. “The upshot was that this method could drive context-aware advertising, e-mail spam detection, and even automatic language translation.”
In 2004, Davis founded Kitenga, a Santa Clara, Calif., start-up that developed natural-language processing and machine learning applications, enabling computers not only to understand human language but also to build upon what was learned. He joined Dell in his current position when it bought Kitenga in 2012.
That same year, IEEE asked him to join what became its Cloud Computing Initiative to focus on the growing area of big data. He’s now investigating the cybersecurity implications of big data, prompted by the explosion of data-gathering devices.
Davis’s rule of thumb: Be wary of how the expansion of big data will affect our security, but don’t obsess about it.