


@BigDataExpo: Blog Feed Post

Who Coined the Term Big Data?

I like putting faces to names


Steve Lohr did the research and wrote an article about the origin of the term Big Data in The New York Times. I couldn't resist the temptation to put faces to the names. Right or wrong, all the facts are from his article.

Big Data History

His first step in the research was to contact Fred R. Shapiro

Fred R. Shapiro is a world-recognized authority on quotations and on reference in general. He edited the award-winning Oxford Dictionary of American Legal Quotations.

But Mr. Shapiro couldn’t find anything … crisp and definitive. The term Big Data is so generic that the hunt for its origin was not just an effort to find an early reference to those two words being used together. Instead, the goal was to find an early use of the term that suggests its present connotation: not just a lot of data, but different types of data handled in new ways.

Tracing the origins of Big Data shows how the field of etymology itself has evolved, according to Mr. Shapiro. The Yale researcher began his word-hunting nearly 35 years ago, as a student at Harvard Law School, poring through the library stacks.

Next he was contacted by Francis X. Diebold

Meanwhile, Francis X. Diebold, an economist at the University of Pennsylvania, got in touch and even wrote a paper with the mildly tongue-in-cheek title “I Coined the Term ‘Big Data’.” Mr. Diebold staked a claim based on his paper “Big Data Dynamic Factor Models for Macroeconomic Measurement and Forecasting,” presented in 2000 and published in 2003. But he later said that his follow-up inquiries proved to be “a journey of increasing humility.”

An important piece of information came from Douglas Laney

Douglas Laney is a veteran data analyst at Gartner. He is a research vice president for Gartner Research, where he covers business analytics solutions and projects, performance management, and data-governance-related issues.

He said the father of the term Big Data might well be John Mashey, who was the chief scientist at Silicon Graphics in the 1990s.

John Mashey it was!

John R. Mashey was the chief scientist at Silicon Graphics. He gave hundreds of talks to small groups in the middle and late 1990s to explain the concept and, of course, pitch Silicon Graphics products. Here are the slides of one such talk, “Big Data and the Next Wave of Infrastress,” from 1998. This is what he had to say …

…I was using one label for a range of issues, and I wanted the simplest, shortest phrase to convey that the boundaries of computing keep advancing…


More Stories By Udayan Banerjee

Udayan Banerjee is CTO at NIIT Technologies Ltd, an IT industry veteran with more than 30 years' experience. He blogs at
The blog focuses on emerging technologies like cloud computing, mobile computing, social media aka web 2.0 etc. It also contains stuff about agile methodology and trends in architecture. It is a world view seen through the lens of a software service provider based out of Bangalore and serving clients across the world. The focus is mostly on...

  • Keep the hype out and project a realistic picture
  • Uncover trends not very apparent
  • Draw conclusions from real-life experience
  • Point out fallacies & discrepancies when I see them
  • Talk about trends which I find interesting
