SDN Journal Authors: Liz McMillan, TJ Randall, David Paquette, Hollis Tibbetts, Lori MacVittie

Related Topics: @BigDataExpo, Microservices Expo, Containers Expo Blog, @CloudExpo, SDN Journal, @DevOpsSummit

@BigDataExpo: Article

Shots Across the Data Lake

Big Data Analytics Range War

Range Wars
The settling of the American West brought many battles between ranchers and farmers over access to water. The farmers claimed land near the water and fenced it to protect their crops. But the farmers' fences blocked the ranchers' cattle from reaching the water. Fences were cut; shots were fired; it got ugly.

About a century later, with the first tech land rush of the late1980s and early '90s - before the Web - came battles between those who wanted software and data to be centrally controlled on corporate servers and those who wanted it to be distributed to workers' desktops. Oracle and IBM versus Microsoft and Lotus. Database versus Spreadsheet.

Now, with the advent of SoMoClo (Social, Mobile, Cloud) technologies and the Big Data they create, have come battles between groups on different sides of the "Data Lake" over how it should be controlled, managed, used, and paid for. Operations versus Strategy. BI versus Data Science. Governance versus Discovery.  Oversight versus Insight.

The range wars of the Old West were not a fight over property ownership, but rather over access to natural resources. The farmers and their fences won that one, for the most part.

Those tech battles in the enterprise are fights over access to the "natural" resource of data and to the tools for managing and analyzing it.

In the '90s and most of the following decade, the farmers won again. Data was harvested from corporate systems and piled high in warehouses, with controlled accessed by selected users for milling it into Business Intelligence.

But now in the era of Big Data Analytics, it is not looking so good for the farmers. The public cloud, open source databases, and mobile tablets are all chipping away at the centralized command-and-control infrastructure down by the riverside.  And, new cloud based Big Data analytics solution providers like BigML, Yottamine (my company) and others are putting unprecedented analytical power in the hands of the data ranchers.

A Rainstorm, Not a River
Corporate data is like a river - fed by transaction tributaries and dammed into databases for controlled use in business irrigation.

Big Data is more like a relentless rainstorm - falling heavily from the cloud and flowing freely over and around corporate boundaries, with small amounts channeled into analytics and most draining to the digital deep.

Many large companies are failing to master this new data ecology because they are trying to do Big Data analytics in the same way, with the same tools as they did with BI, and that will never work. There is a lot more data, of course, but it is different data - tweets, posts, pictures, clicks, GPS, etc., not RDBMS records - and different analytics - discovery and prediction, not reporting and evaluation.

Successfully gleaning business value from the Big Data rainstorm requires new tools and maybe new rules.

Embracing Shadows
These days, tech industry content readers frequently see the term "Shadow IT" referring to how business people are using new technologies to process and analyze information without the help of "real IT".  SoMoClo by another, more sinister name.  Traditionalists see it as a threat to corporate security and stability and modernists a boon to cost control and competitiveness.

But, it really doesn't matter which view is right.  Advanced analytics on Big Data takes more computing horsepower than most companies can afford.  Jobs like machine learning from the Twitter Fire Hose will take hundreds or even thousands of processor cores and terabytes of memory (not disk!) to build accurate and timely predictive models.

Most companies will have no choice but to embrace the shadow and use AWS or some other elastic cloud computing service, and new, more scalable software tools to do effective large scale advanced analytics.

Time for New Rules?
Advanced Big Data analytics projects, the ones of a scale that only the cloud can handle, are being held back by reservations over privacy, security and liability that in most cases turn out to be needless concerns.

If the data to be analyzed were actual business records for customers and transactions as it is in the BI world, those concerns would be reasonable.  But more often than not, advanced analytics does not work that way.  Machine learning and other advanced algorithms do not look at business data. They look at statistical information derived from business data, usually in the form of an inscrutable mass of binary truth values that is only actionable to the algorithm.  That is what gets sent to the cloud, not the customer file.

If you want to do advanced cloud-scale Big Data analytics and somebody is telling you it is against the rules, you should look at the rules.  They probably don't even apply to what you are trying to do.

First User Advantage
Advanced Big Data analytics is sufficiently new and difficult that not many companies are doing much of it yet.  But where BI helps you run a tighter ship, Big Data analytics helps you sink your enemy's fleet.

Some day, technologies like high performance statistical machine learning will be ubiquitous and the business winners will be the ones who uses the software best.  But right now, solutions are still scarce and the business winners are ones willing to use the software at all.

More Stories By Tim Negris

Tim Negris is SVP, Marketing & Sales at Yottamine Analytics, a pioneering Big Data machine learning software company. He occasionally authors software industry news analysis and insights on Ulitzer.com, is a 25-year technology industry veteran with expertise in software development, database, networking, social media, cloud computing, mobile apps, analytics, and other enabling technologies.

He is recognized for ability to rapidly translate complex technical information and concepts into compelling, actionable knowledge. He is also widely credited with coining the term and co-developing the concept of the “Thin Client” computing model while working for Larry Ellison in the early days of Oracle.

Tim has also held a variety of executive and consulting roles in a numerous start-ups, and several established companies, including Sybase, Oracle, HP, Dell, and IBM. He is a frequent contributor to a number of publications and sites, focusing on technologies and their applications, and has written a number of advanced software applications for social media, video streaming, and music education.

@CloudExpo Stories
Fifty billion connected devices and still no winning protocols standards. HTTP, WebSockets, MQTT, and CoAP seem to be leading in the IoT protocol race at the moment but many more protocols are getting introduced on a regular basis. Each protocol has its pros and cons depending on the nature of the communications. Does there really need to be only one protocol to rule them all? Of course not. In his session at @ThingsExpo, Chris Matthieu, co-founder and CTO of Octoblu, walk you through how Oct...
SYS-CON Events announced today that Embotics, the cloud automation company, will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Embotics is the cloud automation company for IT organizations and service providers that need to improve provisioning or enable self-service capabilities. With a relentless focus on delivering a premier user experience and unmatched customer support, Embotics is the fas...
More and more brands have jumped on the IoT bandwagon. We have an excess of wearables – activity trackers, smartwatches, smart glasses and sneakers, and more that track seemingly endless datapoints. However, most consumers have no idea what “IoT” means. Creating more wearables that track data shouldn't be the aim of brands; delivering meaningful, tangible relevance to their users should be. We're in a period in which the IoT pendulum is still swinging. Initially, it swung toward "smart for smar...
@ThingsExpo has been named the Top 5 Most Influential M2M Brand by Onalytica in the ‘Machine to Machine: Top 100 Influencers and Brands.' Onalytica analyzed the online debate on M2M by looking at over 85,000 tweets to provide the most influential individuals and brands that drive the discussion. According to Onalytica the "analysis showed a very engaged community with a lot of interactive tweets. The M2M discussion seems to be more fragmented and driven by some of the major brands present in the...
Traditional on-premises data centers have long been the domain of modern data platforms like Apache Hadoop, meaning companies who build their business on public cloud were challenged to run Big Data processing and analytics at scale. But recent advancements in Hadoop performance, security, and most importantly cloud-native integrations, are giving organizations the ability to truly gain value from all their data. In his session at 19th Cloud Expo, David Tishgart, Director of Product Marketing ...
SYS-CON Events announced today that Pulzze Systems will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Pulzze Systems, Inc. provides infrastructure products for the Internet of Things to enable any connected device and system to carry out matched operations without programming. For more information, visit http://www.pulzzesystems.com.
The Quantified Economy represents the total global addressable market (TAM) for IoT that, according to a recent IDC report, will grow to an unprecedented $1.3 trillion by 2019. With this the third wave of the Internet-global proliferation of connected devices, appliances and sensors is poised to take off in 2016. In his session at @ThingsExpo, David McLauchlan, CEO and co-founder of Buddy Platform, discussed how the ability to access and analyze the massive volume of streaming data from millio...
SYS-CON Events announced today that Interface Masters Technologies, a leader in Network Visibility and Uptime Solutions, will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Interface Masters Technologies is a leading vendor in the network monitoring and high speed networking markets. Based in the heart of Silicon Valley, Interface Masters' expertise lies in Gigabit, 10 Gigabit and 40 Gigabit Eth...
Successful digital transformation requires new organizational competencies and capabilities. Research tells us that the biggest impediment to successful transformation is human; consequently, the biggest enabler is a properly skilled and empowered workforce. In the digital age, new individual and collective competencies are required. In his session at 19th Cloud Expo, Bob Newhouse, CEO and founder of Agilitiv, will draw together recent research and lessons learned from emerging and established ...
Enterprise IT has been in the era of Hybrid Cloud for some time now. But it seems most conversations about Hybrid are focused on integrating AWS, Microsoft Azure, or Google ECM into existing on-premises systems. Where is all the Private Cloud? What do technology providers need to do to make their offerings more compelling? How should enterprise IT executives and buyers define their focus, needs, and roadmap, and communicate that clearly to the providers?
As software becomes more and more complex, we, as software developers, have been splitting up our code into smaller and smaller components. This is also true for the environment in which we run our code: going from bare metal, to VMs to the modern-day Cloud Native world of containers, schedulers and microservices. While we have figured out how to run containerized applications in the cloud using schedulers, we've yet to come up with a good solution to bridge the gap between getting your conta...
One of biggest questions about Big Data is “How do we harness all that information for business use quickly and effectively?” Geographic Information Systems (GIS) or spatial technology is about more than making maps, but adding critical context and meaning to data of all types, coming from all different channels – even sensors. In his session at @ThingsExpo, William (Bill) Meehan, director of utility solutions for Esri, will take a closer look at the current state of spatial technology and ar...
SYS-CON Events announced today that Streamlyzer will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. Streamlyzer is a powerful analytics for video streaming service that enables video streaming providers to monitor and analyze QoE (Quality-of-Experience) from end-user devices in real time.
You have great SaaS business app ideas. You want to turn your idea quickly into a functional and engaging proof of concept. You need to be able to modify it to meet customers' needs, and you need to deliver a complete and secure SaaS application. How could you achieve all the above and yet avoid unforeseen IT requirements that add unnecessary cost and complexity? You also want your app to be responsive in any device at any time. In his session at 19th Cloud Expo, Mark Allen, General Manager of...
SYS-CON Media announced today that @WebRTCSummit Blog, the largest WebRTC resource in the world, has been launched. @WebRTCSummit Blog offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. @WebRTCSummit Blog can be bookmarked ▸ Here @WebRTCSummit conference site can be bookmarked ▸ Here
Cloud based infrastructure deployment is becoming more and more appealing to customers, from Fortune 500 companies to SMEs due to its pay-as-you-go model. Enterprise storage vendors are able to reach out to these customers by integrating in cloud based deployments; this needs adaptability and interoperability of the products confirming to cloud standards such as OpenStack, CloudStack, or Azure. As compared to off the shelf commodity storage, enterprise storages by its reliability, high-availabil...
The IoT industry is now at a crossroads, between the fast-paced innovation of technologies and the pending mass adoption by global enterprises. The complexity of combining rapidly evolving technologies and the need to establish practices for market acceleration pose a strong challenge to global enterprises as well as IoT vendors. In his session at @ThingsExpo, Clark Smith, senior product manager for Numerex, will discuss how Numerex, as an experienced, established IoT provider, has embraced a ...
DevOps is being widely accepted (if not fully adopted) as essential in enterprise IT. But as Enterprise DevOps gains maturity, expands scope, and increases velocity, the need for data-driven decisions across teams becomes more acute. DevOps teams in any modern business must wrangle the ‘digital exhaust’ from the delivery toolchain, "pervasive" and "cognitive" computing, APIs and services, mobile devices and applications, the Internet of Things, and now even blockchain. In this power panel at @...
So you think you are a DevOps warrior, huh? Put your money (not really, it’s free) where your metrics are and prove it by taking The Ultimate DevOps Geek Quiz Challenge, sponsored by DevOps Summit. Battle through the set of tough questions created by industry thought leaders to earn your bragging rights and win some cool prizes.
SYS-CON Events announced today that SoftNet Solutions will exhibit at the 19th International Cloud Expo, which will take place on November 1–3, 2016, at the Santa Clara Convention Center in Santa Clara, CA. SoftNet Solutions specializes in Enterprise Solutions for Hadoop and Big Data. It offers customers the most open, robust, and value-conscious portfolio of solutions, services, and tools for the shortest route to success with Big Data. The unique differentiator is the ability to architect and ...