Pay Attention to Big Data By @MartenT1999 | @BigDataExpo [#BigData]

Network Engineers, Pay Attention to Big Data

You have probably realized we are having a Big Data kind of week here at the Plexxi blog. And for good reason. The amount of development and change in this big bucket of applications we conveniently label “Big Data” is astonishing.

Walking around at Hadoopworld in New York last week, I initially felt somewhat lost as a “networking guy”. But that feeling of “not belonging” is only superficial: the network has a tremendously important role in these applications. The challenge is that many “networking” folks don’t quite understand or realize that yet. Contrary to what I believed not too long ago, though, Big Data Application folks have a pretty good understanding of the role of the network in their overall application and its performance.

As an industry we have been talking about the increase in east-west traffic for quite a few years now. For your typical datacenter infrastructure today, this traffic comes from loosely coupled applications and semi-distributed storage. A web-based application has many components that together make up the application we see as users. There are application load balancers, web server front ends, and application back ends that in turn have databases for their data storage. Those databases may have local or, more likely, centralized or semi-distributed physical storage. These storage systems in turn have replication and backup components. We have traditionally labeled all of these interactions east-west: this is all the traffic inside the datacenter required to pass the appropriate data back to the application user, whether that user is a person or another application.

The communication patterns in more traditional distributed applications like these are fairly straightforward to understand. Some basic measurements and profiling should give you a pretty decent view of how each of the components of the application behaves, how they interact and what the network requirements are between them. The application developers may not necessarily be able to provide you with exact needs and guidance before a deployment, but the applications, once they have gone through at least one scaling and performance adjustment cycle, will typically fall into a specific pattern that will be fairly consistent for the life of the application. And the job of the network engineer is to ensure that the network provides appropriate connectivity for these communication patterns.
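As a rough illustration of that kind of profiling, the sketch below sums observed bytes per pair of application tiers into a simple traffic matrix. The flow samples and tier names are made up for the example; in practice they would come from whatever flow export or interface counters you have available.

```python
from collections import defaultdict

# Hypothetical flow samples: (source tier, destination tier, bytes observed).
# These values are invented for illustration, not measurements.
FLOWS = [
    ("load-balancer", "web", 4_000),
    ("web", "app", 12_000),
    ("app", "database", 30_000),
    ("web", "app", 8_000),
    ("database", "storage", 50_000),
    ("app", "database", 10_000),
]


def traffic_matrix(flows):
    """Sum observed bytes for each (source, destination) pair."""
    matrix = defaultdict(int)
    for src, dst, nbytes in flows:
        matrix[(src, dst)] += nbytes
    return dict(matrix)


matrix = traffic_matrix(FLOWS)

# List the heaviest tier-to-tier conversations first.
for (src, dst), nbytes in sorted(matrix.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{src} -> {dst}: {nbytes} bytes")
```

For a traditional application, a matrix like this tends to look much the same from one measurement to the next, which is what makes it a useful basis for network design.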

Big Data applications bring these east-west concepts to new levels. They are designed to run in a parallel or distributed system. They depend on moving extremely large amounts of data through the compute infrastructure. They are built with the assumption that data and computation are continually distributed and replicated across members of a big data cluster. Many of these applications are built to tackle a multitude of different data analysis jobs, each of them different in its data set and its data reduction behaviors, and therefore different in what it wants or needs from the network. For that, you need a much more dynamic network than the ones you have built in the past.

Many Big Data deployments today are built on top of 1GbE networks. It is easy to conclude from this that the network is not an issue, and that is probably the biggest mistake you can make. It is easy to think of Big Data projects as compute-intensive analysis and reduction of extremely large amounts of data. In reality, many big data applications are based on semi-real-time streaming data. Each piece of data may require only a fairly small amount of computation, but the sheer amount of data requires levels of connectivity we may not be used to.

Last week at Hadoopworld I had an interesting chat with someone who worked for an Ad Tech company. Ad Tech is a fast-growing sub-industry in marketing and advertising, focused on digital advertising and marketing. This gentleman asked us about some of the performance characteristics of a Plexxi network. When we asked him for some more details of his deployment, he explained that he manages a big data cluster of about 200 servers and that with the switches he uses today (from one of our competitors and certainly at the top end of expected performance), he can only populate about half of each switch’s available ports before congestion becomes an issue. His cluster fairly consistently pushes 700 Gbit to 1 Tbit per second across the racks. There are very few network infrastructures out there that are specifically designed and built to support those types of applications.
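To put those figures in perspective, here is a quick back-of-the-envelope calculation. It simply divides the quoted cross-rack traffic by the quoted server count. The even spread across servers is an assumption made purely for illustration; a real cluster will be far more uneven.

```python
# Back-of-the-envelope: average cross-rack traffic per server.
# The totals and server count are the figures quoted in the conversation;
# spreading traffic evenly across servers is an illustrative assumption.

SERVERS = 200
LOW_GBIT = 700     # lower end of the quoted cross-rack traffic, in Gbit/s
HIGH_GBIT = 1000   # upper end (1 Tbit/s), in Gbit/s


def per_server_gbit(total_gbit: float, servers: int) -> float:
    """Average traffic per server in Gbit/s, assuming an even spread."""
    return total_gbit / servers


low = per_server_gbit(LOW_GBIT, SERVERS)
high = per_server_gbit(HIGH_GBIT, SERVERS)
print(f"{low:.1f} to {high:.1f} Gbit/s per server on average")
```

Even as an average, that works out to 3.5 to 5 Gbit/s of cross-rack traffic per server, several times what a single 1GbE link can carry.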

(I can already hear someone say: “Why wouldn’t he use a big chassis-based switch in the middle of the network for 200 devices?” If, as a network industry, we believe our answer is to create ever larger centralized switches, then we have not learned from all the industries around us in that same datacenter.)

The answer is also not the often-heard “just throw more bandwidth at the problem”. This Ad Tech company is a prime example of how we need to evolve our thinking about how we support these new applications. We have to stop pretending that we can support new application infrastructures, with new demands, needs and requirements, using the same network we have been building for years. The applications are evolving. Servers, storage and how they are being used are evolving. Network engineers need to dive into this whole new world of Big Data Applications. It’s scary: there are many new acronyms and names you will not recognize. If you thought the network world was creative in naming things, the application folks have us beat, hands down.

Don’t be afraid of these new applications. They are coming whether you like it or not. Embrace them and understand them as best you can. Then sit back and think about what the network can do for them. You can significantly affect how well they perform. But you may have to put traditional thinking aside and step out of the box that has provided so much comfort for so many years. It will be worth it.


[Today's fun fact: The first penny had the motto "Mind your business." Can we bring those pennies back please?]

The post Network Engineers, Pay Attention to Big Data appeared first on Plexxi.

More Stories By Marten Terpstra

Marten Terpstra is a Product Management Director at Plexxi Inc. Marten has extensive knowledge of the architecture, design, deployment and management of enterprise and carrier networks.
