Network Engineers, Pay Attention to Big Data

By @MartenT1999

You have probably realized we are having a Big Data kind of week here at the Plexxi blog. And for good reason. The amount of development and change in this big bucket of applications we conveniently label “Big Data” is astonishing.

Walking around at Hadoopworld in New York last week, I initially felt somewhat lost as a “networking guy”. But that feeling of “not belonging” is only superficial: the network has a tremendously important role in these applications. The challenge is that many “networking” folks don’t quite understand or realize that yet. Contrary to what I believed not too long ago, though, Big Data Application folks have a pretty good understanding of the role of the network in their overall application and its performance.

As an industry we have been talking about the increase in east-west traffic for quite a few years now. For your typical datacenter infrastructure today this is based on loosely coupled applications and semi-distributed storage. A web-based application has many components that together make up the application we see as users. There are application load balancers, web server front ends, and application back ends that in turn have databases for their data storage. Those databases may have local or, more likely, centralized or semi-distributed physical storage. Then these storage systems have replication and backup components. All of these interactions we have traditionally labeled east-west: this is all traffic inside the datacenter required to pass the appropriate data back to the application user, whether that is a person or another application.

The communication patterns in more traditional distributed applications like these are fairly straightforward to understand. Some basic measurements and profiling should give you a pretty decent view of how each of the components of the application behaves, how they interact and what the network requirements are between them. The application developers may not necessarily be able to provide you with exact needs and guidance before a deployment, but the applications, once they have gone through at least one scaling and performance adjustment cycle, will typically fall into a specific pattern that will be fairly consistent for the life of the application. And the job of the network engineer is to ensure that the network provides appropriate connectivity for these communication patterns.
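As a rough illustration of what that basic profiling can look like, here is a minimal Python sketch that sums traffic per pair of application components. The flow records and component names are made up for the example; in practice they would come from whatever flow export or counters your network already provides.

```python
from collections import defaultdict

# Hypothetical flow records: (source component, destination component, bytes).
# These values are invented for illustration only.
flows = [
    ("load-balancer", "web", 1_200_000),
    ("web", "app", 800_000),
    ("app", "database", 2_500_000),
    ("web", "app", 700_000),
    ("database", "storage", 4_000_000),
    ("app", "database", 1_500_000),
]


def traffic_matrix(records):
    """Sum bytes for each (source, destination) component pair."""
    matrix = defaultdict(int)
    for src, dst, nbytes in records:
        matrix[(src, dst)] += nbytes
    return dict(matrix)


matrix = traffic_matrix(flows)

# Print the heaviest component pairs first.
for (src, dst), total in sorted(matrix.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{src} -> {dst}: {total / 1e6:.1f} MB")
```

For a traditional application, a table like this tends to look much the same from one week to the next, which is exactly why a fairly static network design has been good enough for it.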

Big Data applications bring these east-west concepts to new levels. They are designed to run in a parallel or distributed system. They depend on moving extremely large amounts of data through the compute infrastructure. They are built with the assumption that data and computation are continually distributed and replicated across members of a big data cluster. Many of these applications are built to tackle a multitude of different data analysis jobs, each of them different in its data set and its data reduction behaviors, and therefore different in what it wants or needs from the network. For that, you need a much more dynamic network than the ones you have built in the past.

Many Big Data deployments today are built on top of 1GbE networks. It is easy to conclude from this that the network is not an issue, and that is probably the biggest mistake you can make. It is easy to think of Big Data projects as compute-intensive analysis and reduction of extremely large amounts of data. In reality, many big data applications are based on semi-real-time streaming data. Each piece of data may require only a fairly small amount of computation, but the sheer amount of data requires new levels of connectivity we may not be used to.

Last week at Hadoopworld I had an interesting chat with someone who worked for an Ad Tech company. Ad Tech is a fast-growing sub-industry in marketing and advertising, focused on digital advertising and marketing. This gentleman asked us about some of the performance characteristics of a Plexxi network. When we asked him for some more details of his deployment, he explained that he manages a big data cluster of about 200 servers and that with the switches he uses today (from one of our competitors and certainly at the top end of expected performance), he can only populate about half of each switch’s available ports before congestion becomes an issue. His cluster fairly consistently pushes 700 Gbit to 1 Tbit per second across the racks. There are very few network infrastructures out there that are specifically designed and built to support those types of applications.
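To put those numbers in perspective, here is a quick back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the cross-rack traffic is spread evenly over all 200 servers, which a real cluster almost certainly does not do. The function name is mine and not anything from his deployment.

```python
def per_server_gbps(aggregate_gbps: float, servers: int) -> float:
    """Average cross-rack traffic per server, assuming an even spread."""
    return aggregate_gbps / servers


SERVERS = 200  # cluster size quoted in the conversation

# Low and high ends of the quoted aggregate: 700 Gbit/s and 1 Tbit/s.
low = per_server_gbps(700.0, SERVERS)
high = per_server_gbps(1000.0, SERVERS)

print(f"Average per server: {low:.1f} to {high:.1f} Gbit/s")
```

Even under that friendly even-spread assumption, each server averages 3.5 to 5 Gbit/s of cross-rack traffic, several times what a single 1GbE link can carry.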

(I can already hear someone say: “why wouldn’t he use a big chassis-based switch in the middle of the network for 200 devices?” If as a network industry we believe our answer is to create ever larger centralized switches, then we have not learned from all the industries around us in that same datacenter.)

The answer is also not the often-heard “just throw more bandwidth at the problem”. This Ad Tech company is a prime example of how we need to evolve our thinking in how we support these new applications. We have to stop pretending we can support new application infrastructures, with new demands, needs and requirements, with the same network we have been building for years. The applications are evolving. Servers, storage and how they are being used are evolving. Network engineers need to dive into this whole new world of Big Data Applications. It’s scary: there are many new acronyms and names you will not recognize. If you thought the network world was creative in its naming of things, the application folks have us beat, hands down.

Don’t be afraid of these new applications. They are coming whether you like it or not. Embrace them, understand them as best you can. Then sit back and think about what the network can do for them. You have an ability to significantly impact their ability to perform. But you may have to put traditional thinking aside and step out of the box that has provided so much comfort for so many years. It will be worth it.

 

[Today's fun fact: The first penny had the motto "Mind your business." Can we bring those pennies back please?]

The post Network Engineers, Pay Attention to Big Data appeared first on Plexxi.

More Stories By Marten Terpstra

Marten Terpstra is a Product Management Director at Plexxi Inc. Marten has extensive knowledge of the architecture, design, deployment and management of enterprise and carrier networks.
