Click here to close now.

Welcome!

SDN Journal Authors: Pat Romanski, Carmen Gonzalez, Liz McMillan, Elizabeth White, Yeshim Deniz

Related Topics: SDN Journal, Java, Microservices Journal, Linux, Virtualization

SDN Journal: Blog Feed Post

SDN's Eventually Consistent Network Problem

Clustering controllers to address scalability concerns introduces a well-understood problem: consistency

One of the benefits of SDN is centralized control. That is, there is a single repository containing the known current state of the entire network. It is this centralization that enables intelligent application of new policies to govern and control the network - from new routes to user experience services like QoS. Because there is a single entity which has visibility into the state of the network as a whole, it can examine the topology at any given point and make determinations as to where this packet and that should be routed, how it is prioritized and even whether or not it is allowed to traverse the network.

It's a pretty powerful concept for networks, which traditionally distribute network state as individual configuration files across the data path.

network-state-traditional

Most of the focus of SDN is on the replacement of manual and scripted configuration methods with an API-driven mechanism. Whether that's OpenFlow or OpFlex or some other protocol is not really important as the benefit of operationalization is to provide a consistent interface from the perspective of the operator, not the device.

network-state-sdn

This is a real benefit; operationalization across operations and dev has proven to produce tangible benefits in the form of improved time to market and a reduction in errors. By centralizing network state in a controller, this model provides a comprehensive view of the network at any given moment. Because the controller is not just a repository but an active participant in the flow of data across the network, this visibility enables the controller to understand how to (ostensibly) non-disruptively change routes or apply new policies in real-time.

The benefit itself is not in question. What is in question is what happens when the controller of this new software-defined architecture becomes overwhelmed, and how to preserve that benefit when the centralized model must decentralize in order to scale.

The Eventually Consistent Problem Comes to the Network

Eventual consistency is nothing new. It has always been an issue when scaling applications, particularly those that rely on shared data. Consider Amazon, if you will. If you and I are both shopping for the same thing, and I order before you, it may take seconds or more before the database is updated. If you were in the middle of ordering at the same time, you and I may be contending for the same item. Because my order takes a moment or two to propagate through the system, your view of the database (the availability of the item) is inconsistent with mine.

It is assumed that eventually our views will be consistent, and that this age old unsolved problem of distributed computing simply must be accepted as unsolvable for now,  Thus systems are designed with this principle in mind. Which means we end up back with Brewer's CAP Theorem staring us in the face and reminding us we can't be perfectly consistent in a distributed system, so we must deal with systems in such a way as to achieve eventually consistency.

At issue is the ability of a software controller to scale. The controller is, by design and necessity, part of the data path. That is both a blessing and a curse. It is from this fact that the real-time adaption of network behavior can be achieved, but it is also this fact which forces issues of scale and introduces the need for a distributed system from which the problem of eventual consistency derives. That's because more than one system will be the "master" repository for a given portion of network state. Even if one controller is designated as master of the network universe and thus maintains the "official" state of the network, there are those moments when the secondary (or tertiary) controller has modified the "official" state and introduces inconsistency. In the moments between when the two network states merge, there is the possibility that the first (master) controller will also try to make a decision based on information that relies on network state that is no longer valid. If Controller B, for example, removes a port from a VLAN, and before that state can propagate to the master, a packet arrives in the fabric, destined for that port, Controller A will have no way to know that it is no longer participating in the VLAN and will, as expected, tell the switch to route to that port.

The issue will be shortly resolved, assuming timely synchronization of network state across the cluster, but in the meantime performance (or availability) may be negatively impacted.

clustered-sdn

The problem with eventual consistency in the network is one of magnitude. Eventually consistent views of books in stock at Amazon has a very different impact than an eventually consistent view of the network underpinning today's applications and ultimately the business. We're not talking about losing out on a book, we're talking about potentially disrupting hundreds or thousands of applications that translates into hundreds of thousands or even millions of dollars. Ponemon's 2013 Cost of Data Center Outages proves this case out: "The average reported outage incident length was 86 minutes, resulting in average cost per incident of about $690,200."

Eventual consistency of the network may turn out to be quite costly.

Common Themes: Reliability and Control

This is not a new problem. This issue of stateful failover as applied to scalability of both infrastructure and applications is one that application delivery has been dealing with, well, for over a decade now. The issue when dealing with distributed state is always one of replication and synchronization between those devices providing for reliability. That doesn't change just because we move from one form factor to another, or from on-premise to cloud. The issue remains: how do we maintain an authoritative view of the state of an <application or network> while still enabling the scale necessary to meet demand?

While we (as in the industry "we") recognize that true stateful reliability - and thus perfect consistency - is currently unachievable due to the constraints of distributed system design, we also recognize that we can get pretty darn close. From an application perspective, the intelligence embedded in a service fabric is more than able to deal with the problem with minimal introduction of latency. That is, there will be a slight pause and some disruption when failure or disruption occurs in the network but if the service fabric is smart enough, the disruption is experienced by the end user as no more than a slight hiccup - likely unnoticeable.

But the further down the stack you go, toward core network function, the more disruptive such a hiccup is going to be.

That's one of the reasons a "centralized control, decentralized execution" architecture makes more sense from a network perspective. Such a model maintains authoritative control over the state of the network, but empowers individual components in the various fabrics (stateless L2-4 and stateful L4-7) that make up "the network" to maintain its own prescriptive configuration and take action when necessary based on the abstracted policies of the network as a whole.

Everyone likes to posit an answer to what will be the "killer app" for SDN. But before we can worry about that, we might want to consider what may be the "showstopper" obstacles for SDN. Eventual consistency when scaling controllers is one of those issues.

Because without a reliable and consistent network world, there is no application world. Or at least not one that users will be excited to rely on.

Read the original blog entry...

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.

@CloudExpo Stories
Even though it’s now Microservices Journal, long-time fans of SOA World Magazine can take comfort in the fact that the URL – soa.sys-con.com – remains unchanged. And that’s no mistake, as microservices are really nothing more than a new and improved take on the Service-Oriented Architecture (SOA) best practices we struggled to hammer out over the last decade. Skeptics, however, might say that this change is nothing more than an exercise in buzzword-hopping. SOA is passé, and now that people are ...
The Domain Name Service (DNS) is one of the most important components in networking infrastructure, enabling users and services to access applications by translating URLs (names) into IP addresses (numbers). Because every icon and URL and all embedded content on a website requires a DNS lookup loading complex sites necessitates hundreds of DNS queries. In addition, as more internet-enabled ‘Things' get connected, people will rely on DNS to name and find their fridges, toasters and toilets. Acco...
Gartner predicts that the bulk of new IT spending by 2016 will be for cloud platforms and applications and that nearly half of large enterprises will have cloud deployments by the end of 2017. The benefits of the cloud may be clear for applications that can tolerate brief periods of downtime, but for critical applications like SQL Server, Oracle and SAP, companies need a strategy for HA and DR protection. While traditional SAN-based clusters are not possible in these environments, SANless cluste...
You often hear the two titles of "DevOps" and "Immutable Infrastructure" used independently. In his session at DevOps Summit, John Willis, Technical Evangelist for Docker, will cover the union between the two topics and why this is important. He will cover an overview of Immutable Infrastructure then show how an Immutable Continuous Delivery pipeline can be applied as a best practice for "DevOps." He will end the session with some interesting case study examples.
SYS-CON Media named Andi Mann editor of DevOps Journal. DevOps Journal is focused on this critical enterprise IT topic in the world of cloud computing. DevOps Journal brings valuable information to DevOps professionals who are transforming the way enterprise IT is done. Andi Mann, Vice President, Strategic Solutions, at CA Technologies, is an accomplished digital business executive with extensive global expertise as a strategist, technologist, innovator, marketer, communicator, and thought lea...
There is little doubt that Big Data solutions will have an increasing role in the Enterprise IT mainstream over time. 8th International Big Data Expo, co-located with 17th International Cloud Expo - to be held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA - has announced its Call for Papers is open. As advanced data storage, access and analytics technologies aimed at handling high-volume and/or fast moving data all move center stage, aided by the cloud computing bo...
Enterprises are fast realizing the importance of integrating SaaS/Cloud applications, API and on-premises data and processes, to unleash hidden value. This webinar explores how managers can use a Microservice-centric approach to aggressively tackle the unexpected new integration challenges posed by proliferation of cloud, mobile, social and big data projects. Industry analyst and SOA expert Jason Bloomberg will strip away the hype from microservices, and clearly identify their advantages and d...
The time is ripe for high speed resilient software defined storage solutions with unlimited scalability. ISS has been working with the leading open source projects and developed a commercial high performance solution that is able to grow forever without performance limitations. In his session at DevOps Summit, Alex Gorbachev, President of Intelligent Systems Services Inc., will share foundation principles of Ceph architecture, as well as the design to deliver this storage to traditional SAN st...
The 5th International DevOps Summit, co-located with 17th International Cloud Expo – being held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA – announces that its Call for Papers is open. Born out of proven success in agile development, cloud computing, and process automation, DevOps is a macro trend you cannot afford to miss. From showcase success stories from early adopters and web-scale businesses, DevOps is expanding to organizations of all sizes, including the...
The Internet of Things promises to transform businesses (and lives), but navigating the business and technical path to success can be difficult to understand. In his session at @ThingsExpo, Sean Lorenz, Technical Product Manager for Xively at LogMeIn, demonstrated how to approach creating broadly successful connected customer solutions using real world business transformation studies including New England BioLabs and more.
Since 2008 and for the first time in history, more than half of humans live in urban areas, urging cities to become “smart.” Today, cities can leverage the wide availability of smartphones combined with new technologies such as Beacons or NFC to connect their urban furniture and environment to create citizen-first services that improve transportation, way-finding and information delivery. In her session at @ThingsExpo, Laetitia Gazel-Anthoine, CEO of Connecthings, will focus on successful use c...
In their general session at 16th Cloud Expo, Michael Piccininni, Global Account Manager – Cloud SP at EMC Corporation, and Mike Dietze, Regional Director at Windstream Hosted Solutions, will review next generation cloud services, including the Windstream-EMC Tier Storage solutions, and discuss how to increase efficiencies, improve service delivery and enhance corporate cloud solution development. Speaker Bios Michael Piccininni is Global Account Manager – Cloud SP at EMC Corporation. He has b...
There are 182 billion emails sent every day, generating a lot of data about how recipients and ISPs respond. Many marketers take a more-is-better approach to stats, preferring to have the ability to slice and dice their email lists based numerous arbitrary stats. However, fundamentally what really matters is whether or not sending an email to a particular recipient will generate value. Data Scientists can design high-level insights such as engagement prediction models and content clusters that a...
Big Data is amazing, it's life changing and yes it is changing how we see our world. Big Data, however, can sometimes be too big. Organizations that are not amassing massive amounts of information and feeding into their decision buckets, smaller data that feeds in from customer buying patterns, buying decisions and buying influences can be more useful when used in the right way. In their session at Big Data Expo, Ermanno Bonifazi, CEO & Founder of Solgenia, and Ian Khan, Global Strategic Positi...
DevOps Summit, taking place Nov 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 17th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The widespread success of cloud computing is driving the DevOps revolution in enterprise IT. Now as never before, development teams must communicate and collaborate in a dynamic, 24/7/365 environment. There is no time to wait for long developmen...
It's time to face reality: "Americans are from Mars, Europeans are from Venus," and in today's increasingly connected world, understanding "inter-planetary" alignments and deviations is mission-critical for cloud. In her session at 15th Cloud Expo, Evelyn de Souza, Data Privacy and Compliance Strategy Leader at Cisco Systems, discussed cultural expectations of privacy based on new research across these elements
Today’s enterprise is being driven by disruptive competitive and human capital requirements to provide enterprise application access through not only desktops, but also mobile devices. To retrofit existing programs across all these devices using traditional programming methods is very costly and time consuming – often prohibitively so. In his session at @ThingsExpo, Jesse Shiah, CEO, President, and Co-Founder of AgilePoint Inc., discussed how you can create applications that run on all mobile ...
"We have developers who are really passionate about getting their code out to customers, no matter what, in the shortest possible time. Operations are very focused on procedures and policies," explained Stan Klimoff, CTO of Qubell, in this SYS-CON.tv interview at DevOps Summit, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Agility is top of mind for Cloud/Service providers and Enterprises alike. Policy Driven Data Center provides a policy model for application deployment by decoupling application needs from the underlying infrastructure primitives. In his session at 15th Cloud Expo, David Klebanov, a Technical Solutions Architect with Cisco Systems, discussed how it differentiates from the software-defined top-down control by offering a declarative approach to allow faster and simpler application deployment. Davi...
Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.