SDN Journal: Blog Feed Post

SDN's Eventually Consistent Network Problem

Clustering controllers to address scalability concerns introduces a well-understood problem: consistency

One of the benefits of SDN is centralized control. That is, there is a single repository containing the known current state of the entire network. It is this centralization that enables intelligent application of new policies to govern and control the network - from new routes to user experience services like QoS. Because there is a single entity with visibility into the state of the network as a whole, it can examine the topology at any given point and determine where a given packet should be routed, how it is prioritized and even whether or not it is allowed to traverse the network.

It's a pretty powerful concept for networks, which traditionally distribute network state as individual configuration files across the data path.

[Figure: network-state-traditional]

Most of the focus of SDN is on the replacement of manual and scripted configuration methods with an API-driven mechanism. Whether that's OpenFlow or OpFlex or some other protocol is not really important; the benefit of operationalization is a consistent interface from the perspective of the operator, not the device.

[Figure: network-state-sdn]

This is a real benefit; operationalization across operations and dev has proven to produce tangible benefits in the form of improved time to market and a reduction in errors. By centralizing network state in a controller, this model provides a comprehensive view of the network at any given moment. Because the controller is not just a repository but an active participant in the flow of data across the network, this visibility enables the controller to understand how to (ostensibly) non-disruptively change routes or apply new policies in real-time.

The benefit itself is not in question. What is in question is what happens when the controller of this new software-defined architecture becomes overwhelmed, and how to preserve that benefit when the centralized model must decentralize in order to scale.

The Eventually Consistent Problem Comes to the Network

Eventual consistency is nothing new. It has always been an issue when scaling applications, particularly those that rely on shared data. Consider Amazon, if you will. If you and I are both shopping for the same thing, and I order before you, it may take seconds or more before the database is updated. If you were in the middle of ordering at the same time, you and I may be contending for the same item. Because my order takes a moment or two to propagate through the system, your view of the database (the availability of the item) is inconsistent with mine.
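The shopping scenario above can be sketched in a few lines. This is only an illustration under simple assumptions: two inventory replicas, a single item in stock, and updates that sit in an inbox until they are applied. The names `Replica`, `replicate`, `east` and `west` are hypothetical and do not describe how any real retailer replicates its data.

```python
class Replica:
    """One copy of the inventory; updates from peers arrive later."""

    def __init__(self, stock):
        self.stock = dict(stock)
        self.inbox = []  # updates received but not yet applied

    def order(self, item):
        """Accept an order if the local view shows the item in stock."""
        if self.stock.get(item, 0) > 0:
            self.stock[item] -= 1
            return True
        return False

    def apply_pending(self):
        """Apply every queued update from peers to the local view."""
        for item, delta in self.inbox:
            self.stock[item] = self.stock.get(item, 0) + delta
        self.inbox.clear()


def replicate(item, dst):
    """Queue a 'one unit sold' update for another replica (hypothetical helper)."""
    dst.inbox.append((item, -1))


east = Replica({"book": 1})
west = Replica({"book": 1})

mine = east.order("book")    # my order lands on one replica
replicate("book", west)      # the update is in flight to the other
yours = west.order("book")   # your order arrives before it is applied
west.apply_pending()         # the views finally converge

print(mine, yours, west.stock["book"])
```

Both orders are accepted because each replica answered from its own view, and the count only goes negative once the delayed update is applied. The views do converge, but not before a decision was made on stale data.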

It is assumed that eventually our views will be consistent, and that this age-old problem of distributed computing simply must be accepted as unsolvable for now. Thus systems are designed with this principle in mind. Which means we end up back with Brewer's CAP Theorem staring us in the face and reminding us that a distributed system cannot guarantee consistency, availability and partition tolerance all at once, so we must design systems in such a way as to achieve eventual consistency.

At issue is the ability of a software controller to scale. The controller is, by design and necessity, part of the data path. That is both a blessing and a curse. It is this fact that makes real-time adaptation of network behavior possible, but it is also this fact that forces issues of scale and introduces the need for a distributed system, from which the problem of eventual consistency derives. That's because more than one system will be the "master" repository for a given portion of network state. Even if one controller is designated as master of the network universe and thus maintains the "official" state of the network, there are moments when a secondary (or tertiary) controller has modified that state and the master has not yet learned of it. In the moments before the two network states merge, there is the possibility that the first (master) controller will make a decision based on network state that is no longer valid. If Controller B, for example, removes a port from a VLAN and, before that change can propagate to the master, a packet destined for that port arrives in the fabric, Controller A will have no way to know that the port is no longer participating in the VLAN and will, as expected, tell the switch to route to that port.

The issue will be shortly resolved, assuming timely synchronization of network state across the cluster, but in the meantime performance (or availability) may be negatively impacted.
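The Controller A / Controller B scenario can be made concrete with a minimal sketch. Everything here is an assumption made for illustration: the `Controller` class, the version counter and the `sync_from` method are hypothetical and do not correspond to OpenFlow, OpFlex or any particular controller's replication mechanism.

```python
class Controller:
    """A controller holding its own copy of VLAN membership state."""

    def __init__(self, name, vlan_members):
        self.name = name
        self.version = 0  # bumped on every local change
        self.vlan_members = {v: set(p) for v, p in vlan_members.items()}

    def remove_port(self, vlan, port):
        """Apply a local change; peers do not see it until they sync."""
        self.vlan_members[vlan].discard(port)
        self.version += 1

    def forwarding_decision(self, vlan, port):
        """Decide based on this controller's current view only."""
        members = self.vlan_members.get(vlan, set())
        return "forward" if port in members else "drop"

    def sync_from(self, peer):
        """Adopt the peer's state if it is newer (simplified merge)."""
        if peer.version > self.version:
            self.vlan_members = {v: set(p) for v, p in peer.vlan_members.items()}
            self.version = peer.version


initial = {10: {1, 2, 3}}
a = Controller("A", initial)  # designated master
b = Controller("B", initial)

b.remove_port(10, 3)                    # B removes port 3 from VLAN 10
before = a.forwarding_decision(10, 3)   # A still believes port 3 is a member
a.sync_from(b)                          # state finally propagates
after = a.forwarding_decision(10, 3)

print(before, after)
```

Between the change on B and the sync on A, the master forwards to a port that is no longer in the VLAN. The length of that window is exactly the synchronization delay across the cluster.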

[Figure: clustered-sdn]

The problem with eventual consistency in the network is one of magnitude. Eventually consistent views of books in stock at Amazon have a very different impact than an eventually consistent view of the network underpinning today's applications and, ultimately, the business. We're not talking about losing out on a book; we're talking about potentially disrupting hundreds or thousands of applications, which translates into hundreds of thousands or even millions of dollars. Ponemon's 2013 Cost of Data Center Outages bears this out: "The average reported outage incident length was 86 minutes, resulting in average cost per incident of about $690,200."

Eventual consistency of the network may turn out to be quite costly.

Common Themes: Reliability and Control

This is not a new problem. This issue of stateful failover as applied to scalability of both infrastructure and applications is one that application delivery has been dealing with, well, for over a decade now. The issue when dealing with distributed state is always one of replication and synchronization between those devices providing for reliability. That doesn't change just because we move from one form factor to another, or from on-premises to cloud. The issue remains: how do we maintain an authoritative view of the state of an application or network while still enabling the scale necessary to meet demand?

While we (as in the industry "we") recognize that true stateful reliability - and thus perfect consistency - is currently unachievable due to the constraints of distributed system design, we also recognize that we can get pretty darn close. From an application perspective, the intelligence embedded in a service fabric is more than able to deal with the problem with minimal introduction of latency. That is, there will be a slight pause and some disruption when failure or disruption occurs in the network but if the service fabric is smart enough, the disruption is experienced by the end user as no more than a slight hiccup - likely unnoticeable.

But the further down the stack you go, toward core network function, the more disruptive such a hiccup is going to be.

That's one of the reasons a "centralized control, decentralized execution" architecture makes more sense from a network perspective. Such a model maintains authoritative control over the state of the network, but empowers the individual components in the various fabrics (stateless L2-4 and stateful L4-7) that make up "the network" to maintain their own prescriptive configuration and take action when necessary based on the abstracted policies of the network as a whole.
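One way to picture "centralized control, decentralized execution" is a controller that owns the policy and pushes it out, while each fabric component enforces the last policy it received without asking the controller about every packet. The sketch below assumes a deliberately tiny policy (a set of blocked ports); `Policy`, `CentralController` and `FabricAgent` are hypothetical names, not the API of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    version: int
    blocked_ports: frozenset


class CentralController:
    """Authoritative source of policy; not consulted per packet."""

    def __init__(self):
        self.policy = Policy(version=1, blocked_ports=frozenset())
        self.agents = []

    def register(self, agent):
        self.agents.append(agent)
        agent.install(self.policy)

    def publish(self, blocked_ports):
        """Create a new policy version and push it to every agent."""
        self.policy = Policy(self.policy.version + 1, frozenset(blocked_ports))
        for agent in self.agents:
            agent.install(self.policy)


class FabricAgent:
    """Local component that enforces the last policy it received."""

    def __init__(self, name):
        self.name = name
        self.policy = None

    def install(self, policy):
        # Ignore anything older than what is already installed.
        if self.policy is None or policy.version > self.policy.version:
            self.policy = policy

    def handle(self, dst_port):
        """Decide locally, with no round trip to the controller."""
        return "drop" if dst_port in self.policy.blocked_ports else "forward"


controller = CentralController()
edge = FabricAgent("edge-1")
controller.register(edge)

first = edge.handle(22)    # nothing is blocked yet
controller.publish({22})   # central policy change
second = edge.handle(22)   # enforced locally from the cached policy

print(first, second, edge.policy.version)
```

The controller stays authoritative for what the policy is, but the per-packet decision never leaves the agent, so a slow or briefly unreachable controller delays policy changes rather than traffic.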

Everyone likes to posit an answer to what will be the "killer app" for SDN. But before we can worry about that, we might want to consider what may be the "showstopper" obstacles for SDN. Eventual consistency when scaling controllers is one of those issues.

Because without a reliable and consistent network world, there is no application world. Or at least not one that users will be excited to rely on.

More Stories By Lori MacVittie

Lori MacVittie is responsible for education and evangelism of application services available across F5’s entire product suite. Her role includes authorship of technical materials and participation in a number of community-based forums and industry standards organizations, among other efforts. MacVittie has extensive programming experience as an application architect, as well as network and systems development and administration expertise. Prior to joining F5, MacVittie was an award-winning Senior Technology Editor at Network Computing Magazine, where she conducted product research and evaluation focused on integration with application and network architectures, and authored articles on a variety of topics aimed at IT professionals. Her most recent area of focus included SOA-related products and architectures. She holds a B.S. in Information and Computing Science from the University of Wisconsin at Green Bay, and an M.S. in Computer Science from Nova Southeastern University.
