Training Wheels and Protective Gear By @PlexxiInc | @CloudExpo [#SDN]

This balancing act is part of what has made networking as complex as it has become

Throughout the development cycle of new features and functions for any network platform (or probably most other products not targeted at the mass-market consumer), one question always comes up: should we protect the user of our product from doing this? And “this” is always something that would allow the user to really mess things up if not done right. As a product management organization you almost have to take a philosophical stand when it comes to these questions.

Protect the user
Sure enough, the question came up last week as part of the development of one of our features. When putting the finishing touches on a feature that allows very direct control over some of the fundamental portions of what creates a Plexxi fabric, our QA team (very appropriately) raised the concern: if the user does this, bad things can happen. Should we not allow the user to change this portion of the feature?

This balancing act is part of what has made networking as complex as it has become. As an industry we have been extremely flexible in what we have exposed to our users. We have given access to portions of our products that 99.9% of customers will never need, but unfortunately, because of that 0.1%, every networking product has tons of these little tweaks and knobs that could wreak havoc if used the wrong way.

We take a lot of pride in creating a network solution that is simple to use, simple to interact with, but extremely powerful under the hood. Direct access to all that power gives the customer not only a powerful weapon, but also the ammunition to use it. And like handling any weapon, you can really hurt yourself if you are not careful. Which comes back to the question at hand: how many safety valves do you put in place to make sure users cannot hurt themselves?

The reason why
Some of these controls are buried fairly deep inside our products. They are meant for true power users and for the support teams of the vendors. And even beyond the support teams, there are tools and tricks inside our products that only the engineering teams know about. Several years ago (in a previous job), we had a customer with a complex problem. Traffic was inconsistently forwarded, and the belief was that communication problems between the line cards and the main CPU card were creating inconsistent tables (the biggest challenge for any chassis-based system).

Of course our development teams had tools embedded in the code to carefully examine and manipulate the tables and communications between these cards. They were not exposed to a regular user, because they were potentially dangerous. And we proved that they were. While running one of these commands, one of my developers made a small typo in one of the arguments and boom went the switch. Crash and reboot. Customer very upset (for good reason, this was a production network), executive management very upset (also for good reason) and, worse, the problem disappeared without us collecting the information we needed to attempt to fix it.

Different Answers for Different Tools
There is a difference between debug tools that allow engineers to look deep inside the switch and common features that may have significant service consequences if not in expert hands. No matter how hard we try, the first category will continue to exist. As vendors we will bring portions of these tools into the user- or support-visible spectrum, but at the same time we will create new ones buried deep.

The latter category, though, is one where I favor a less protective approach. There are many ways by which you can completely disrupt your network service. Most of the services your network provides have been created with your own hands through provisioning and configuration and can therefore be disrupted by those same actions. When we create features and functions that are potentially dangerous, it is on us, the vendor, to make sure they are properly documented and explained. This way, when you do make that mistake (and it will happen), we can refer to that four-letter “read the documentation” response.

Off come the training wheels
When it comes to user-configurable features and functions, every single one of them has the potential to disrupt service when used the wrong way. We as vendors should not shy away from giving you all the tools you need to create (and destroy) the service you need. And I do not believe anyone wants to step through one “Are you sure (Y/N)?” after another. Of course we need to make creating services easier for you. If you are a frequent reader of our blogs you know that is what we stand for. But we should not take away features because we are afraid you can shoot yourself in the foot. Any time in the past where we opted to give you a gun but keep the bullets behind a locked door, we have found someone who legitimately explained that he or she needed the bullets to solve a specific problem. And we unlocked the door.

There are ways to teach someone how to ride a bike without providing permanent training wheels. Documentation (for those few who read it), workflow-based provisioning and configuration, and solid default behaviors with predictable results can steer you clear of the dangers we have provided. And when you do fall off the bike and hurt your knee or elbow, well, you are less likely to try that maneuver again next time. That is how most of us learn, including those developers who crash a customer production switch during a debug session. For every one of those “oops” moments there will be many where those hidden gems may have saved your network from disaster. Just like there is one customer for whom having the bullets makes the difference between a working service and one that just limps along.

[Today's fun fact: You burn more calories sleeping than watching TV. I enjoy combining the two, especially during some of the last few Thursday night NFL games.]

The post Training Wheels and Protective Gear appeared first on Plexxi.


More Stories By Marten Terpstra

Marten Terpstra is a Product Management Director at Plexxi Inc. Marten has extensive knowledge of the architecture, design, deployment and management of enterprise and carrier networks.
