Welcome!

SDN Journal Authors: Stefan Bernbo, Michel Courtoy, Amitabh Sinha, Mike Wood, Liz McMillan

Related Topics: SDN Journal, Java IoT, Containers Expo Blog, @CloudExpo, Cloud Security, @BigDataExpo

SDN Journal: Blog Post

Stateless Transport Tunneling (STT) Meets the Network

At a high level the concepts of larger packets, hardware offload, reduced CPU load and interrupts all make sense

Last week I walked through the packet formats for VXLAN and NVGRE specifically focused on ways by which the overlay packets provide information to the physical network that help the physical network. Some of the initial extreme thoughts that the overlay and physical network can and should be completely ignorant of each other have softened more recently and more pragmatic thoughts of collaborating layers are being articulated. At Plexxi we have often mentioned that we believe the physical network and the overlay need to be closely orchestrated to get the most benefit out of the total network solution. And orchestration != ECMP.

In addition to VXLAN and NVGRE, Stateless Transport Tunneling (STT) is an encapsulation mechanism used by VMware, mostly for communication between server based vSwitches. It is a bit more involved and complicated than VXLAN and NVGRE, mostly because it was designed to carry large data packets, up to 64 Kbytes. Physical networks have limitations on the size of a packet that can be transferred. Ethernet standard maximum transmission unit (MTU) used to be 1500 bytes, but most ethernet devices these days can support jumbo packets allowing packets of 4, 9 or even 16 Kbytes in size. Even at those sizes, large data transfers are somewhat hampered by the work involved in taking a large chunk of data and then chopping them up into smaller portions to be transmitted. In a response to this, hardware vendors have taken some of this functionality and added it to the Network Interface Cards (NICs) on servers and have them do most of this segmentation and re-assembly work based on how TCP takes large portions of data and chops them into smaller segments. Doing his in hardware means it can be done faster, but more importantly, it removes this burden from the server CPUs, allowing them to do other (more useful) work.

STT was designed to make use of these TCP capabilities in NICs. STT can take ethernet packets up to 64 Kbytes from a VM on a server, and tunnel it to its destination as a 64 Kbyte entity. This STT frame has to be chopped into smaller pieces to match the MTU of the physical network, but an STT packet looks just like a TCP segment to the receiving NICs, allowing them to reconstruct the original 64 Kbyte packet without needing the CPU.

When the sending tunnel endpoint receives a large chunk of data to be transmitted at another VM at the other side of a tunnel, the vSwitch takes several steps to encapsulate this packet. First, it adds an STT Frame Header to the packet.

STT Frame Format 1

The STT Header is 18 bytes in length and has a variety of administrative fields, but the key field is the Context ID. This is a 64 bit field and its intended use is similar to the VXLAN Network Identifier (VNI) or the NVGRE Virtual Subnet ID (VSID). While the semantics of this field are somewhat defined, its value and how to use it is left open in the latest specifications. Its main purpose is to provide the receiving tunnel endpoint the information it needs to determine where this packet needs to be sent after decapsulation.

After the STT Frame Header has been added, this new packet (original packet  + new STT header) is chopped into smaller pieces so that each piece is at least 62 bytes smaller than the MTU of the physical network. Each of these new segments receives 24 byte TCP like header, a normal 20 byte IP header, and of course the final 18 byte Ethernet header before transmission. The magic (or ugliness for those less enamored by STT) is in the TCP like header. These 24 bytes are formatted just like a normal TCP header to ensure the hardware in the NICs can re-assemble segments that belong together. The traditional Acknowledgement field in TCP is used as a fragment ID, essentially telling the NIC that all packets/segments that come in with the same fragment ID belong together and should be reassembled into the larger original ethernet frame. The traditional Sequence number is used as an offset indicator, to tell the NIC in what order the fragments need to be put together.

STT Frame Format 2

Similar to VXLAN and NVGRE described last week, STT has a mechanism to create entropy for the physical network to distinguish flows from each other and allow them to be balanced using ECMP (or link aggregation – LAG) based deployments. In STT, the TCP source port is used to create entropy. The originating tunnel end point will use some hash calculation on the original packets header information and use the result to populate the TCP source port. Switches in the physical network can now use the TCP port information from the tunneled packet in their hash calculation for ECMP or LAG packet distribution.

While STT is likely to be more efficient than either VXLAN or NVGRE for the transfer of large amount of information because it offloads the segmentation and re-assembly, it carries significantly more overhead than either VXLAN or NVGRE in additional header information for smaller packets. STT adds 80 bytes of new header to a VM originated ethernet packet for the first segment of this packet, 62 for each following segment. Compare that to a consistent 46 bytes for each NVGRE encapsulated packet, and 54 bytes for VXLAN. For traffic between VMs on the same server this may not matter, but it certainly does for traffic carried across the physical network. For the plentiful mice flows, we have likely doubled the size and bandwidth required for each.

A probably more significant drawback of STT comes from its strength. Designed for large packet transfers, once an original packet is encapsulated with STT header, chopped into parts, then encapsulated into individual ethernet, IP and TCP (like) headers, only the first packet provides any clue or context of the original source, destination, protocol, application and other content. The relevant pieces of that will only be found in the first segment, any follow up segments only provide enough information about the tunnel endpoints and no other original context without the first segment. And that makes debugging really hard. It also makes it hard to differentiate traffic on the physical network, even at a very high level Virtual Network identifier. And every existing network based service (realizing that one of the goals of overlay networks is to push this to the vSwitches themselves) will also have a hard time deciding what to do with these packets.

At a high level the concepts of larger packets, hardware offload, reduced CPU load and interrupts all make sense. But most data center ethernet networks can easily support 9k or even 16k packets, so perhaps the gap between 16k packet based transfer and 64k semi-stream based communication is really not that much considering that the bulk of packets are small to begin with (remember those mice and elephants?). Perhaps aligning the MTU of the virtual port with that of the network may be worthwhile to have the STT and original header in each and every packet on the wire. Regardless of whether that is a real wire, or a virtual one.

[Today's fun fact: One of the primary reasons the Mayflower pilgrims ended their voyage at Plymouth Rock was pretty much the same reason people today suspend their journeys: they ran out of beer. No need for a funny punch line on that one]

The post Stateless Transport Tunneling (STT) meets the Network appeared first on Plexxi.

Read the original blog entry...

More Stories By Marten Terpstra

Marten Terpstra is a Product Management Director at Plexxi Inc. Marten has extensive knowledge of the architecture, design, deployment and management of enterprise and carrier networks.

@CloudExpo Stories
SYS-CON Events announced today that Datanami has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Datanami is a communication channel dedicated to providing insight, analysis and up-to-the-minute information about emerging trends and solutions in Big Data. The publication sheds light on all cutting-edge technologies including networking, storage and applications, and the...
SYS-CON Events announced today that EnterpriseTech has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. EnterpriseTech is a professional resource for news and intelligence covering the migration of high-end technologies into the enterprise and business-IT industry, with a special focus on high-tech solutions in new product development, workload management, increased effi...
You know you need the cloud, but you’re hesitant to simply dump everything at Amazon since you know that not all workloads are suitable for cloud. You know that you want the kind of ease of use and scalability that you get with public cloud, but your applications are architected in a way that makes the public cloud a non-starter. You’re looking at private cloud solutions based on hyperconverged infrastructure, but you’re concerned with the limits inherent in those technologies.
For organizations that have amassed large sums of software complexity, taking a microservices approach is the first step toward DevOps and continuous improvement / development. Integrating system-level analysis with microservices makes it easier to change and add functionality to applications at any time without the increase of risk. Before you start big transformation projects or a cloud migration, make sure these changes won’t take down your entire organization.
Cloud promises the agility required by today’s digital businesses. As organizations adopt cloud based infrastructures and services, their IT resources become increasingly dynamic and hybrid in nature. Managing these require modern IT operations and tools. In his session at 20th Cloud Expo, Raj Sundaram, Senior Principal Product Manager at CA Technologies, will discuss how to modernize your IT operations in order to proactively manage your hybrid cloud and IT environments. He will be sharing bes...
A look across the tech landscape at the disruptive technologies that are increasing in prominence and speculate as to which will be most impactful for communications – namely, AI and Cloud Computing. In his session at 20th Cloud Expo, Curtis Peterson, VP of Operations at RingCentral, highlighted the current challenges of these transformative technologies and shared strategies for preparing your organization for these changes. This “view from the top” outlined the latest trends and developments i...
Automation is enabling enterprises to design, deploy, and manage more complex, hybrid cloud environments. Yet the people who manage these environments must be trained in and understanding these environments better than ever before. A new era of analytics and cognitive computing is adding intelligence, but also more complexity, to these cloud environments. How smart is your cloud? How smart should it be? In this power panel at 20th Cloud Expo, moderated by Conference Chair Roger Strukhoff, paneli...
Hardware virtualization and cloud computing allowed us to increase resource utilization and increase our flexibility to respond to business demand. Docker Containers are the next quantum leap - Are they?! Databases always represented an additional set of challenges unique to running workloads requiring a maximum of I/O, network, CPU resources combined with data locality.
SYS-CON Events announced today that MobiDev, a client-oriented software development company, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MobiDev is a software company that develops and delivers turn-key mobile apps, websites, web services, and complex software systems for startups and enterprises. Since 2009 it has grown from a small group of passionate engineers and business...
SYS-CON Events announced today that GrapeUp, the leading provider of rapid product development at the speed of business, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Grape Up is a software company, specialized in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market acr...
SYS-CON Events announced today that Ayehu will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on October 31 - November 2, 2017 at the Santa Clara Convention Center in Santa Clara California. Ayehu provides IT Process Automation & Orchestration solutions for IT and Security professionals to identify and resolve critical incidents and enable rapid containment, eradication, and recovery from cyber security breaches. Ayehu provides customers greater control over IT infras...
Artificial intelligence, machine learning, neural networks. We’re in the midst of a wave of excitement around AI such as hasn’t been seen for a few decades. But those previous periods of inflated expectations led to troughs of disappointment. Will this time be different? Most likely. Applications of AI such as predictive analytics are already decreasing costs and improving reliability of industrial machinery. Furthermore, the funding and research going into AI now comes from a wide range of com...
SYS-CON Events announced today that SourceForge has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. SourceForge is the largest, most trusted destination for Open Source Software development, collaboration, discovery and download on the web serving over 32 million viewers, 150 million downloads and over 460,000 active development projects each and every month.
In this presentation, Striim CTO and founder Steve Wilkes will discuss practical strategies for counteracting fraud and cyberattacks by leveraging real-time streaming analytics. In his session at @ThingsExpo, Steve Wilkes, Founder and Chief Technology Officer at Striim, will provide a detailed look into leveraging streaming data management to correlate events in real time, and identify potential breaches across IoT and non-IoT systems throughout the enterprise. Strategies for processing massive ...
"Our strategy is to focus on the hyperscale providers - AWS, Azure, and Google. Over the last year we saw that a lot of developers need to learn how to do their job in the cloud and we see this DevOps movement that we are catering to with our content," stated Alessandro Fasan, Head of Global Sales at Cloud Academy, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"We focus on composable infrastructure. Composable infrastructure has been named by companies like Gartner as the evolution of the IT infrastructure where everything is now driven by software," explained Bruno Andrade, CEO and Founder of HTBase, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
SYS-CON Events announced today that Conference Guru has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. A valuable conference experience generates new contacts, sales leads, potential strategic partners and potential investors; helps gather competitive intelligence and even provides inspiration for new products and services. Conference Guru works with conference organi...
SYS-CON Events announced today that Cloud Academy named "Bronze Sponsor" of 21st International Cloud Expo which will take place October 31 - November 2, 2017 at the Santa Clara Convention Center in Santa Clara, CA. Cloud Academy is the industry’s most innovative, vendor-neutral cloud technology training platform. Cloud Academy provides continuous learning solutions for individuals and enterprise teams for Amazon Web Services, Microsoft Azure, Google Cloud Platform, and the most popular cloud com...
What's the role of an IT self-service portal when you get to continuous delivery and Infrastructure as Code? This general session showed how to create the continuous delivery culture and eight accelerators for leading the change. Don Demcsak is a DevOps and Cloud Native Modernization Principal for Dell EMC based out of New Jersey. He is a former, long time, Microsoft Most Valuable Professional, specializing in building and architecting Application Delivery Pipelines for hybrid legacy, and cloud ...
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend 21st Cloud Expo October 31 - November 2, 2017, at the Santa Clara Convention Center, CA, and June 12-14, 2018, at the Javits Center in New York City, NY, and learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.