Preserving Storage I/O for Critical Applications

Is hypervisor side I/O throttling enough?

While server virtualization brings unprecedented advantages, it has also introduced some challenges. Prior to server virtualization, each application ran on a dedicated server with dedicated storage connected through dedicated switch ports. This approach clearly had limitations, such as inflexibility and underutilization, but it did provide guaranteed hardware resources, such as CPU, network, memory, and storage bandwidth, for each application.

With server virtualization, the underlying hardware resources are shared. Hypervisors do a good job of providing guaranteed CPU power, network bandwidth, and memory to each application that shares these resources. However, when it comes to storage, things get complicated. Hypervisors have no visibility into the resources inside the storage array and can only control access to storage through I/O throttling. That is not enough to preserve storage I/O for critical applications. As a result, critical applications stay sidelined and the old-school approach of dedicated hardware prevails.
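To make the limitation concrete, here is a minimal sketch of the kind of per-VM token-bucket throttle a hypervisor applies; it is purely illustrative and not any particular hypervisor's implementation. Notice that the throttle only counts requests; it has no idea what each request costs inside the array.

import time

class IopsThrottle:
    """Token-bucket cap on how many I/Os a VM may issue per second."""

    def __init__(self, iops_limit):
        self.iops_limit = iops_limit          # tokens refilled per second
        self.tokens = float(iops_limit)
        self.last = time.monotonic()

    def allow_io(self):
        now = time.monotonic()
        # Refill tokens for the time elapsed since the last request.
        self.tokens = min(self.iops_limit,
                          self.tokens + (now - self.last) * self.iops_limit)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True    # forward the I/O to the array
        return False       # over the cap: queue or delay the I/O

# A VM capped at 500 IOPS still issues "500 IOPS of something"; the
# throttle cannot tell a cheap cached read from a write that triggers
# garbage collection or parity updates deep inside the array.
throttle = IopsThrottle(iops_limit=500)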

Many factors contribute to the complexity of storage, even though the general perception is that storage is just a piece of hardware. In reality, a complex piece of software runs inside the storage array, utilizing hardware components such as CPU, cache, disk, and network. Incoming I/Os put varying amounts of pressure on these resources depending on the traffic pattern. For example, 100 IOPS from Oracle is quite different from 100 IOPS from MS Exchange as far as the storage array is concerned. They differ in read/write ratio, block size, random versus sequential access, disk placement, deduplication, compression, and so on.
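A quick back-of-the-envelope sketch makes the point; the workload profiles below and the RAID-5-style write penalty of 4 are illustrative assumptions, not measurements.

def array_side_load(iops, block_kb, read_ratio, raid_write_penalty=4):
    """Convert front-end IOPS into rough array-side cost: throughput
    plus back-end disk I/Os after a RAID-5-style write penalty."""
    throughput_mbs = iops * block_kb / 1024.0
    reads = iops * read_ratio
    writes = iops * (1 - read_ratio)
    backend_iops = reads + writes * raid_write_penalty
    return throughput_mbs, backend_iops

# Both workloads present 100 IOPS to the hypervisor:
print(array_side_load(100, block_kb=8,  read_ratio=0.7))   # OLTP-like: (0.78 MB/s, 190 back-end IOPS)
print(array_side_load(100, block_kb=32, read_ratio=0.5))   # mail-like: (3.13 MB/s, 250 back-end IOPS)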

Irrespective of I/O throttling at the hypervisor level, customers often complain: "My applications were running fine until yesterday; why is performance so sluggish now?" Many factors can contribute to this situation. Let us examine some of the common ones:

  • A new application replaces an old one on the same array. From the hypervisor's point of view, both the new and the old application send the same number of IOPS towards the storage array. But the new application accesses the same data again and again, keeping its own blocks hot in the array's cache while wiping out the cached data of every other application (see the sketch after this list).
  • Some applications change their traffic patterns drastically. The classic example is an accounting application that generates month-end or quarter-end reports, depleting storage resources for everyone else. The hypervisor doesn't differentiate between regular I/O and report-generation I/O.
  • An application creates and deletes large files. As far as the hypervisor is concerned, that's just a few I/Os, but that strains the storage array resources heavily.
  • One volume is configured for more frequent snapshot creation and deletion, activity that is completely invisible to the hypervisor.
  • An application that accesses historical data starts pulling data from passive tiers; this is transparent to the hypervisor but adds more load on the storage array.
  • A filesystem in the storage array ages and becomes more fragmented, so it costs more storage resources to retrieve the same amount of data as before. The hypervisor has no clue about any of this.
  • The array's own housekeeping activities, such as garbage collection, reduce its overall capability. Here, simple hypervisor I/O throttling doesn't guarantee critical applications the IOPS they need.
  • Component failures within the storage array reduce its capabilities, and the hypervisor can't do much to maintain the I/O level for critical applications.
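
To illustrate the cache effect from the first bullet, here is a toy LRU model of the array's shared read cache; the cache size, working-set sizes, and application names are all hypothetical, and the model only demonstrates the eviction behavior, not a real array's caching logic.

from collections import Counter, OrderedDict

class SharedArrayCache:
    """Toy LRU model of a storage array's shared read cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()            # (app, block) -> True

    def access(self, app, block):
        key = (app, block)
        if key in self.entries:
            self.entries.move_to_end(key)       # hit: keep the block hot
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)    # miss: evict the coldest block
        self.entries[key] = True
        return False

cache = SharedArrayCache(capacity=1000)

# Existing applications warm the cache with their own working sets...
for app in ("oracle", "exchange", "fileshare"):
    for block in range(300):
        cache.access(app, block)

# ...then a new application re-reads a small working set over and over.
for _ in range(50):
    for block in range(900):
        cache.access("new_app", block)

print(Counter(app for app, _ in cache.entries))
# Counter({'new_app': 900, 'fileshare': 100}): the newcomer has pushed
# almost everyone else's data out of the shared cache, even though its
# IOPS rate looks unremarkable to the hypervisor.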

No, hypervisor-side I/O throttling is simply not enough. The storage itself needs to be intelligent enough to deliver differentiated, consistent, guaranteed IOPS to each application or virtual machine sharing the array. To get an end-to-end SLA, all three layers of the data center (hypervisor/server, network, and storage) need to satisfy their individual SLAs; one component cannot act on behalf of another.
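Conceptually, storage-side QoS is a per-volume contract of guaranteed (minimum) and ceiling (maximum) IOPS that the array itself enforces. The sketch below shows only the shape of that contract; the volume names, numbers, and two-pass allocation are assumptions, and a real array's scheduler is far more involved.

# Hypothetical per-volume QoS policy enforced by the array itself.
qos_policies = {
    "erp_db":   {"min_iops": 5000, "max_iops": 8000},   # critical
    "exchange": {"min_iops": 2000, "max_iops": 4000},
    "dev_test": {"min_iops":    0, "max_iops": 1000},   # best effort
}

def grant_iops(policies, array_capacity_iops):
    """Honor every volume's guaranteed floor first, then share the
    remaining headroom up to each volume's ceiling."""
    grants = {}
    remaining = array_capacity_iops
    for vol, p in policies.items():                      # pass 1: minimums
        grants[vol] = min(p["min_iops"], remaining)
        remaining -= grants[vol]
    for vol, p in policies.items():                      # pass 2: leftover
        extra = min(p["max_iops"] - grants[vol], remaining)
        grants[vol] += extra
        remaining -= extra
    return grants

# When garbage collection or a failed component shrinks the array's
# capability, critical volumes still receive their guaranteed floor first.
print(grant_iops(qos_policies, array_capacity_iops=9000))
# {'erp_db': 7000, 'exchange': 2000, 'dev_test': 0}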

All my years of experience in networking and storage have taught me one thing: storage is the most complex part when it comes to delivering an SLA.

More Stories By Felix Xavier

Felix Xavier is Founder and CTO of CloudByte. He has more than 15 years of development and technology management experience. He has built many high-energy technology teams, re-architected products and developed features from scratch. Most recently, Felix helped NetApp gain a leadership position in storage array-based data protection by driving innovations around its product suite. He has filed numerous patents with the US Patent Office around core storage technologies. Prior to this, Felix worked at Juniper, Novell and IBM, where he handled networking technologies, including LAN, WAN and security protocols and Intrusion Prevention Systems (IPS). Felix has master’s degrees in technology and business administration.
