

Containers Expo Blog: Article

Preserving Storage I/O for Critical Applications

Is hypervisor-side I/O throttling enough?

While server virtualization has brought unprecedented advantages, it has introduced challenges as well. Before server virtualization, each application ran on a dedicated server with dedicated storage connected through dedicated switch ports. This approach had obvious limitations (lack of flexibility, underutilization of hardware, and so on), but it provided each application with guaranteed hardware resources: CPU, network, memory, and storage bandwidth.

With server virtualization, the underlying hardware resources are shared. Hypervisors do a good job of providing guaranteed CPU power, network bandwidth, and memory to each application sharing those resources. When it comes to storage, however, things get complicated. Hypervisors have no visibility into the resources inside the storage array and can only control access to storage through I/O throttling. That is not enough to preserve storage I/O for critical applications, so critical applications stay sidelined and the old-school approach of dedicated hardware prevails.
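To see why throttling alone falls short, consider a minimal sketch of what hypervisor-side I/O throttling amounts to: a per-virtual-disk token bucket that caps the request rate. The class and its numbers are hypothetical, written for illustration only, not any hypervisor's actual implementation:

```python
import time

class IopsThrottle:
    """Token-bucket IOPS limiter, as a hypervisor might apply per
    virtual disk. A hypothetical sketch, not a real implementation."""

    def __init__(self, iops_limit):
        self.iops_limit = iops_limit
        self.tokens = float(iops_limit)
        self.last = time.monotonic()

    def try_submit(self):
        # Refill tokens in proportion to elapsed time, capped at the limit.
        now = time.monotonic()
        self.tokens = min(self.iops_limit,
                          self.tokens + (now - self.last) * self.iops_limit)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # I/O allowed through to the array
        return False      # I/O delayed: the rate cap is enforced

# Note what the throttle counts: requests, and nothing else. A 4 KB
# cached read and a 1 MB write that triggers deduplication inside the
# array both cost exactly one token.
```

The point of the sketch is the last comment: the hypervisor can cap how many I/Os a virtual machine issues, but it has no idea what each I/O costs the array.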

Many factors contribute to the complexity of storage, even though the general perception is that storage is just a piece of hardware. In reality, a complex piece of software runs inside the storage array, drawing on hardware components such as CPU, cache, disk, and network. Incoming I/Os put varying amounts of pressure on these resources depending on the traffic pattern. For example, 100 IOPS from Oracle is quite different from 100 IOPS from Microsoft Exchange as far as the storage array is concerned: they differ in read/write ratio, block size, random vs. sequential access, disk placement, deduplication, compression, and so on.
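A back-of-the-envelope model makes the point. Every number and parameter here is an illustrative assumption, not a real array cost model; it only shows that "100 IOPS" is not one number from the array's point of view:

```python
def array_side_load(iops, block_kb, write_ratio, dedup_factor=1.0):
    """Rough, illustrative cost model: KB/s the array must move for a
    given IOPS rate, block size, and read/write mix, with writes reduced
    by an assumed deduplication factor. All parameters are hypothetical."""
    bandwidth_kb_s = iops * block_kb
    write_kb_s = bandwidth_kb_s * write_ratio / dedup_factor
    read_kb_s = bandwidth_kb_s * (1 - write_ratio)
    return read_kb_s + write_kb_s

# Same 100 IOPS, very different pressure on the array
# (workload profiles are made-up examples):
oltp = array_side_load(iops=100, block_kb=8,  write_ratio=0.3)  # 800 KB/s
mail = array_side_load(iops=100, block_kb=32, write_ratio=0.6)  # 3200 KB/s
```

With these assumed profiles, the second workload moves four times the data of the first at the identical IOPS count the hypervisor sees, before disk placement, caching, or compression even enter the picture.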

Irrespective of I/O throttling at the hypervisor level, customers often complain: "My applications were running fine till yesterday; why this sluggish performance now?" Many factors can contribute to this situation. Let us examine some of the common ones:

  • A new application replaces an old one consuming storage from the same array. From the hypervisor's point of view, both applications send the same number of IOPS toward the array. But the new application accesses the same data again and again, keeping its own cache hot while wiping out the cached data of every other application sharing the array.
  • Some applications change their traffic patterns drastically. The classic example is an accounting application generating month-end or quarter-end reports, starving other applications of storage resources. The hypervisor doesn't differentiate between regular I/O and report-generation I/O.
  • An application creates and deletes large files. To the hypervisor that is just a few I/Os, but it strains the storage array's resources heavily.
  • One volume is configured for more frequent snapshot creation and deletion, activity that is completely invisible to the hypervisor.
  • An application that accesses historical data starts digging data out of passive tiers, transparently to the hypervisor, but adding more load on the storage array.
  • A filesystem in the storage array ages and becomes more fragmented, so pulling out the same amount of data costs more storage resources than before. The hypervisor has no clue about any of this.
  • The storage array's housekeeping activities, such as garbage collection, reduce the overall capability of the array. Simple hypervisor I/O throttling cannot guarantee critical applications their required IOPS here.
  • Component failures inside the storage array reduce its capabilities, and the hypervisor can't do much to maintain the I/O level for critical applications.
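The first scenario above, a new application wiping out everyone else's cache, can be demonstrated with a toy LRU cache standing in for the array's block cache. The workload names and sizes are made up for illustration:

```python
from collections import OrderedDict

class LruCache:
    """Minimal LRU read cache, a stand-in for an array's block cache."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def read(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)   # hit: keep the block hot
            return True
        self.blocks[block] = True            # miss: fetch and insert
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least-recently-used
        return False

cache = LruCache(capacity=100)
for b in range(100):                         # existing app warms the cache
    cache.read(("app_old", b))
hits_before = sum(cache.read(("app_old", b)) for b in range(100))  # 100 hits

for b in range(100):                         # new app streams in its own data
    cache.read(("app_new", b))
hits_after = sum(cache.read(("app_old", b)) for b in range(100))   # 0 hits
```

The old application's hit rate collapses from 100% to 0% while its own IOPS toward the array never changed, which is exactly why the hypervisor's request counters show nothing wrong.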

No, hypervisor-side I/O throttling alone is not enough. Storage needs to be intelligent enough to deliver differential, consistent, guaranteed IOPS to each application or virtual machine sharing the storage array. To achieve an end-to-end SLA, all three layers of the data center (hypervisor/server, network, and storage) must each satisfy its own SLA. One component cannot act on behalf of another.
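As a sketch of what such storage-side intelligence could look like: give each volume a guaranteed IOPS floor enforced inside the array, then share spare capacity proportionally among volumes that want more. This is an illustrative algorithm under stated assumptions, not any vendor's actual QoS implementation:

```python
def allocate_iops(capacity, demands, guarantees):
    """Hypothetical storage-side QoS: each volume first receives
    min(demand, guarantee), then leftover capacity is split among
    still-hungry volumes in proportion to their remaining demand.
    Assumes the sum of guarantees does not exceed capacity."""
    alloc = {v: min(demands[v], guarantees[v]) for v in demands}
    leftover = capacity - sum(alloc.values())
    wanting = {v: demands[v] - alloc[v]
               for v in demands if demands[v] > alloc[v]}
    total_want = sum(wanting.values())
    for v, want in wanting.items():
        alloc[v] += min(want, leftover * want / total_want)
    return alloc

# A critical volume keeps its floor even when a batch job demands more:
alloc = allocate_iops(capacity=1000,
                      demands={"crit": 600, "batch": 600},
                      guarantees={"crit": 500, "batch": 100})
```

Because the floor is enforced where the resources actually live, a month-end report or a cache-hungry newcomer can only compete for the spare capacity, never for the guaranteed portion.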

All my years of experience in networking and storage have taught me this: storage is the most complex part of delivering an SLA.

More Stories By Felix Xavier

Felix Xavier is Founder and CTO of CloudByte. He has more than 15 years of development and technology management experience. He has built many high-energy technology teams, re-architected products and developed features from scratch. Most recently, Felix helped NetApp gain leadership position in storage array-based data protection by driving innovations around its product suite. He has filed numerous patents with the US patent office around core storage technologies. Prior to this, Felix worked at Juniper, Novell and IBM, where he handled networking technologies, including LAN, WAN and security protocols and Intrusion Prevention Systems (IPS). Felix has master’s degrees in technology and business administration.

