Cloud Expo: Article

The Facts About Cloud High Availability and Disaster Recovery

Understanding the facts about HA and DR in the cloud can help you make informed decisions

Enterprises are moving more and more applications to the cloud. Gartner predicts that the bulk of new IT spending by 2016 will be for cloud computing platforms and applications and that nearly half of large enterprises will have cloud deployments by the end of 2017.[1]

The far-reaching impact of cloud computing is summarized in a recent McKinsey report on disruptive technologies: "Cloud technology has the potential to improve productivity across $3 trillion in global enterprise IT spending, as well as enabling the creation of new online products and services for billions of consumers and millions of businesses alike."[2]

For many organizations, moving applications that can tolerate brief periods of downtime to the cloud is a straightforward decision with clear benefits. However, concerns about how to provide high availability and disaster protection in the cloud may make this decision more difficult for business-critical applications such as SQL Server, SAP, and Exchange. Understanding the facts about HA and DR in the cloud can help you make informed decisions about moving applications to the cloud, while ensuring that the important business operations that depend on them are protected from downtime and data loss.

Fact #1: You need high availability protection in a cloud.
Do not assume that your cloud environment provides high availability protection unless you have specifically configured it for HA. According to a recent study: "The average unavailability of cloud services is 10 hours per year or more, while the average availability is estimated to be 99.9% far less than the expected availability of business critical applications."[3] That is the equivalent of more than a full working day of downtime every year. Moreover, in 2013, Microsoft Windows Azure, Google, and Amazon Web Services all had service interruptions or downtime ranging from 4 minutes to several hours.[4]
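To put availability percentages in perspective, the short sketch below converts an availability figure into expected hours of downtime per year. The function name and the 8,760-hour year are illustrative assumptions, not part of the cited study.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored in this sketch


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the expected hours of downtime per year for a given availability."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Compare common availability levels ("two nines" through "five nines").
for pct in (99.0, 99.9, 99.99, 99.999):
    hours = downtime_hours_per_year(pct)
    print(f"{pct}% available -> about {hours:.2f} hours of downtime per year")
```

By this arithmetic, 99.9% availability works out to roughly 8.76 hours of downtime per year, the same order of magnitude as the figure quoted above. Each additional "nine" cuts the expected downtime by a factor of ten.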

For business-critical applications, the redundancy that you can get with some cloud solutions, such as Windows Azure, is not enough. When you consider the cost of a minute of downtime for applications such as SQL Server, Oracle, and SAP, which may run many of your key business processes, it becomes clear that you need true high availability and disaster recovery protection. You need to ensure that end users have immediate access to data and applications in the event of a local failure, a regional disaster, or anything in between.

The traditional way of providing high availability protection is to build a cluster using two identical servers, a primary server and a standby server, with shared (typically SAN) storage. If the primary server fails, application operation is moved to the standby server, which has immediate access to the same storage. The problem is that SANs are not only expensive to buy, manage, and maintain; they are simply not an option in public cloud offerings. There are, however, high availability solutions that can be used in a cloud that do not require a SAN.

Fact #2: You can build a cluster in a cloud.
Even though you cannot have a SAN in a cloud, you can build a cluster for high availability protection. In a Windows cloud, you simply add SANLess cluster software to your Windows Server Failover Cluster (WSFC). The SANLess software uses real-time, block-level replication to keep local storage in two geographic regions of the cloud synchronized. If there is an outage, application operation is automatically moved to the remote instance, which has immediate access to current data. The synchronized storage looks to the WSFC like traditional shared storage, so no added complexity or specialized skills are needed to build or manage a SANLess cluster. In fact, a SANLess cluster is easy to manage and has the added benefit of eliminating the single-point-of-failure risk of a SAN. SANLess clusters also provide complete configuration flexibility, allowing you to replicate between physical, virtual, cloud, and hybrid cloud environments as well as between SAN and SANLess clusters.
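The idea of replacing shared storage with mirrored local storage can be illustrated with a toy model. The sketch below is not how any particular SANLess product is implemented; the class names, the in-memory "blocks," and the region names are all hypothetical. It only shows why a standby that receives every write can take over with current data.

```python
class Node:
    """A cluster node with its own local storage (block number -> bytes)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks = {}
        self.healthy = True


class MirroredCluster:
    """Two nodes whose local storage is kept in sync by mirroring every write."""

    def __init__(self, primary: Node, standby: Node) -> None:
        self.active = primary
        self.passive = standby

    def write(self, block_no: int, data: bytes) -> None:
        # Synchronous mirroring: apply the write to both copies before returning.
        self.active.blocks[block_no] = data
        if self.passive.healthy:
            self.passive.blocks[block_no] = data

    def read(self, block_no: int) -> bytes:
        return self.active.blocks[block_no]

    def fail_over(self) -> None:
        # Mark the old active node as failed and promote the standby,
        # which already holds every write made so far.
        self.active.healthy = False
        self.active, self.passive = self.passive, self.active


cluster = MirroredCluster(Node("region-a"), Node("region-b"))
cluster.write(0, b"orders table page")
cluster.fail_over()
print(cluster.active.name, cluster.read(0))
```

After the failover, reads are served from the node in the second region, and the data written before the outage is still there. A real product must also handle network latency, write ordering, and resynchronization when the failed node returns, none of which this sketch attempts.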

Fact #3: You can have geographically separated nodes for DR in a cloud.
While providing high availability within the cloud will protect you from normal hardware failures and other unexpected outages within an availability zone (Amazon) or fault domain (Azure), you still need to protect against regional disasters. The easiest solution is to configure a multisite (geographically separated) cluster.

One effective method is to build a SANLess cluster within a cloud and extend it for disaster recovery by adding one or more nodes in an alternate data center or a different geographic region within the cloud. Unlike traditional clusters that require you to have identical hardware and software in every node, a SANLess cluster allows you to mix physical, cloud, and hybrid cloud configurations. The benefits of a DR configuration are clear. For example, simply adding a third, geographically separated node to your SANLess cluster in a Windows Azure cloud can give you a recovery point objective (RPO) of near-zero data loss and a recovery time objective (RTO) of just about one minute.
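RPO and RTO are easier to reason about with concrete timestamps. In the sketch below, the recovery point is the gap between the last replicated write and the failure, and the recovery time is the gap between the failure and the moment service is restored. All timestamps and the target values are hypothetical numbers chosen for illustration, not measurements.

```python
from datetime import datetime, timedelta


def recovery_point(last_replicated_write: datetime, failure: datetime) -> timedelta:
    """Data-loss window: writes made after the last replicated write are lost."""
    return failure - last_replicated_write


def recovery_time(failure: datetime, service_restored: datetime) -> timedelta:
    """Outage window: how long users were without the application."""
    return service_restored - failure


def meets_objectives(actual_rp: timedelta, actual_rt: timedelta,
                     rpo: timedelta, rto: timedelta) -> bool:
    """True if the measured recovery point and time are within the objectives."""
    return actual_rp <= rpo and actual_rt <= rto


# Hypothetical incident: replication lagged 50 ms, failover took 60 seconds.
failure = datetime(2014, 1, 1, 12, 0, 0)
rp = recovery_point(failure - timedelta(milliseconds=50), failure)
rt = recovery_time(failure, failure + timedelta(seconds=60))

# Hypothetical objectives: lose at most 1 second of data, recover within 2 minutes.
ok = meets_objectives(rp, rt, rpo=timedelta(seconds=1), rto=timedelta(minutes=2))
print(rp.total_seconds(), rt.total_seconds(), ok)
```

The objectives are targets you set in advance. The measured values from a failover test, or from a real incident, tell you whether a given configuration actually meets them.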

Fact #4: You can create a cluster that mixes cloud and on-premises nodes.
You can use your on-premises data center as your primary location with a failover cluster to provide high availability protection and use the cloud as your hot standby DR site. This is a very cost-effective alternative to building out your own DR site, or renting rack space in a business continuity facility. In this case, the on-premises servers can be your choice of traditional SAN-based clusters, SANLess clusters, or even single servers not currently participating in a cluster.

The objective of having a "hot" standby DR site is to have standby servers up and running as quickly as possible in the DR site with access to a copy of the most recent application data, so that recovery after a disaster can begin immediately. A multisite cluster is an effective way to implement a hot standby DR site. In this case, the SANLess software replicates data to the cloud node so that its copy stays up to date. In the event of a forecasted disaster, such as a storm or a flood, applications can be moved to the cloud before potential disaster strikes. In the event of an unexpected disaster, applications can be recovered manually or in some cases automatically, depending upon the quorum configuration. This mix of cloud and on-premises nodes gives you an excellent RTO and RPO with minimal investment in infrastructure.
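Why recovery is automatic in some quorum configurations and manual in others comes down to vote counting. The sketch below uses a simplified majority rule: the surviving part of the cluster keeps running only if it holds more than half of all votes. The node layouts are hypothetical, and real WSFC quorum behavior has additional modes and refinements that this model ignores.

```python
def has_quorum(total_votes: int, votes_online: int) -> bool:
    """Majority rule: strictly more than half of all votes must be reachable."""
    return votes_online > total_votes // 2


# Layout A (hypothetical): two on-premises nodes plus one cloud node, 3 votes.
# If the whole on-premises site is lost, only the cloud node's vote remains.
layout_a_survives = has_quorum(total_votes=3, votes_online=1)

# Layout B (hypothetical): one on-premises node, one cloud node, and a
# witness vote hosted in a third location, 3 votes.
# If the on-premises site is lost, the cloud node and the witness remain.
layout_b_survives = has_quorum(total_votes=3, votes_online=2)

print("Layout A fails over automatically:", layout_a_survives)
print("Layout B fails over automatically:", layout_b_survives)
```

In layout A, the surviving cloud node holds only one of three votes, so an administrator would have to bring the application online manually. In layout B, the survivors hold a majority, so failover can proceed without intervention.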

Fact #5: HA and DR in a cloud can be easy and highly cost-effective.
If you choose SANLess software that provides an intuitive configuration interface, you can create a standard WSFC in a cloud in minutes without specialized skills. A SANLess cluster can help you realize significant cost savings in several ways. First, in a Microsoft SQL Server environment, a SANLess cluster can give you high availability with SQL Server Standard Edition software licenses without requiring you to upgrade to costly SQL Server Enterprise Edition.

Second, you can realize hundreds of thousands of dollars in savings with a SANLess cluster by eliminating the total cost of ownership (TCO) associated with a SAN. The savings in TCO include the SAN hardware acquisition costs; the power, cooling, and data center floor space costs; and the ongoing labor cost of specialized SAN administration.

If you are thinking about moving your important applications to the cloud, you need to consider how you will protect those applications from downtime and data loss. While traditional SAN-based clusters are not possible in these environments, SANLess clusters can provide an easy, cost-efficient alternative. These clusters not only provide high availability protection, but also enable significantly greater configuration flexibility and potentially dramatic savings in both licensing costs and SAN TCO.

Notes

[1] "Gartner Says Cloud Computing Will Become the Bulk of New IT Spend by 2016."

[2] Manyika, James, and Michael Chui, et al., "Disruptive technologies: Advances that will transform life, business, and the global economy," McKinsey Global Institute (May 2013).

[3] Whittaker, Josh, "Amazon Web Services Suffers Outage, Takes Out Vine, Instagram, Others with it," ZDNet (August 26, 2013).

[4] Mackay, Martin, "Downtime Report: Top Ten Outages in 2013," Business2Community.com (December 2013).

More Stories By Jerry Melnick

Jerry Melnick is responsible for defining corporate strategy and operations at SIOS Technology Corp. (www.us.sios.com), maker of SIOS SAN and #SANLess cluster software (www.clustersyourway.com). He has more than 25 years of experience in the enterprise and high availability software industries. He holds a Bachelor of Science degree from Beloit College with graduate work in Computer Engineering and Computer Science at Boston University.
