Cloud Hardware and New Memory Controller Designs

An exclusive interview with Barbara P. Aichinger, co-founder of FuturePlus Systems and VP of New Business Development

"The Data Center operators do understand that quality does matter," noted Barbara P. Aichinger, co-founder of FuturePlus Systems and VP of New Business Development, in this exclusive interview with Cloud Expo Conference Chair Jeremy Geelan. "When they experience failures, they call the supplier, and the Tier 2 and 3 vendors just blame somebody else, like the DIMM vendor or the software."

Cloud Computing Journal: You seem to have some concerns about the actual cloud hardware. Can you explain?

Barbara P. Aichinger: Sure. My company, FuturePlus Systems, makes memory design validation equipment used by the engineers who design cloud hardware. This server and network equipment is governed by technology standards. The advantage of using standards is that you can buy one part from vendor A and another from vendor B and, because they are all designed to the same standard, they work together. The standards organizations that write these standards are international in nature and in most cases have a compliance standard associated with the technology standard. Vendors not only have to obey the standard itself but also pass a test, specified by the compliance portion of the standard, that proves their design meets the specification. This is a stamp of quality and interoperability. The problem we have today with cloud hardware is that at the very heart of all of this hardware is the JEDEC DDR memory standard, but this standard has no compliance specification per se. Thus there is no third party checking this very critical portion of the design for quality and compliance.

Cloud Computing Journal: Why is it that there is no compliance standard for DDR Memory?

Aichinger: Good question. Last May (2013), at a JEDEC conference (JEDEC is the international standards organization that governs the DDR memory specification), I asked that very question. The answer was a shrug of the shoulders and a response of 'well, we all work so closely together so we did not need one.' This was probably OK six or seven years ago, when the server market was dominated by a few large vendors and the memory controllers themselves came from only two major silicon vendors. In addition, proving compliance was very difficult, and only the large players could afford the equipment to perform such an analysis. Now, however, there are lots of vendors supplying cloud hardware, and new memory controller designs from smaller vendors are starting to proliferate in the market. As a result, we see memory error rates in the data center accelerating.

Cloud Computing Journal: How big is the problem?

Aichinger: Google, having one of the largest data centers in the world, has definitely noticed the problem. They have worked with several academic researchers to study it. Two main works have resulted: "DRAM Errors in the Wild: A Large-Scale Field Study" and "Cosmic Rays Don't Strike Twice: Understanding the Nature of DRAM Errors and the Implications for System Design."

At the Open Compute Project conference in January 2013, Facebook said that DDR memory failures were the #2 failure in the data center. The data rates are not trivial. Given the growth that we see in data centers, we are seeing memory failures bring down servers hourly. This is a cost not only in downtime but also in the labor to replace the system or the failing DIMM.
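
As a rough back-of-the-envelope illustration of how failures can become an hourly event at data-center scale, the sketch below spreads an assumed annual failure rate evenly over a fleet. The fleet sizes and the 2% annual memory-failure rate are hypothetical values chosen only for illustration; they are not figures from the interview, from Facebook, or from Google.

```python
# Back-of-the-envelope sketch. All numbers are hypothetical assumptions,
# not data from the interview or from any vendor.
HOURS_PER_YEAR = 24 * 365


def expected_failures_per_hour(servers: int, annual_failure_rate: float) -> float:
    """Expected memory-related server failures per hour across a fleet.

    annual_failure_rate is the assumed fraction of servers that suffer a
    memory failure in a given year (for example, 0.02 for 2%).
    Failures are assumed to be spread evenly over the year.
    """
    return servers * annual_failure_rate / HOURS_PER_YEAR


def hours_between_failures(servers: int, annual_failure_rate: float) -> float:
    """Average number of hours between failures somewhere in the fleet."""
    per_hour = expected_failures_per_hour(servers, annual_failure_rate)
    if per_hour <= 0:
        return float("inf")  # no failures expected
    return 1.0 / per_hour


if __name__ == "__main__":
    assumed_rate = 0.02  # hypothetical: 2% of servers per year
    for servers in (10_000, 100_000, 500_000):
        gap = hours_between_failures(servers, assumed_rate)
        print(f"{servers:>7} servers: one memory failure every {gap:.1f} hours on average")
```

Under these assumptions, the average gap between failures shrinks in direct proportion to fleet size, so a per-server rate that looks small can still mean roughly one failure per hour once a fleet reaches several hundred thousand machines.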

We have also heard the phrase 'ghost errors.' This is when a server goes down with a hard memory failure. The operators run all sorts of diagnostics and find no error; everything works fine. They boot back up, and the system continues to run for perhaps several weeks before it experiences another error. Because the failures seem to disappear and can never be found, the operators call them 'ghost errors.'

Cloud Computing Journal: How are Data Centers responding?

Aichinger: They are doing a lot of head scratching. They have cost pressures and quality concerns. From what we have been told, there is a push to commoditize the server market, that is, to have no distinction between the Tier 1 and the lower Tier 2 or Tier 3 vendors. The Data Center operators do understand that quality does matter. When they experience failures, they call the supplier, and the Tier 2 and 3 vendors just blame somebody else, like the DIMM vendor or the software. We have seen all sorts of finger pointing. Even the DIMM connector vendors get blamed, although there is really no proof behind the claim. The Tier 1 vendors will often try to study the problem. They will bring the machine back to their facility and try to recreate the problem. One of our Tier 1 customers told us that they can recreate the failure only 30% of the time.

Cloud Computing Journal: What is the answer here? Can we have low cost and high quality?

Aichinger: I think we can. The first step would be for the customers to demand qualification of the memory subsystem. This is what we at FuturePlus are trying to do. We are trying to alert the end user to the problem. The suppliers of this hardware are more than likely going to take the easy way out and not validate their designs. Oftentimes you have system integrators who have no idea where the motherboard or the memory came from and can't even tell you what speed the memory is operating at. The companies that run these data centers are going to have to come up to speed on basic computer architecture so they don't get the wool pulled over their eyes when buying this hardware.

More Stories By Liz McMillan

News Desk compiles and publishes breaking news stories, press releases and latest news articles as they happen.
