|By Daniel Thompson||
|March 14, 2013 12:00 PM EDT||
I came across an article discussing NoSQL and partition tolerance.
The NoSQL Partition Tolerance Myth (link)
I may not entirely agree with the author.
But what most NoSQL systems offer is a peculiar behavior that is not partition tolerant, but partition oblivious instead.
No argument here. While NoSQL implementations are aware that nodes have left, they are not aware that said nodes have formed a separate partition.
In this case, we would want failure detection and carry out those transfers where the accounts are both on the same side of the partition, while denying or deferring transfers that cross the chasm.
The author is assuming that the account is only on one side of the partition. If that is that is case, is doesn’t matter whether the NoSQL implementation is eventually consistent or not. If the account is on both sides of the partition, the solution the author provides still results in an inconsistent state.
In such cases, it is almost always better to build services that degrade gracefully under partitions.
Bingo! The author implies that all NoSQL implementations sacrifice consistency in the event of a partition. That is not true. There are AP implementations (available & partition tolerant), and there are CP implementations (consistent & partition tolerant). However, an AP implementation can function as a CP implementation depending on the configuration and the application. For example, if accounts do not have multiple owners and the application can not withdraw funds from or deposit funds to an account if it can not access the account.
This post is not perfect and it is a bit outdated, but I think its helpful nonetheless.
Visual Guide to NoSQL Systems (link)
If I were to build a bank based on Dynamo, the granddaddy of all first-generation NoSQL data stores, it would silently split into two halves, like a lobotomized patient.
I would not say that Amazon Dynamo is the grandparent of all NoSQL implementations. I would say that there are two parents: Amazon Dynamo and Google BigTable. Then there are the grandparents…
A Brief History of NoSQL (link)
In this scenario, the hypothetical backend for Banko Dynamo would not only not provide any indication of failure, but allow a customer to create as many new accounts as there are partitions, one in each.
Why is the author now using account creation instead of withdrawals and deposits, and what is the relevance of creating multiple accounts? If my debit card does not work, I do not create a new account. That, and I maintain two checking accounts and one savings account with the same bank.
Let’s go back to withdrawals and deposits. If the accounts do not have multiple owners, it does not matter whether the NoSQL implementation is eventually consistent or not. If the accounts do have multiple owners, it depends on the NoSQL implementation. If it is inspired by Google BigTable (e.g. Apache HBase) or both Google BigTable and Amazon Dynamo (e.g. Apache Cassandra), it does not matter. These NoSQL implementations are CP, or can be configured to be CP. If it inspired only by Amazon Dynamo and it is eventually consistent, it may or may not matter…
Let’s assume that account withdrawals / deposits are separate from the accounts themselves and that the account is both consistent and available during a partition. The account has multiple owners but it is more or less read only.
My account has a balance of $100 (calculated from the withdrawals and deposits). Now, there are two partitions: A and B. I purchase $50 of St. Bernardus Abt 12 at Binny’s via partition A. Partition A now has withdrawal #1. I have dinner at Baume & Brix for $75 via partition B. Partition B now has withdrawal #2. My account has a balance of $50 in partition A. It has a balance of $25 in partition B. My account should have a balance of minus $25.
Does it matter? My account may not have a balance of minus $25, but it will. When the partition is repaired, the application will be able to access all of the withdrawals and deposits on my account. I may be charged an overlimit fee.
What if the NoSQL implementation sacrificed availability? My payment at Binny’s did not go through. That’s not a problem. No St. Bernardus Abt 12 for me. My payment at Baume & Brix did not go through. That’s a problem. I can’t pay for dinner. Baume & Brix can’t accept my payment nor that of any other customer paying with a debit card from the same bank as me via partition B.
What if I made a deposit of $25 at an ATM via partition A? My account will have a balance of $0 after the partition is repaired. I will not be charged an overlimit fee.
There are other scenarios. Perhaps I’m charged an insufficient funds fee and Baume & Brix does not receive payment. Perhaps Baume & Brix later resubmits the payment and receives payment.
Do you really want to sell tickets from both halves of your system? By definition, there is no way you can guarantee uniqueness of those tickets. There will be customers holding identical tickets with identical seat numbers.
Maybe, maybe not. If there is only a single owner per ticket, then yes. However, there may be availability issues. For example, partition A has tickets 1-150 and partition B has tickets 151-200. If all the tickets in partition B have been purchased, visitors may be unable to purchase tickets despite the fact that there may be tickets available in partition A. If there are multiple owners per ticket, I would prefer a NoSQL implementation that is CP. In this case, I would prefer to sacrifice availability rather than consistency.
Here is a better example. What if I report my debit card stolen? Sacrificing availability is not appropriate. What if customer service is accessing my account via the partition with no availability? My debit card must be reported stolen or the thief can continue to make purchases with it. Sacrificing consistency is not appropriate. The thief can continue to make purchases with my debit card via the partition where my account has not been reported stolen. Perhaps account information should not be stored in a distributed system.
And if they did, the first-generation NoSQL stores usually take the ultimate punt by presenting all versions of the divergent objects to the application, and let the application resolve the mess.
No argument here.
But if your data is that soft and inconsequential, why not just use memcached? It’s wicked fast, far faster than Mongo.
Perhaps because MongoDB is a document store and as such provides features that are not provided by key / value stores.
A lot of NoSQL developers pretend that being partition oblivious is a difficult thing to implement. This is false. It’s easy to make a program oblivious to a particular event; namely, you write no code to handle that event.
No argument here.
The thing that greatly helps first generation NoSQL data stores, the thing that enables them to package partition obliviousness as if it were equivalent to partition tolerance, is that they provide a very weak service guarantee in the first place. These systems cannot guarantee that, on a good day, your GET will return the latest PUT.
Sure they can.
In fact, eventual consistency means that a GET can return any previous value, including the Does Not Exist response from the very initial state of the system.
No argument here. Of course, not all NoSQL implementations are eventually consistent.
With all this being said, a NoSQL implementation may or may not be appropriate. To be more specific, a NoSQL implementation that is eventually consistent and sacrifices consistency in the event of a partition may or may not be appropriate. The behaviour is determined by the NoSQL implementation, its configuration, and the application that reads and writes to it. Whether that behaviour is appropriate or not depends on the business requirements.
The revocation of Safe Harbor has radically affected data sovereignty strategy in the cloud. In his session at 17th Cloud Expo, Jeff Miller, Product Management at Cavirin Systems, discussed how to assess these changes across your own cloud strategy, and how you can mitigate risks previously covered under the agreement.
Dec. 2, 2015 12:00 AM EST Reads: 138
In his General Session at DevOps Summit, Asaf Yigal, Co-Founder & VP of Product at Logz.io, explored the value of Kibana 4 for log analysis and provided a hands-on tutorial on how to set up Kibana 4 and get the most out of Apache log files. He examined three use cases: IT operations, business intelligence, and security and compliance. Asaf Yigal is co-founder and VP of Product at log analytics software company Logz.io. In the past, he was co-founder of social-trading platform Currensee, which...
Dec. 1, 2015 11:00 PM EST Reads: 306
SYS-CON Events announced today that Catchpoint, a global leader in monitoring, and testing the performance of online applications, has been named "Silver Sponsor" of DevOps Summit New York, which will take place on June 7-9, 2016 at the Javits Center in New York City. Catchpoint radically transforms the way businesses manage, monitor, and test the performance of online applications. Truly understand and improve user experience with clear visibility into complex, distributed online systems.Founde...
Dec. 1, 2015 10:15 PM EST Reads: 138
Today air travel is a minefield of delays, hassles and customer disappointment. Airlines struggle to revitalize the experience. GE and M2Mi will demonstrate practical examples of how IoT solutions are helping airlines bring back personalization, reduce trip time and improve reliability. In their session at @ThingsExpo, Shyam Varan Nath, Principal Architect with GE, and Dr. Sarah Cooper, M2Mi’s VP Business Development and Engineering, explored the IoT cloud-based platform technologies driving t...
Dec. 1, 2015 10:00 PM EST Reads: 477
"eFolder does a lot of different things but we protect data and we are focused on protecting data no matter where it resides," explained Carlo Tapia, Product Marketing Manager at eFolder, in this SYS-CON.tv interview at Cloud Expo, held November 3-5, 2015, at the Santa Clara Convention Center in Santa Clara, CA.
Dec. 1, 2015 04:00 PM EST
The Internet of Things (IoT) is growing rapidly by extending current technologies, products and networks. By 2020, Cisco estimates there will be 50 billion connected devices. Gartner has forecast revenues of over $300 billion, just to IoT suppliers. Now is the time to figure out how you’ll make money – not just create innovative products. With hundreds of new products and companies jumping into the IoT fray every month, there’s no shortage of innovation. Despite this, McKinsey/VisionMobile data...
Dec. 1, 2015 04:00 PM EST Reads: 507
Cloud computing is unquestionably one of the driving forces of DevOps, as the automation of operations transforms enterprise software development. DevOps, however, is more than a technology trend, as it represents a move toward silo-busting, self-organizing horizontal teams that drive business velocity. At the same time, enterprise Digital Transformation represents an upheaval across the enterprise, as customer preferences and behavior drive enterprise technology decisions. This transformation ...
Dec. 1, 2015 03:45 PM EST
Most of the IoT Gateway scenarios involve collecting data from machines/processing and pushing data upstream to cloud for further analytics. The gateway hardware varies from Raspberry Pi to Industrial PCs. The document states the process of allowing deploying polyglot data pipelining software with the clear notion of supporting immutability. In his session at @ThingsExpo, Shashank Jain, a development architect for SAP Labs, discussed the objective, which is to automate the IoT deployment proces...
Dec. 1, 2015 03:00 PM EST Reads: 152
Just over a week ago I received a long and loud sustained applause for a presentation I delivered at this year’s Cloud Expo in Santa Clara. I was extremely pleased with the turnout and had some very good conversations with many of the attendees. Over the next few days I had many more meaningful conversations and was not only happy with the results but also learned a few new things. Here is everything I learned in those three days distilled into three short points.
Dec. 1, 2015 03:00 PM EST Reads: 389
In demand-intensive mobile and web applications, an emerging pattern is to host the Systems of Engagement in the cloud (for maximum responsiveness) but keep the Systems of Record with the other important business systems in the company datacenter, often on a tightly secured mainframe. But what about the space in between? In this IBM Redpaper publication, we show that the IBM Bluemix cloud platform offers technologies that make it easy for cloud-based SoEs to securely connect to on-premises IBM...
Dec. 1, 2015 02:45 PM EST
DevOps is about increasing efficiency, but nothing is more inefficient than building the same application twice. However, this is a routine occurrence with enterprise applications that need both a rich desktop web interface and strong mobile support. With recent technological advances from Isomorphic Software and others, rich desktop and tuned mobile experiences can now be created with a single codebase – without compromising functionality, performance or usability. In his session at DevOps Su...
Dec. 1, 2015 02:45 PM EST Reads: 450
As organizations realize the scope of the Internet of Things, gaining key insights from Big Data, through the use of advanced analytics, becomes crucial. However, IoT also creates the need for petabyte scale storage of data from millions of devices. A new type of Storage is required which seamlessly integrates robust data analytics with massive scale. These storage systems will act as “smart systems” provide in-place analytics that speed discovery and enable businesses to quickly derive meaningf...
Dec. 1, 2015 02:15 PM EST Reads: 455
In his keynote at @ThingsExpo, Chris Matthieu, Director of IoT Engineering at Citrix and co-founder and CTO of Octoblu, focused on building an IoT platform and company. He provided a behind-the-scenes look at Octoblu’s platform, business, and pivots along the way (including the Citrix acquisition of Octoblu).
Dec. 1, 2015 02:00 PM EST Reads: 553
In his General Session at 17th Cloud Expo, Bruce Swann, Senior Product Marketing Manager for Adobe Campaign, explored the key ingredients of cross-channel marketing in a digital world. Learn how the Adobe Marketing Cloud can help marketers embrace opportunities for personalized, relevant and real-time customer engagement across offline (direct mail, point of sale, call center) and digital (email, website, SMS, mobile apps, social networks, connected objects).
Dec. 1, 2015 01:45 PM EST Reads: 359
The buzz continues for cloud, data analytics and the Internet of Things (IoT) and their collective impact across all industries. But a new conversation is emerging - how do companies use industry disruption and technology enablers to lead in markets undergoing change, uncertainty and ambiguity? Organizations of all sizes need to evolve and transform, often under massive pressure, as industry lines blur and merge and traditional business models are assaulted and turned upside down. In this new da...
Dec. 1, 2015 01:15 PM EST Reads: 313
In recent years, at least 40% of companies using cloud applications have experienced data loss. One of the best prevention against cloud data loss is backing up your cloud data. In his General Session at 17th Cloud Expo, Sam McIntyre, Partner Enablement Specialist at eFolder, presented how organizations can use eFolder Cloudfinder to automate backups of cloud application data. He also demonstrated how easy it is to search and restore cloud application data using Cloudfinder.
Dec. 1, 2015 12:00 PM EST Reads: 235
With all the incredible momentum behind the Internet of Things (IoT) industry, it is easy to forget that not a single CEO wakes up and wonders if “my IoT is broken.” What they wonder is if they are making the right decisions to do all they can to increase revenue, decrease costs, and improve customer experience – effectively the same challenges they have always had in growing their business. The exciting thing about the IoT industry is now these decisions can be better, faster, and smarter. Now ...
Dec. 1, 2015 12:00 PM EST Reads: 314
The Internet of Everything is re-shaping technology trends–moving away from “request/response” architecture to an “always-on” Streaming Web where data is in constant motion and secure, reliable communication is an absolute necessity. As more and more THINGS go online, the challenges that developers will need to address will only increase exponentially. In his session at @ThingsExpo, Todd Greene, Founder & CEO of PubNub, exploreed the current state of IoT connectivity and review key trends and t...
Dec. 1, 2015 11:45 AM EST Reads: 481
Actifio is powering new application development and testing services from Net3 Technologies (N3T), a managed cloud services provider. N3T's new Symmetry DevOps™ service builds on its existing Palmetto Virtual Data Center (PvDC) Cloud services for data backup and disaster recovery (DR) based on the Actifio Copy Data Virtualization platform. Previously, N3T's data protection and DR services were challenged by overlapping and inefficient legacy hardware and software platforms from multiple vendo...
Dec. 1, 2015 11:30 AM EST
Countless business models have spawned from the IaaS industry – resell Web hosting, blogs, public cloud, and on and on. With the overwhelming amount of tools available to us, it's sometimes easy to overlook that many of them are just new skins of resources we've had for a long time. In his general session at 17th Cloud Expo, Harold Hannon, Sr. Software Architect at SoftLayer, an IBM Company, broke down what we have to work with, discussed the benefits and pitfalls and how we can best use them ...
Dec. 1, 2015 10:45 AM EST Reads: 142