The non-existence of ACID consistency - Part 2
In the previous post of this blog series, we discussed why strong consistency across multiple distributed data nodes is a bad idea. In this post, we will discuss why the “strong consistency is needed for business reasons” requirement is void.
ACID consistency does not exist outside IT
If you explain why strong consistency across system boundaries is a bad idea, you often encounter the “but it is needed for business reasons” argument: supposedly, there are functional requirements that make strong consistency across system boundaries necessary.
My response to that requirement is that such a requirement cannot exist because:
- In the world outside IT, strong (ACID-like) consistency does not exist.
- In the best case you find eventual consistency.
- Usually, you do not find any consistency guarantees at all. The world sometimes is accidentally consistent, but most of the time it is just inconsistent.
Hence, I say there cannot be a requirement for strong consistency due to “business reasons”.
In the “business world”, the world outside IT systems, strong consistency simply does not exist. All processes, activities and interactions are eventually consistent at best. A requirement for eventual consistency due to “business reasons” thus can make sense. A requirement for strong consistency does not.
After explaining this, you often see people thinking intensely. And almost inevitably someone comes up with the money transfer example, saying: “But for money transfer you need strong consistency. This is an example of functionality requiring strong consistency. I proved you wrong!”
In such a situation (which is basically every single time I have the consistency discussion), I reply:
“Okay, let us have a look at how money transfer works outside of IT. Let us assume I transfer 50 EUR to your account. I press the transfer button and the money vanishes from my account in a fraction of a second. Money gone.
On your account: Nothing! We wait for a while: Still nothing! We wait a bit longer: Still nothing! And so on.
Usually, after a while – often a day or two, sometimes even longer – the money eventually appears on your account.
I also had situations where the money did not eventually appear on the recipient’s account. Then I needed to call the bank. They said they will take care of it, and eventually – usually some days later – the money magically reappeared on my account. I do not know where the money was in the meantime. I never found out.
Thus, from a user perspective, money transfer is eventually consistent. Sometimes, it even is just inconsistent. No strong consistency to be found.”
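The transfer described above can be sketched as a toy model. This is only an illustration under stated assumptions, not how any real bank works: the `Bank` class, its `in_flight` queue and the `settle` step are hypothetical names made up for this sketch. It shows that the debit is visible immediately while the credit only shows up after a later settlement step, and that the money is "nowhere" in between.

```python
from dataclasses import dataclass, field


@dataclass
class Bank:
    """Toy model (hypothetical): debits are immediate, credits settle later."""
    balances: dict
    in_flight: list = field(default_factory=list)

    def transfer(self, sender, recipient, amount):
        # The sender's balance changes right away ...
        self.balances[sender] -= amount
        # ... but the credit is only queued, not applied yet.
        self.in_flight.append((recipient, amount))

    def settle(self):
        # Stands in for the later settlement: applies all queued credits.
        while self.in_flight:
            recipient, amount = self.in_flight.pop(0)
            self.balances[recipient] += amount

    def total(self):
        # Sum of all visible account balances (excludes in-flight money).
        return sum(self.balances.values())


bank = Bank({"me": 100, "you": 0})
bank.transfer("me", "you", 50)

# Money has left my account but has not arrived on yours.
balances_before_settlement = dict(bank.balances)
total_before_settlement = bank.total()

bank.settle()
balances_after_settlement = dict(bank.balances)
```

Between `transfer` and `settle`, the visible balances do not add up to the original total. Only after settlement is the overall state consistent again, which is what eventual consistency looks like from the outside.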
This little example illustrates why I say that strong consistency is a concept that solely exists inside IT systems. From the perspective of a user living outside IT systems, strong consistency does not exist. At best, eventual consistency exists. Often there are no consistency guarantees at all and you need to keep track yourself of whether something went wrong (and trigger corrective actions).
But even with the money transfer example debunked, the advocates of strong consistency usually are not content: “Well, maybe it is not strongly consistent from a user’s perspective but we most definitely need it in our IT systems. It is totally unacceptable if systems show a different state after data has been updated in any of them.”
The question remains whether this is actually a business need. Or is it just a demand for convenience because strong consistency simplifies reasoning about state a lot? We will discuss this aspect of strong consistency in detail in the next post.
Often people insist that it is a business need. A typical demand in such discussions is that users must always see the current state of any piece of data, no matter which system they use to access it. Thus, ACID transactions across system boundaries would be needed.
In those discussions I ask the people how they make sure the data the users see on their screens is always up to date. Let us imagine the following scenario:
- A clerk gets called by a customer. The customer needs some information.
- The clerk retrieves the data needed to answer the question and gets it displayed on his/her screen.
- After retrieving the data but before answering the customer’s question the data gets altered – in the same database by another user or process, i.e., we do not even talk about multiple systems involved, just about concurrent data access inside a single system.
- Question: Will the answer of the clerk be based on the current data, i.e., will the data be updated on the screen? Or will the answer be based on the old data that was valid until some seconds ago but is invalid now?
In 99+% of all cases the answer is “The answer will be based on the old data” – even if it usually takes a bit of discussion until the advocates of strong consistency admit it.
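The scenario above can be sketched in a few lines. This is a minimal illustration, assuming a plain dictionary as the "database" and a hypothetical `ClerkScreen` class that copies the record once when it is loaded, the way a typical request/response frontend does. All names and values are invented for the example.

```python
import copy

# A plain dict stands in for the single database of the scenario.
database = {"customer-42": {"address": "Old Street 1"}}


class ClerkScreen:
    """Hypothetical frontend: shows the record as it was when it was loaded."""

    def __init__(self, db, key):
        self.db = db
        self.key = key
        self.displayed = None

    def load(self):
        # The screen keeps its own copy; it is not linked to the database.
        self.displayed = copy.deepcopy(self.db[self.key])


screen = ClerkScreen(database, "customer-42")
screen.load()  # the clerk retrieves the data

# Another user or process changes the record in the same database.
database["customer-42"]["address"] = "New Road 7"

# The clerk answers based on what is on the screen.
answer_given_to_customer = screen.displayed["address"]
current_value = database["customer-42"]["address"]

# Only an explicit refresh brings the screen up to date.
screen.load()
answer_after_refresh = screen.displayed["address"]
```

The database itself is perfectly consistent at every point in time here. The stale answer arises purely because the screen holds a snapshot, which no transaction guarantee on the database side can prevent.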
An update on the screen would require a back-channel from the data source to the user frontend. For a web application this would require something like web sockets, signaling data updates to the web client. This means significant added effort which in most situations is not considered worth it. Therefore, most frontends do not update the data displayed without the user explicitly requesting it. 1
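To give an impression of what such a back-channel involves, here is a minimal in-process sketch. It is an assumption-laden simplification: plain callbacks stand in for the web socket connection, and `ObservableStore` with its `subscribe` method is a hypothetical class invented for this example. A real implementation would additionally need connection handling, reconnects and authorization.

```python
class ObservableStore:
    """Hypothetical data source that pushes changes to interested frontends."""

    def __init__(self):
        self._data = {}
        self._subscribers = {}

    def subscribe(self, key, callback):
        # Register a callback that is invoked on every change of `key`.
        self._subscribers.setdefault(key, []).append(callback)

    def get(self, key):
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        # The back-channel: notify every frontend that displays this key.
        for callback in self._subscribers.get(key, []):
            callback(value)


store = ObservableStore()
store.put("customer-42", "Old Street 1")

# The screen loads the value once and then subscribes to further changes.
screen = {"address": store.get("customer-42")}
store.subscribe("customer-42", lambda value: screen.update(address=value))

# A concurrent update is now pushed to the screen without a manual refresh.
store.put("customer-42", "New Road 7")
```

Even in this tiny form, the data source has to track who displays what. That bookkeeping is the added effort most applications decide against.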
This means, from a user experience perspective your strong consistency demands even inside the boundaries of a single system melt down to some kind of eventual consistency at best. A user can still see outdated data.
Bottom line: Strong consistency does not exist outside IT systems. Strong consistency inside IT systems usually also melts down to eventual consistency from a user experience perspective – even with just a single system involved. Hence, requirements of the type “the user needs it” (a.k.a. “business reasons”) do not make sense. Insisting on such requirements across system boundaries only makes the systems more fragile and error-prone.
Misunderstanding eventual consistency
From my observations, a big driver of people insisting on strong consistency across system boundaries is that they have not understood what eventual consistency means. If you talk with people, they often think that eventual consistency means update delays of hours or longer between systems by default – or that updates are only potentially propagated. 2
This is a big misunderstanding. First of all, eventual consistency means guaranteed consistency. In practice it also means basically instantaneous updates in 99.9+% of all cases. From the outside, in most situations you cannot distinguish eventual from strong consistency.
Eventual consistency just means that it can happen that updates take longer and then you may see different states in different parts of your application landscape. But most of the time, it feels exactly like strong consistency. 3
The biggest difference is that going for eventual consistency across multiple systems means you do not run into the availability trap because the “all or nothing” (atomicity) guarantee does not exist. 4
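The behavior described in the last paragraphs can be sketched with two toy systems. This is a simplified illustration under several assumptions: a single writer (so no conflicting updates need to be reconciled), in-order delivery, and a `reachable` flag standing in for a network partition. The `Primary` and `Replica` classes are hypothetical names for this sketch.

```python
from collections import deque


class Replica:
    """Second system that receives updates asynchronously."""

    def __init__(self):
        self.data = {}
        self.reachable = True  # False simulates a network partition


class Primary:
    """System that accepts writes and propagates them when possible."""

    def __init__(self, replica):
        self.data = {}
        self.replica = replica
        self.pending = deque()

    def write(self, key, value):
        # The local write always succeeds: there is no all-or-nothing
        # guarantee spanning both systems, so no availability trap.
        self.data[key] = value
        self.pending.append((key, value))
        self.propagate()

    def propagate(self):
        # Deliver queued updates in order while the replica is reachable.
        while self.pending and self.replica.reachable:
            key, value = self.pending.popleft()
            self.replica.data[key] = value


replica = Replica()
primary = Primary(replica)

# Normal case: the update arrives practically immediately.
primary.write("stock", 10)
in_sync_normally = primary.data == replica.data

# Partition: the write is still accepted, the systems temporarily diverge.
replica.reachable = False
primary.write("stock", 9)
diverged_during_outage = primary.data != replica.data

# After recovery, the pending updates are delivered and the states converge.
replica.reachable = True
primary.propagate()
converged_after_recovery = primary.data == replica.data
```

In the normal case, an outside observer cannot tell this apart from strong consistency. The difference only becomes visible during the outage, where the primary stays available instead of rejecting the write.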
In this post, we discussed why the “strong consistency is needed for business reasons” requirement is void.
In the next post, we will discuss the actual value of strong ACID-like consistency (Spoiler: Strong consistency has significant value, but it is different from what most people think). Stay tuned … ;)
1. You could argue that any record currently accessed by a user could be locked for update, either by a database or an application lock. While this might look like a feasible solution, usually it is not a practical one. It not only regularly prevents users from getting their work done. It also basically makes it impossible to run any background updates while online users can access the system. In practice this throws you back to something like nightly batch updates while locking out the online users for that period of time if you do not want to implement complicated retry mechanisms for the batch updates. It also leaves you with the problem of forgotten locks that need to be removed (usually manually). And if you tried to integrate such a system into a distributed transaction, the success probability of these transactions would suffer dramatically. Thus, while looking like a good solution in theory, the practical downsides usually outweigh the theoretical advantages by far.
2. Especially in Germany, I often see “eventual consistency” being translated as “eventuelle Konsistenz” which in English means potential consistency. Here people fall prey to the similar-looking and similar-sounding words “eventual” (English) and “eventuell” (German) – which mean very different things.
3. Let us exclude the heretical thought that update delays of a day or longer were – and often still are – perfectly okay with batch updates. Yes, business needs have changed. But even today there are still a lot of places where update delays of seconds, minutes or even hours do not make any difference in terms of business impact. Thus, it is important to distinguish between an actual business need and arbitrary demands without any added business value.
4. Actually, it means a bit more – especially if you need to recover from a network partition failure, trying to reconcile the data. This can be non-trivial and means additional implementation effort you need to invest to avoid running into permanent data inconsistencies. But in a reasonably set up system landscape these situations occur rarely (especially longer lasting network partitions that affect a lot of updates) and most of the time from the outside an eventually consistent system cannot be distinguished from a strongly consistent one.