
CAP theorem explained: interactive visualisations of consistency vs. availability

This post includes interactive visualisations so I suggest viewing in your browser, not email.

Self-taught developers sometimes assume they can pick up the necessary fundamentals on the job. The problem is that the “folk understandings” you absorb from other developers are often wildly inaccurate. The CAP theorem is a great example of this.

The standard explanation goes something like this: “a distributed system can only guarantee two of consistency, availability, and partition tolerance. Pick any two, bro!”

It’s pithy, memorable and wrong in several ways.

Scroll down for some visualisations showing how systems can behave in various circumstances.

What is the CAP theorem?

The CAP theorem, proposed as a conjecture by Eric Brewer in 2000 and formally proved by Gilbert and Lynch in 2002, says that a distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance during a network failure.

The three properties have specific definitions:

  • “Consistency” means every read returns the most recently written value, or an error
  • “Availability” means every request to a live node gets a non-error response
  • “Partition tolerance” means the system continues serving requests even when nodes cannot communicate

First problem: partitions are unavoidable

Networking devices fail. Mice chew through cables. Cosmic rays flip bits in memory at just the wrong time. In a distributed system, there is no way to avoid partitions occurring.

If your system is partition intolerant, it will halt until the partition is healed. That means you have neither consistency nor availability. So partition tolerance – continuing to serve some requests – is the baseline. The real choice is what the system does when a partition occurs.

The “pick two” framing makes it sound like you could blithely choose consistency and availability without partition tolerance. That would mean a system that is perfectly consistent and fully available while its nodes can’t communicate. That is impossible.

Second problem: consistency has a specific meaning

The main problem is that the CAP theorem’s definition of consistency is specifically linearizability: every read must return the value of the most recent write, as if the entire system were a single node with a global clock. It is the strong consistency guarantee used by the CAP proof.

That’s a problem because linearizability is a high bar. Many applications don’t need it. The CAP theorem’s conclusion is that you can’t have both consistency and availability during a partition, but that conclusion only applies to linearizability. Weaker consistency models open up different options.

One of the difficulties with studying distributed systems is that the concepts have complicated names like “linearizability” and fiddly definitions. So let’s have some visualisations to depict how systems actually behave under different consistency models. You’ll see that the models vary in how they trade off consistency guarantees against availability.

Linearizable consistency

Under linearizability, every read must return the latest write. During a partition, a non-leader node can’t know that it has the latest data and therefore it must refuse to respond. Otherwise it might return stale data, which violates the guarantee. The result is correctness at the cost of availability.

The example below assumes Node West is the leader for this account and remains able to accept Alice’s write. In a quorum-based two-node system, Alice’s write would block too, but that detail doesn’t change the point: any node that cannot prove it has the latest state must refuse requests rather than guess.


Node East’s logs show two rejections: a read error and a rejected withdrawal. When the partition heals, both nodes agree on $20. No incorrect data was ever served. The account was temporarily inaccessible, but it was never wrong.

Generally this is what you want for important things like banking.
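The refusal behaviour above can be sketched in a few lines of Python. This is a toy model, not a real database API: the node names, the leader flag, and the single account balance are all illustrative.

```python
# Toy sketch of linearizable behaviour during a partition.
# A non-leader that cannot reach the leader can't prove it has the
# latest write, so it must refuse rather than risk serving stale data.

class Node:
    def __init__(self, name, is_leader=False):
        self.name = name
        self.is_leader = is_leader
        self.value = 100          # Alice's starting balance
        self.partitioned = False  # cut off from the leader?

    def read(self):
        if not self.is_leader and self.partitioned:
            raise RuntimeError(f"{self.name}: unavailable during partition")
        return self.value

west = Node("West", is_leader=True)
east = Node("East")

west.value = 20         # Alice withdraws $80 at the leader
east.partitioned = True

print(west.read())      # the leader still serves the latest value
try:
    east.read()         # East refuses: correctness over availability
except RuntimeError as e:
    print(e)
```

Notice that East never returns the stale $100. Once the partition heals (`east.partitioned = False` and the write is replicated), it can serve reads again.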

Causal consistency

Causal consistency is weaker than linearizability but stronger than eventual consistency. It does not require every node to show the absolute latest value. It does require cause and effect to stay in order. If Bob follows a link or notification that carries a dependency on Alice’s latest edit, the system must not show Bob an older version that makes the cause disappear.

Systems normally track this with some form of metadata such as a version number, timestamp, or token. In the visual, Bob’s request carries “needs v2”, so Node East knows that returning v1 would be wrong.


Bob’s request carries a dependency because he is trying to view Alice’s v2 edit. Node East still only has v1, so returning v1 would violate causality. It has to wait, route the request elsewhere, or return an error. That is less available than eventual consistency.

If Bob wanted to read an unrelated item, Node East could still respond. Whether or not the system is available depends on the specific request made.
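A sketch of the dependency check, again with illustrative names: the replica tracks the highest version it has seen, and a request may carry a “needs vN” dependency.

```python
# Toy sketch of a causal-consistency dependency check.
# Serving a version older than the request's dependency would make
# the cause (Alice's v2 edit) disappear for the reader.

class Replica:
    def __init__(self, name, version, value):
        self.name = name
        self.version = version  # highest version this replica holds
        self.value = value

    def read(self, needs_version=0):
        if self.version < needs_version:
            raise RuntimeError(
                f"{self.name}: only has v{self.version}, "
                f"request needs v{needs_version}")
        return self.value

east = Replica("East", version=1, value="v1 of the document")

print(east.read())              # no dependency: East can still serve
try:
    east.read(needs_version=2)  # Bob's request carries "needs v2"
except RuntimeError as e:
    print(e)                    # East must wait, redirect, or error
```

The same replica is available for one request and unavailable for another, which is exactly the per-request availability described above.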

Eventual consistency

Eventual consistency goes further: reads and writes are both accepted at any node during a partition. When the partition heals, nodes reconcile. You get maximum availability throughout, at the cost of correctness. Whether this is acceptable depends on what the system handles.


In the visualisation, both node logs end with a confirmed purchase. One item but two confirmed orders – problematic! The stock count reconciles to zero just fine, but the system let two people buy the same last item.

Someone’s order has to be cancelled after the fact. That’s probably bad for customer trust. But take a social network. Would it really matter if one of your likes disappeared? Probably not.

Eventual consistency is appropriate where conflicts are either harmless or easy to resolve: DNS propagation, social media like counts or adding items to shopping carts. It is not appropriate for the final commitment when the thing being counted must satisfy constraints (like available stock levels), unless another coordination step prevents the conflict.
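The double-sale above can be sketched as two replicas that each answer from local state during a partition, then merge their order logs afterwards. The shop scenario and names are illustrative.

```python
# Toy sketch of eventual consistency: during a partition each replica
# accepts writes based only on its local view, so both can "sell"
# the same last item.

class Replica:
    def __init__(self, name, stock=1):
        self.name = name
        self.stock = stock
        self.orders = []

    def buy(self, customer):
        if self.stock > 0:
            self.stock -= 1
            self.orders.append(customer)
            return "confirmed"
        return "out of stock"

west, east = Replica("West"), Replica("East")

print(west.buy("Alice"))   # confirmed -- West sees stock == 1
print(east.buy("Bob"))     # confirmed -- so does East

# Partition heals: merging the logs reveals the conflict.
all_orders = west.orders + east.orders
print(all_orders)          # two confirmed orders for one item
```

Both replicas stayed fully available throughout, and the conflict only surfaces at reconciliation time, when someone has to decide whose order to cancel.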

How to really think about CAP

The correct way to think about the CAP theorem is not “two outta three ain’t bad” but “what consistency model does my system require, and how can I maximise availability with that model?”

Weaker guarantees buy more availability:

  • Linearizability protects invariants, at the cost of refusing requests during a partition.
  • Causal consistency protects cause-and-effect ordering – if you correctly set up dependency tracking (not trivial!) – but does not preserve a single, global, latest state.
  • Eventual consistency gives you maximum availability but leaves conflict resolution as your problem.

When a system claims to be “consistent” in conversation, it could mean any point on this spectrum. The distributed systems chapter goes further: how linearizability is implemented through consensus protocols like Raft, how weaker consistency models work in production systems like Cassandra, and what “split brain” looks like when an AP system allows conflicting writes to isolated nodes. It is one of the nastier failure modes in distributed computing.