Trading consistency for availability (Week 9, Friday—March 14, 2014)

Emphasizing availability when partitions occur

Source: CAP Twelve Years Later

Figure: CAP Venn diagram with the A and P subset highlighted

So long as there are no network partitions (most of the time), a system can be strongly consistent and available:

It won’t come free—there will still be latency costs to sustaining the illusion of a single system

Timeouts define “partitions” for a client

Every service request has a timeout

The partition decision: How your code responds to timed-out requests determines whether your system is strongly consistent or available:

  1. Keep retrying (perhaps with longer timeouts) until requests start working (give up availability but remain strongly consistent); or

  2. Enter partition mode and continue serving the user (give up strong consistency but remain available)
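The two responses to a timed-out request can be sketched roughly as follows. This is a minimal illustration, not a production client: `RequestTimeout`, `make_flaky_service`, `call_consistent`, and `call_available` are hypothetical names invented here, and the "service" is simulated in-process rather than reached over a network.

```python
from typing import Callable, Tuple


class RequestTimeout(Exception):
    """Raised by a (hypothetical) service call that did not answer in time."""


def make_flaky_service(failures: int) -> Callable[[float], str]:
    """Simulated service: times out `failures` times, then answers."""
    remaining = [failures]

    def request(timeout: float) -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RequestTimeout()
        return "fresh"

    return request


def call_consistent(request: Callable[[float], str],
                    timeout: float = 0.1,
                    backoff: float = 2.0,
                    max_timeout: float = 5.0) -> str:
    """Choice 1: keep retrying, with longer timeouts, until the call works.

    The caller blocks for as long as the partition lasts, so availability
    is given up, but the answer always comes from the service itself.
    """
    while True:
        try:
            return request(timeout)
        except RequestTimeout:
            timeout = min(timeout * backoff, max_timeout)


def call_available(request: Callable[[float], str],
                   timeout: float,
                   local_value: str) -> Tuple[str, bool]:
    """Choice 2: on timeout, enter partition mode and answer locally.

    Returns (value, in_partition_mode). The local value may be stale,
    so strong consistency is given up, but the user is still served.
    """
    try:
        return request(timeout), False
    except RequestTimeout:
        return local_value, True
```

For example, `call_available(make_flaky_service(1), 0.1, "stale")` answers from local state and reports partition mode, while `call_consistent(make_flaky_service(3))` keeps retrying until the simulated service responds.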

Each client decides for itself whether it is partitioned, so one client can see a partition while another does not.

How long should the timeout be?

Relaxing consistency

If we abandon the requirement that all users see the same order of updates (strong consistency), what do we gain?

“Eventual consistency” is an ambiguous promise: If you stop updating the system, and wait “long enough”, the system will converge on “some value”—every user will eventually see the same value.

That is the best guarantee you can offer after a partition
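To make "converge on some value" concrete, here is a small sketch of replicas that diverged during a partition and then reconcile once updates stop. It assumes a last-writer-wins rule with the replica id as a tie-breaker. That rule is one common choice picked here for illustration, not something these notes prescribe, and the names `Versioned`, `merge`, `exchange`, and `converge` are hypothetical.

```python
from typing import List, Tuple

# (logical timestamp, replica id used as tie-breaker, value)
Versioned = Tuple[int, str, str]


def merge(a: Versioned, b: Versioned) -> Versioned:
    """Last-writer-wins: keep the entry with the larger (timestamp, replica id)."""
    return max(a, b)


def exchange(replicas: List[Versioned], i: int, j: int) -> None:
    """Replicas i and j compare state and both keep the winner."""
    winner = merge(replicas[i], replicas[j])
    replicas[i] = winner
    replicas[j] = winner


def converge(replicas: List[Versioned]) -> int:
    """Exchange state between neighbours until every replica agrees.

    Returns the number of rounds needed. No new updates arrive while this
    runs, which is the "stop updating and wait long enough" condition.
    """
    rounds = 0
    while len(set(replicas)) > 1:
        for i in range(len(replicas) - 1):
            exchange(replicas, i, i + 1)
        rounds += 1
    return rounds


# Three replicas that accepted different writes while partitioned.
replicas = [(3, "a", "x"), (5, "b", "y"), (5, "c", "z")]
converge(replicas)
```

After `converge` returns, every replica holds the same entry, but the write of `"y"` has been silently discarded in favour of `"z"` purely because of the tie-breaker. That is the sense in which the promise is ambiguous: the replicas agree, yet nothing says the value they agree on is the one any particular user expected.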

Maintaining business rules and data structure invariants

An application typically has business rules it must maintain

These imply invariants across your data structures

Relational database systems offer key constraints to ensure some invariants

You can’t guarantee the invariants if you keep running while partitioned

But in many cases you have compensation strategies

When designing for partitioned operations, which invariant violations can you compensate for, and which invariants must you never violate?
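One way to think about that question is to split invariants into two groups in code. The sketch below is loosely modelled on the cash machine scenario often used to illustrate compensation. It treats "balance never goes negative" as a compensable invariant and "cash handed out while partitioned never exceeds a fixed limit" as one that must never be violated. All names and amounts (`Atm`, `reconcile`, `PARTITION_WITHDRAWAL_LIMIT`, `OVERDRAFT_FEE`) are assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List

PARTITION_WITHDRAWAL_LIMIT = 200  # hypothetical cap on risk taken while partitioned
OVERDRAFT_FEE = 25                # hypothetical compensation charged afterwards


@dataclass
class Atm:
    """A cash machine that keeps serving users while cut off from the bank."""
    withdrawn_in_partition: int = 0
    log: List[int] = field(default_factory=list)

    def withdraw(self, amount: int) -> bool:
        # Invariant we never violate, even when partitioned: the total
        # handed out without contacting the bank stays within the limit.
        if self.withdrawn_in_partition + amount > PARTITION_WITHDRAWAL_LIMIT:
            return False
        # Invariant we allow to be violated: the balance may go negative,
        # because the machine cannot check it. Log the operation so that
        # it can be replayed and compensated after the partition heals.
        self.withdrawn_in_partition += amount
        self.log.append(amount)
        return True


def reconcile(true_balance: int, atm: Atm) -> int:
    """After the partition heals, replay the log and compensate if needed."""
    balance = true_balance - sum(atm.log)
    if balance < 0:
        # Compensation: the violation already happened, so charge a fee
        # instead of pretending the withdrawal can be undone.
        balance -= OVERDRAFT_FEE
    atm.log.clear()
    atm.withdrawn_in_partition = 0
    return balance
```

With a true balance of 100, a partitioned withdrawal of 150 succeeds and a further one of 100 is refused by the limit. Reconciling afterwards yields a balance of -75: the overdraft of 50 plus the fee.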

Guide to reading for next class

Read Distributed systems for fun and profit, Chap. 5: up to but not including “Replica synchronization: gossip and Merkle trees”.

Important points: “Eventual consistency with probabilistic guarantees”—this is the normal definition of eventual consistency.

Points not important to this course: “eventual consistency with strong guarantees” (CRDTs and CALM)—this remains a research topic, with few to no applications built using this concept.

Important section: Reconciling different operation orders