CRDTs
A conflict-free replicated data type is a data structure designed so that concurrent updates on different replicas always merge into the same result, no matter what order the updates arrive in or how many times they are redelivered. The trick is algebraic, not procedural: if the merge operation is associative, commutative, and idempotent, convergence is a theorem, not a hope. CRDTs trade away the ability to enforce global invariants in exchange for never needing a round of coordination to resolve a conflict.
The algebra: a join-semilattice
A state-based CRDT (a CvRDT) is built on a join-semilattice: a partial order with a least-upper-bound (the join, written ⊔) defined for every pair of states. The merge between two replicas is that join. Because a semilattice join is:
- Associative —
(a ⊔ b) ⊔ c = a ⊔ (b ⊔ c), so grouping does not matter. - Commutative —
a ⊔ b = b ⊔ a, so order does not matter. - Idempotent —
a ⊔ a = a, so redelivering the same state is harmless.
…it does not matter which order replicas exchange states, how many times a message is duplicated, or how it is batched. As long as every update eventually reaches every replica, all replicas converge to the same value. This property is called strong eventual consistency: replicas that have seen the same set of updates have identical state, with no conflict-resolution round.
Two flavours: state-based and op-based
| Kind | What ships | Network requirement |
|---|---|---|
| State-based (CvRDT) | The full (or delta) state; merged with join on receipt | Only needs eventual delivery; duplicates and reordering are fine |
| Op-based (CmRDT) | Each operation, applied at every replica | Needs exactly-once, causally-ordered delivery of ops |
State-based CRDTs are robust on lossy networks because the merge absorbs duplicates, but shipping whole states is expensive — delta-CRDTs send just the changed part to fix that. Op-based CRDTs send small operations but push the hard delivery guarantees onto the messaging layer. The two are mathematically equivalent in expressiveness; the choice is an engineering trade-off about what your transport can promise.
Counters
The simplest CRDTs. A G-Counter (grow-only) is a vector of per-replica counts; each replica only increments its own entry. The value is the sum; the merge takes the element-wise maximum. Because each replica owns its slot, two concurrent increments never collide.
A PN-Counter supports decrements by pairing two G-Counters — one for increments, one for decrements — and reporting their difference. This is how you build a distributed "like count" or inventory tally that never loses an increment to a concurrent update.
Sets — and why LWW is not enough
A grow-only set (G-Set) merges by union and can never remove an element. Removal is where naive designs break. A 2P-Set adds a tombstone set, but once removed an element can never be re-added.
The workhorse is the OR-Set (observed-remove set). Each add tags the element with a unique token; a remove only deletes the tokens it has actually observed. So an add concurrent with a remove on the same element wins — the add's fresh token was never observed by the remove. This matches user intuition: if you add an item to a shopping cart at the same moment a teammate clears it, your item survives.
Sequences — collaborative text
Editing shared text is the hardest CRDT problem: concurrent inserts at the same position must converge to one agreed order without a coordinator. Sequence CRDTs (RGA, Treedoc, LSEQ, Logoot, and Yjs's YATA) give every character a unique, densely-orderable identifier so that any two inserts have a deterministic relative order on every replica.
This is the technology under collaborative editors. Automerge and Yjs are the popular open-source libraries; products like Figma, Linear, and various local-first apps use CRDT or CRDT-like merge to let many people edit offline and reconcile cleanly when they reconnect.
Limits — what CRDTs cannot do
- No global invariants. CRDTs cannot enforce "balance ≥ 0" or "at most one booking for this seat" across replicas, because that needs coordination — exactly what CRDTs avoid. Use consensus for invariants; use CRDTs for mergeable state.
- Metadata and tombstones grow. OR-Set tokens and removed-element tombstones accumulate. Garbage-collecting them safely (without resurrecting deleted data) is the recurring practical headache.
- Merge semantics may surprise users. "Add wins over concurrent remove" is a deliberate choice; a different app might want remove-wins. The type encodes a policy, and you must pick the one your product actually needs.
Further reading
- Shapiro, Preguiça, Baquero, Zawirski — A Comprehensive Study of Convergent and Commutative Replicated Data Types (2011) — the foundational CRDT paper with the full catalogue.
- Kleppmann et al. — CRDTs and local-first software — modern applications and the local-first movement.
- Automerge — a production JSON CRDT library; the README is a readable primer.
- crdt.tech — a curated index of CRDT papers, implementations, and talks.