ELI5 · Distributed systems

Split-brain.

Two captains both think they’re in charge after the radio cuts out.

Picture a ship with one captain and a crew that takes orders. Now the intercom dies and the ship is cut in two, with neither half able to reach the other. Each half, fearing it has lost its captain, appoints a new one. Now there are two captains giving conflicting orders.

Split-brain is that failure in a cluster. A network partition severs the nodes into groups that cannot talk; each group, unable to see the leader, may elect its own. Both halves accept changes, and the data quietly diverges into two conflicting versions.

  1. 1

    A healthy cluster: one leader, the rest follow its lead.

  2. Hello? Anyone?
    no contact
    2

    The network splits — the two halves can no longer reach each other.

  3. I’m in charge now.
    two captains
    3

    Each half assumes the leader is gone and elects its own. Two captains.

  4. x = 1 x = 2 drifting apart
    4

    Both accept writes, so the two sides drift into conflicting data.

  5. x = 1 x = 2 ? conflict to untangle
    5

    When the link heals, whose version is right? Now there’s a conflict to untangle.

  6. 3 — leads 2 — stands down
    6

    The fix: require a majority (quorum) to lead, so only one side ever can.

A partition makes two leaders; a majority rule makes sure it can’t.

Why a majority rule works

If a cluster insists that a leader must be backed by more than half the nodes, then a partition can have at most one side with a majority. The minority side knows it is outnumbered and steps down rather than electing a rival leader. With only one side able to accept writes, the data cannot fork. This is why clusters are often sized to an odd number — it guarantees a clear majority.

The cost of safety

Refusing to act without a majority means the minority side goes read-only or offline during a partition, even though its machines are perfectly healthy. That is the CAP trade-off in the flesh: when the network splits, you choose consistency (one truth, but part of the system stops accepting writes) over availability (everyone keeps writing, but the data forks). Split-brain is what happens when a system reaches for availability without a plan to reconcile.

The real version Split-brain simulator →
Found this useful?