Should you use it?

Should you shard your database?

Not until one machine genuinely cannot keep up. Exhaust read replicas, indexes, caching, and a bigger box first. Sharding is the heaviest, most irreversible tool in the drawer.

Sharding solves exactly one problem

Sharding splits one logical database into many physical ones, each holding a slice of the rows, so no single machine has to hold all the data or absorb all the writes. It is the heaviest, most irreversible infrastructure work most teams ever do. The right time to reach for it is "we have run out of every cheaper option," not "we expect to grow."

The thing to internalise: sharding fixes a single primary node that cannot keep up with writes or cannot hold the dataset. That is the whole job. If that is not your problem, sharding will not help, and it will hand you a pile of new ones. So the first task is being precise about what is actually maxed out.

When sharding is genuinely time

The honest signal is narrow. A single write primary saturated on throughput, or a dataset that no longer fits on the largest node you can buy — and a growth curve confirming one node will never be enough. You have already added replicas, indexes, and a cache, and bought a bigger box, and writes are still the wall. At that point sharding stops being premature and starts being necessary.

You also want a shard key that spreads both data and traffic evenly, where most queries can be answered from one shard. If your access pattern gives you that, sharding scales close to linearly. If it does not, fix the cheaper layers and wait.

When to exhaust the cheap runway first

Most "the database is slow" problems are reads, and reads have easy answers. Add a couple of indexes — a missing one is the difference between a sequential scan and a millisecond lookup. Put a cache in front of the hot queries. Add read replicas so reads spread across copies while the primary handles writes. That combination alone carries an enormous number of products their entire life.

Then the boring, underrated move: buy a bigger machine. Vertical scaling has a ceiling, but it is high — modern instances offer dozens of cores and hundreds of gigabytes of memory, and doubling your box is an afternoon versus a quarter of engineering. Skip sharding, too, when your queries constantly span all the data: global counts and cross-entity joins get slower and harder once the rows are scattered across shards.

What sharding actually costs: the shard key

Once you commit, everything hinges on the shard key — the field that decides which shard a row lives on. The failure mode is the hot shard. Shard by customer when one enterprise customer drives 40% of your traffic, and that customer's shard melts while the others sit idle. You scaled horizontally and still have a single overloaded node, only now it is harder to reason about.

The other failure is the fan-out query. If the key does not match how you read the data, common queries hit every shard and merge the results — slower and more fragile than the single box you started with. The cruelest part is that the key is effectively permanent. Change it once billions of rows are placed and you are resharding: physically moving data across shards while the system stays live, without dropping writes. Teams call that a multi-month project for a reason.

The trap: sharding early to be safe

"Shard early so we never have to scramble" is exactly backwards. Sharding early means choosing the key when you understand your access patterns least, which is precisely when you are most likely to choose wrong. The resharding pain is really the cost of a bad key, paid much later when it is far more expensive to fix.

So the runway a bigger instance buys is almost always cheaper than the engineer-months sharding costs, and the extra time teaches you what your real query mix and skew look like. The longer you can responsibly wait, the better your key will be when you finally pick it.

Sharding vs read replicas (and a bigger box)

These solve different walls, and people conflate them. Read replicas scale reads: copies of the data serve queries while the primary takes writes. A bigger box scales everything vertically until you hit the hardware ceiling. Both are cheap, reversible, and an afternoon of work. Sharding is the only one that scales writes past a single primary — and it is expensive, near-permanent, and a real distributed system to operate.

So the order is fixed. Replicas and a cache for read load, a bigger instance for headroom, and sharding only when writes specifically saturate the largest single primary you can buy. If you cannot point at saturated writes, you want one of the first three, not the last one.

How to shard without regret

Pick a key that matches your dominant access pattern and spreads load, and validate it against your real query mix and your skew before committing, not after. The validation is the work; the migration is the easy part by comparison.

Lean on managed machinery rather than hand-rolling routing — a database that shards for you, or a layer like Vitess — because the operational surface is enormous. And migrate incrementally: move one slice, prove it, then continue, rather than flipping everything at once. Sharding is sometimes the right answer. It is just almost never the early one.

Quick reference

When it fits, when it doesn't

Reach for it when

  • A single primary is maxed on disk, memory, or write throughput and vertical scaling has run out.
  • You have a shard key that spreads both data and traffic evenly with no obvious hotspots.
  • Most queries can be answered from one shard, so you rarely need to fan out across all of them.
  • The dataset is large enough that no single node can hold it.

Skip it when

  • Reads are the bottleneck — add read replicas and a cache before splitting writes.
  • A bigger instance or better indexes would buy you another year; that runway is usually cheap.
  • Your queries constantly span all the data (global counts, cross-entity joins) — sharding makes those slower and harder.

Common mistakes

  • Choosing a shard key that creates a hot shard — the celebrity account or busiest region swamps one node.
  • Picking a key you can never change, then discovering resharding live is a months-long project.
  • Sharding before trying replicas, indexes, and caching, then owning a distributed system you did not need.
Settle an argument?