Should you add a cache?

Yes, when the same expensive answer is read far more than it changes — but only after you can measure the slow path. Caching trades a little staleness for a lot of speed, and invalidation is the bill.

A cache is a bet on your read/write ratio

Caching is a bet: that the same answer will be read many more times than it changes, so it is worth keeping a fast copy around instead of recomputing it. When the bet holds — a product page read ten thousand times an hour but edited once a day — caching is one of the biggest wins in systems work. When it does not, you have added complexity and a class of bugs for nothing.

So the question is never "should I cache?" in the abstract. It is "for this data, how does read frequency compare to write frequency, and how slow is the uncached path?" If reads dwarf writes and the uncached path is genuinely expensive, cache it. If writes are frequent or the path is already fast, do not.
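
One rough way to put numbers on that question is a back-of-envelope model of the average read cost. The sketch below is an illustration, not a benchmark: the function names and latency figures are assumptions, and it assumes every write invalidates the entry so that each write costs at most one miss.

```python
def estimated_hit_rate(reads: float, writes: float) -> float:
    """Rough upper bound: each write invalidates the entry, costing one miss."""
    if reads <= 0:
        return 0.0
    misses = min(reads, writes)
    return 1.0 - misses / reads


def expected_read_ms(hit_rate: float, cache_ms: float, origin_ms: float) -> float:
    """Average latency per read: every read pays cache_ms, misses also pay origin_ms."""
    return cache_ms + (1.0 - hit_rate) * origin_ms


# Assumed figures: a 0.5 ms cache lookup in front of an 80 ms uncached path.
# A page read ten thousand times an hour (240,000 a day) and edited once a day:
read_heavy = estimated_hit_rate(reads=240_000, writes=1)
# Data written four times for every read:
write_heavy = estimated_hit_rate(reads=100, writes=400)

print(expected_read_ms(read_heavy, cache_ms=0.5, origin_ms=80.0))   # a little over 0.5
print(expected_read_ms(write_heavy, cache_ms=0.5, origin_ms=80.0))  # 80.5, worse than no cache
```

When the hit rate collapses, the cache lookup is pure overhead on top of the uncached path, which is the "complexity for nothing" case.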

When a cache is worth it

The sweet spot is read-heavy, slow-to-compute, shared data. A query that takes tens of milliseconds, or an aggregation that takes hundreds, collapses to a sub-millisecond read from an in-memory store like Redis. Multiply that across every request and you have cut latency and database load at once. The load reduction is often the bigger prize: the cache absorbs the repetitive reads so the database has headroom for the work only it can do.
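
The read path described above can be sketched in a few lines. This is a minimal illustration in which a plain dictionary stands in for an in-memory store such as Redis; the class name, the loader function, and the key are all hypothetical, and there is no expiry or invalidation yet.

```python
from typing import Callable, Dict


class ReadCache:
    """Check the cache first; fall back to the slow loader only on a miss."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader               # the slow path, e.g. a database query
        self._store: Dict[str, str] = {}    # stand-in for an in-memory store
        self.origin_calls = 0               # how often the slow path actually ran

    def get(self, key: str) -> str:
        if key in self._store:
            return self._store[key]         # hit: no origin work at all
        self.origin_calls += 1
        value = self._loader(key)           # miss: pay the slow path once
        self._store[key] = value
        return value


def slow_product_lookup(product_id: str) -> str:
    # Hypothetical stand-in for an expensive query or aggregation.
    return f"product page for {product_id}"


cache = ReadCache(slow_product_lookup)
pages = [cache.get("sku-1") for _ in range(1000)]
print(cache.origin_calls)  # prints 1
```

A thousand reads reach the origin once; the other 999 are the load the database no longer sees.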

That same absorption saves you under a spike. When a popular item suddenly gets a flood of traffic, a cache turns thousands of identical queries into one query and thousands of cache hits, and the origin barely notices. Public, shared, expensive-to-produce values are where a cache earns its keep cleanly.

When to skip it and just fix the query

Profile before you cache. Caching a path that was never the bottleneck speeds up the wrong thing while the real one keeps hurting, now hidden behind a layer that makes it harder to see. Often the honest fix is cheaper: a missing index turns a sequential scan into a millisecond lookup, and now the uncached path is fast enough that a cache buys you nothing but invalidation bugs.
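
To see what "a missing index" looks like in practice, the sketch below asks a database for its query plan before and after adding one. It uses SQLite through Python's standard library purely for convenience; the table, column, and index names are hypothetical, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 500, float(i)) for i in range(5000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's plan description for QUERY as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)   # expected to report a full scan of the table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn)    # expected to report a search using the new index

print(plan_before)
print(plan_after)
```

If the plan changes from a scan to an indexed search and the measured latency drops with it, the query may no longer need a cache at all.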

Skip caching, too, when writes dominate reads — entries go stale or get evicted before anyone benefits — and when the data must be exact on every read. A balance or the last unit of stock cannot tolerate a staleness window, so a cache there is a correctness bug waiting to happen.

What a cache actually costs: invalidation

The old joke is that the two hard problems in computing are naming things, cache invalidation, and off-by-one errors. For caching, invalidation is the one that bites. The instant you keep a copy, you own keeping it honest: when the underlying data changes, the copy is wrong, and you have to decide what to do about it.

The blunt tool is a TTL — let entries expire after, say, sixty seconds, and accept that users may see stale data for up to a minute. Fine for a view count, unacceptable for a price. The precise tool is explicit invalidation: when you write the data, you delete or update the cached copy. That is more correct and more code, and it is easy to miss one of the several places that write the data, leaving a stale entry lingering with no expiry.
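
Both tools fit in one small sketch. The code below is a minimal in-process illustration, not a production cache: the class, the `prices` dictionary standing in for the database, and the function names are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None on a miss (so None itself is not cacheable)."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        self._entries.pop(key, None)


prices: Dict[str, float] = {"sku-1": 9.99}   # stand-in for the database
price_cache = TTLCache(ttl_seconds=60)


def read_price(sku: str) -> float:
    cached = price_cache.get(sku)
    if cached is not None:
        return cached
    value = prices[sku]
    price_cache.set(sku, value)
    return value


def write_price(sku: str, new_price: float) -> None:
    prices[sku] = new_price        # update the source of truth first
    price_cache.invalidate(sku)    # then drop the copy so the next read reloads it
```

The explicit path is only as good as its coverage: any code that changes `prices` without going through `write_price` leaves the old value in place until the TTL removes it.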

The traps: stale reads, leaks, and the stampede

The first is having no invalidation plan, so users see data long after it changed and you learn it from a support ticket. The second is keying a cache wrong — caching per-user data under a shared key, so one person's dashboard gets served to the next person. That is a security incident, not a performance bug, and it is easy to do by accident.
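
One way to make the keying mistake harder to write is to build every key through a helper that forces the caller to say whose data it is. The key layout and function names below are hypothetical, and the sketch assumes identifiers never contain a colon.

```python
from typing import Optional


def cache_key(resource: str, resource_id: str, user_id: Optional[str] = None) -> str:
    """Build a namespaced key; per-user data carries the user id in the key."""
    if user_id is None:
        return f"public:{resource}:{resource_id}"
    return f"user:{user_id}:{resource}:{resource_id}"


def dashboard_key(user_id: str) -> str:
    # user_id is a required argument, so a shared dashboard key cannot be built by omission.
    if not user_id:
        raise ValueError("per-user cache key needs a non-empty user_id")
    return cache_key("dashboard", "main", user_id=user_id)


print(dashboard_key("alice"))    # user:alice:dashboard:main
print(dashboard_key("bob"))      # user:bob:dashboard:main
print(cache_key("product", "42"))  # public:product:42
```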

The third is the thundering herd. A popular entry expires, every client that wanted it misses in the same instant, and they all stampede the origin to recompute the same value. The database, idle a moment ago, takes a thousand identical queries in one millisecond and falls over — the exact failure the cache was meant to prevent. The fixes are known: stagger expiries with jitter, let one request recompute while others serve slightly stale, or refresh hot keys in the background before they expire. You just have to plan for it.
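
Two of those fixes can be sketched briefly. The code below shows a jittered TTL and a simplified form of "one request recomputes": here the other callers wait for the result rather than serving a stale copy, which is a deliberate simplification. All names are hypothetical, and the sketch covers threads in a single process only.

```python
import random
import threading
import time
from typing import Callable, Dict


def jittered_ttl(base_seconds: float, spread: float = 0.1) -> float:
    """Vary the TTL by +/- spread so hot keys do not all expire in the same instant."""
    return base_seconds * random.uniform(1.0 - spread, 1.0 + spread)


class SingleFlight:
    """Let one caller recompute a missing key while concurrent callers wait for it."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader
        self._values: Dict[str, str] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()   # protects _locks and the loads counter
        self.loads = 0

    def get(self, key: str) -> str:
        value = self._values.get(key)
        if value is not None:
            return value
        with self._guard:                # one lock object per key
            lock = self._locks.setdefault(key, threading.Lock())
        with lock:
            value = self._values.get(key)   # re-check: another thread may have filled it
            if value is None:
                with self._guard:
                    self.loads += 1
                value = self._loader(key)   # only the first caller reaches the origin
                self._values[key] = value
            return value


def slow_loader(key: str) -> str:
    time.sleep(0.05)                     # pretend this is the expensive origin query
    return f"value for {key}"


flight = SingleFlight(slow_loader)
threads = [threading.Thread(target=flight.get, args=("hot",)) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(flight.loads)  # prints 1
```

Across several processes or machines the same idea needs a shared lock or a background refresher, which this sketch does not attempt.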

A cache vs just making the query fast

A cache hides a slow operation behind a fast copy; fixing the query removes the slowness at the source. The fix is almost always the better first move when it is available — an index, a smarter query, a denormalised column — because it has no staleness window, no invalidation code, and no new failure modes. There is nothing to keep honest.

A cache wins when the work is genuinely irreducible and its result is read far more than it changes: an expensive aggregation, a third-party API call, a computation that is slow no matter how you write it. Then the copy is the only way to make it fast. So the order is: make it fast at the source first, and cache what is still expensive after that.

How to add a cache without regret

Wait until you can point at a measured slow path whose result is read far more often than it changes. Start with the simplest thing that works: a TTL long enough to help, short enough that the staleness is tolerable for that specific data. Live with it, watch your hit rate, and reach for explicit invalidation only when the staleness actually starts to matter.
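
Watching the hit rate needs nothing more than two counters. The sketch below is one minimal way to keep them; the class name is hypothetical, and in a real service these numbers would normally go to whatever metrics system is already in use.

```python
from dataclasses import dataclass


@dataclass
class HitRate:
    hits: int = 0
    misses: int = 0

    def record(self, hit: bool) -> None:
        """Call once per cache lookup with whether it was a hit."""
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    @property
    def ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


stats = HitRate()
for was_hit in [True] * 9 + [False]:
    stats.record(was_hit)
print(stats.ratio)  # prints 0.9
```

A ratio that stays low is a sign the read/write bet is not paying off for that data.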

Be deliberate about what is cacheable. Public, shared, slow-to-compute data is the safe zone. Private, per-user, must-be-exact data is the danger zone — cache it only with careful keying and clear eyes about the staleness you are accepting. A cache is a targeted bet, not a layer you spread over everything.

Quick reference

When it fits, when it doesn't

Reach for it when

  • The same data is read many times for each time it changes — a read-heavy workload.
  • Computing or fetching the answer is genuinely slow, and you have measured it.
  • A short window of staleness is acceptable for that data.
  • A traffic spike would otherwise hammer the database with identical queries.

Skip it when

  • The data must be exact on every read — balances, remaining stock of the last item.
  • Writes dominate reads, so entries are stale or evicted before anyone benefits.
  • You have not profiled anything yet; caching the wrong thing hides the real bottleneck and adds bugs.

Common mistakes

  • No invalidation plan, so users see stale data long after it changed.
  • Caching per-user data with a shared key, leaking one person’s view to another.
  • A thundering herd on expiry — every client misses at once and stampedes the origin together.