Should you add a cache?

Q: Should you add a cache?

Yes, when the same expensive answer is read far more than it changes, but only after you can measure the slow path. Caching trades a little staleness for a lot of speed, and invalidation is the bill.

A cache is a bet on your read/write ratio

Caching is a bet: that the same answer will be read many more times than it changes, so it is worth keeping a fast copy around instead of recomputing it. When the bet holds — a product page read ten thousand times an hour but edited once a day — caching is one of the biggest wins in systems work. When it does not, you have added complexity and a class of bugs for nothing.

So the question is never "should I cache?" in the abstract. It is "for this data, how does read frequency compare to write frequency, and how slow is the uncached path?" If reads dwarf writes and the uncached path is expensive, cache it. If writes are frequent or the path is already fast, do not.

When a cache is worth it

The sweet spot is read-heavy, slow-to-compute, shared data. A query that takes tens of milliseconds, or an aggregation that takes hundreds, collapses to a sub-millisecond read from an in-memory store like Redis. Multiply that across every request and you have cut latency and database load at once. The load reduction is often the bigger prize: the cache absorbs the repetitive reads so the database has headroom for the work only it can do.
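To make the read path concrete, here is a minimal sketch of a look-aside read in Python. It uses a plain dict as a stand-in for an in-memory store like Redis, and `ReadThroughCache` and `render_product_page` are hypothetical names invented for illustration. Entries never expire here, so this shows only the speed side of the trade, not the staleness side.

```python
from typing import Any, Callable, Dict, Hashable


class ReadThroughCache:
    """Check the fast copy first; fall back to the slow loader on a miss."""

    def __init__(self, load: Callable[[Hashable], Any]) -> None:
        self._load = load                      # the slow, uncached path
        self._store: Dict[Hashable, Any] = {}  # stand-in for Redis or similar
        self.origin_calls = 0                  # how often the slow path ran

    def get(self, key: Hashable) -> Any:
        if key in self._store:
            return self._store[key]            # hit: no origin work at all
        self.origin_calls += 1                 # miss: pay for the slow path once
        value = self._load(key)
        self._store[key] = value
        return value


def render_product_page(product_id: Hashable) -> str:
    # Hypothetical expensive query or aggregation.
    return f"<h1>Product {product_id}</h1>"


pages = ReadThroughCache(render_product_page)
for _ in range(10_000):
    pages.get(42)
print(pages.origin_calls)  # 1
```

Ten thousand reads of the same key reach the origin once. That single counter is the load reduction described above.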

That same absorption saves you under a spike. When a popular item suddenly gets a flood of traffic, a cache turns thousands of identical queries into one query and thousands of cache hits, and the origin barely notices. Public, shared, expensive-to-produce values are where a cache earns its keep cleanly.

When to skip it and just fix the query

Profile before you cache. Caching a path that was never the bottleneck speeds up the wrong thing while the real one keeps hurting, now hidden behind a layer that makes it harder to see. Often the honest fix is cheaper: a missing index turns a sequential scan into a millisecond lookup, and now the uncached path is fast enough that a cache buys you nothing but invalidation bugs.
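One way to check whether an index is the real fix is to ask the database for its plan before and after adding one. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The `orders` table, its columns, and the index name are made up for illustration, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's plan for a query as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The fourth column of each plan row is the human-readable detail.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 500, i * 1.5) for i in range(10_000)],
)

sql = "SELECT total FROM orders WHERE customer_id = ?"

plan_before = query_plan(conn, sql, (42,))  # a full scan of the table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, sql, (42,))   # a search that uses the new index

print(plan_before)
print(plan_after)
```

If the plan changes from a scan to an index search and the query is now fast enough, there may be nothing left worth caching.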

Skip caching, too, when writes dominate reads, because entries go stale or get evicted before anyone benefits, and when the data must be exact on every read. An account balance or the last unit of stock cannot tolerate a staleness window, so a cache there is a correctness bug waiting to happen.

What a cache actually costs: invalidation

The old joke is that the two hard problems in computing are naming things, cache invalidation, and off-by-one errors. For caching, invalidation is the one that bites. The instant you keep a copy, you own keeping it honest: when the underlying data changes, the copy is wrong, and you have to decide what to do.

The blunt tool is a TTL — let entries expire after, say, sixty seconds, and accept that users may see stale data for up to a minute. Fine for a view count, unacceptable for a price. The precise tool is explicit invalidation: when you write the data, you delete or update the cached copy. That is more correct and more code, and it is easy to miss one of the several places that write the data, leaving a stale entry lingering with no expiry.
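Both tools fit in a few lines. The following is a sketch, not a production cache: `TTLCache`, `read_price`, and `write_price` are hypothetical names, the `prices` dict stands in for the database, and the clock is injectable only so that expiry can be tested without waiting.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Entries expire after ttl_seconds; invalidate() removes one immediately."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get(self, key: Hashable, load: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]                    # still fresh enough: serve the copy
        value = load(key)                      # missing or expired: reload
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        self._entries.pop(key, None)           # no error if the key was never cached


prices = {"sku-1": 100}                        # stand-in for the database
price_cache = TTLCache(ttl_seconds=60.0)


def read_price(sku: str) -> int:
    return price_cache.get(sku, prices.__getitem__)


def write_price(sku: str, new_price: int) -> None:
    prices[sku] = new_price
    price_cache.invalidate(sku)                # every write path must remember this
```

The TTL bounds how long a forgotten invalidation can hurt. The `invalidate` call removes the staleness window only for the write paths that actually make it.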

The money side: RAM against the database

A cache is RAM, and RAM is roughly an order of magnitude pricier per gigabyte than the disk under your database. The economics only work because of skew: a small hot set absorbs a huge share of the reads, so you keep a sliver of the data in expensive memory and let it answer most of the traffic. Hit rate is the whole ledger — every hit is a read your database did not have to serve, and the cost per avoided read is the cache bill divided by the hits.
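The ledger is simple arithmetic. The sketch below writes it out with made-up numbers. The bill, read volume, and hit rate are illustrative assumptions, not measurements, and the function names are hypothetical.

```python
def reads_left_for_database(total_reads: int, hit_rate: float) -> int:
    """Reads that miss the cache and still reach the database."""
    return round(total_reads * (1.0 - hit_rate))


def cost_per_avoided_read(cache_bill: float, hits: int) -> float:
    """Cache bill divided by the reads the database did not have to serve."""
    if hits <= 0:
        raise ValueError("no hits: the cache avoided no reads")
    return cache_bill / hits


# Illustrative numbers only.
monthly_bill = 200.0             # assumed cost of the cache node per month
monthly_reads = 1_000_000_000    # assumed total reads per month
hit_rate = 0.95                  # assumed share of reads answered by the cache

db_reads = reads_left_for_database(monthly_reads, hit_rate)
hits = monthly_reads - db_reads
print(db_reads)                                   # 50000000
print(cost_per_avoided_read(monthly_bill, hits))  # about 2.1e-07 per avoided read
```

Rerun the same arithmetic with a 50% hit rate and the database load goes up tenfold while the cache bill stays the same, which is why hit rate is the number to watch.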

What the cache actually buys you is deferred database scaling. Database capacity is the expensive, hard-to-add resource — a bigger primary, another replica — and a cache node in front costs far less than the next size up of the thing it protects. That is the honest framing: a cache is not cheap storage, it is a cheap way to postpone the most expensive upgrade on your bill.

The rule: look at what the cache lets you not buy. If removing it tomorrow would force a database upsize or another replica, it is paying for itself with room to spare. If the database would shrug, the cache is a latency feature you are funding in RAM — sometimes worth it, but call it what it is and size it small.

The traps: stale reads, leaks, and the stampede

The first is having no invalidation plan, so users see data long after it changed and you learn it from a support ticket. The second is keying a cache wrong — caching per-user data under a shared key, so one person's dashboard gets served to the next person. That is a security incident, not a performance bug, and it is easy to do by accident.

The third is the thundering herd. A popular entry expires, every client that wanted it misses in the same instant, and they all stampede the origin to recompute the same value. The database, idle a moment ago, takes a thousand identical queries in one millisecond and falls over — the exact failure the cache was meant to prevent. The fixes are known: stagger expiries with jitter, let one request recompute while others serve slightly stale, or refresh hot keys in the background before they expire. You just have to plan for it.
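Two of those fixes can be sketched briefly. `jittered_ttl` spreads expiries so hot keys do not all lapse together, and `SingleFlightCache` lets one caller per key recompute. Both names are hypothetical. This is also a simpler variant than the one described above: the other callers wait for the recomputation instead of being served a slightly stale value, and entries never expire. It assumes threads within a single process. Across several servers you would need a shared lock instead.

```python
import random
import threading
from typing import Any, Callable, Dict, Hashable

_MISSING = object()


def jittered_ttl(base_seconds: float, spread: float = 0.1,
                 rng: Callable[[], float] = random.random) -> float:
    """Return the base TTL moved up or down by at most `spread` (a fraction)."""
    return base_seconds * (1.0 + spread * (2.0 * rng() - 1.0))


class SingleFlightCache:
    """On a miss, one caller per key runs the loader; the rest wait for it."""

    def __init__(self, load: Callable[[Hashable], Any]) -> None:
        self._load = load
        self._values: Dict[Hashable, Any] = {}
        self._key_locks: Dict[Hashable, threading.Lock] = {}
        self._guard = threading.Lock()         # protects _key_locks
        self.origin_calls = 0

    def _lock_for(self, key: Hashable) -> threading.Lock:
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key: Hashable) -> Any:
        value = self._values.get(key, _MISSING)
        if value is not _MISSING:
            return value                       # hit: no locking needed
        with self._lock_for(key):
            # Re-check: another caller may have filled the entry while we waited.
            value = self._values.get(key, _MISSING)
            if value is not _MISSING:
                return value
            self.origin_calls += 1             # only one caller per key gets here
            value = self._load(key)
            self._values[key] = value
            return value
```

With twenty threads asking for the same missing key at once, the loader runs once and the other nineteen reuse its result.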

A cache vs just making the query fast

A cache hides a slow operation behind a fast copy; fixing the query removes the slowness at the source. The fix is almost always the better first move when it is available — an index, a smarter query, a denormalised column — because it has no staleness window, no invalidation code, and no new failure modes. There is nothing to keep honest.

A cache wins when the work is irreducible: an expensive aggregation, a third-party API call, a computation that is slow no matter how you write it, read far more than it changes. Then the copy is the only way to make it fast. So the order is: make it fast at the source first, and cache what is still expensive after that.

How to add a cache without regret

Wait until you can point at a measured slow path read far more than it changes. Start with the simplest thing that works: a TTL long enough to help, short enough that the staleness is tolerable for that specific data. Live with it, watch your hit rate, and reach for explicit invalidation only when the staleness actually starts to matter.
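Watching the hit rate needs nothing more than two counters. A minimal sketch, with hypothetical names (`CacheStats`, `get_with_stats`) and a plain dict standing in for whatever cache you actually run:

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Hashable


@dataclass
class CacheStats:
    hits: int = 0
    misses: int = 0

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def get_with_stats(cache: Dict[Hashable, Any], key: Hashable,
                   load: Callable[[Hashable], Any], stats: CacheStats) -> Any:
    """Look-aside read that records whether the cache answered."""
    if key in cache:
        stats.hits += 1
        return cache[key]
    stats.misses += 1
    cache[key] = load(key)
    return cache[key]


stats = CacheStats()
cache: Dict[Hashable, Any] = {}
for _ in range(10):
    get_with_stats(cache, "home", lambda k: f"page:{k}", stats)
print(stats.hit_rate)  # 0.9
```

A hit rate that stays low after warm-up suggests the data is not read-heavy enough to justify the cache, or the TTL is too short to help.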

Be deliberate about what is cacheable. Public, shared, slow-to-compute data is the safe zone. Private, per-user, must-be-exact data is the danger zone — cache it only with careful keying and clear eyes about the staleness you are accepting. A cache is a targeted bet, not a layer you spread over everything.
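Careful keying can be enforced in code rather than left to memory. One possible convention, sketched here with hypothetical function names and key formats, is to build every key through one of two helpers, so that per-user data cannot be cached without a user id in the key. It assumes ids do not themselves contain the `:` separator.

```python
def shared_key(resource: str, resource_id: str) -> str:
    """Key for public data that is identical for every reader."""
    return f"shared:{resource}:{resource_id}"


def per_user_key(user_id: str, resource: str, resource_id: str) -> str:
    """Key for private data; refuses to build a key without a user id."""
    if not user_id:
        raise ValueError("per-user data needs a user id in the key")
    return f"user:{user_id}:{resource}:{resource_id}"


print(shared_key("product", "42"))                 # shared:product:42
print(per_user_key("alice", "dashboard", "main"))  # user:alice:dashboard:main
print(per_user_key("bob", "dashboard", "main"))    # user:bob:dashboard:main
```

Two users asking for the same dashboard get different keys, so one person's copy can never be served to the other.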