CPU cache.
Keeping the tools you reach for most in your apron pocket, not on the far shelf.
A CPU is astonishingly fast, so fast that even RAM cannot feed it data quickly enough. A single fetch from RAM can cost on the order of a hundred clock cycles, so compared to how fast the CPU thinks, it is like stopping work to walk to a far shelf.
So the CPU keeps tiny, ultra-fast caches right beside it, holding the data it just used or expects to use next. It is like a chef keeping the most-used tools in an apron pocket instead of the back cupboard.
1. Speed tiers, by distance: pocket, near shelf, far shelf, the back storeroom.
2. A tool already in the apron pocket is right there: that’s a cache hit.
3. Not in the pocket? Stop and walk to the far shelf: a miss, and a long wait.
4. So when the cache fetches one thing, it grabs the neighbours too, betting you’ll want them.
5. Work along a row in order and almost everything is already in reach: hit after hit.
6. Jump around at random and you’re back to the far shelf every time: miss after miss.
Hits and misses
When the data the CPU wants is already in cache, that is a "hit" and it is almost instant. When it is not, that is a "miss": the CPU has to wait for the much slower RAM, which takes many times longer than a hit, and it largely sits idle while it waits.
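To get a feel for how much misses cost, here is a minimal sketch of the usual back-of-envelope formula: average access time = hit time + miss rate × miss penalty. The function name and the two latencies (1 ns for a hit, 100 ns extra for a miss) are illustrative assumptions, not measurements of any particular machine.

```python
def average_access_time_ns(hit_rate, hit_ns=1.0, miss_penalty_ns=100.0):
    """Average cost of one memory access, in nanoseconds.

    hit_ns and miss_penalty_ns are assumed, illustrative figures.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    # Every access pays the hit time; a miss pays the trip to RAM on top.
    return hit_ns + miss_rate * miss_penalty_ns


for rate in (0.99, 0.90, 0.50):
    print(f"hit rate {rate:.0%}: {average_access_time_ns(rate):.1f} ns per access")
```

Under these assumed numbers, dropping from a 99% to a 90% hit rate takes the average from about 2 ns to about 11 ns per access, so a fairly small change in hit rate has an outsized effect.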
The cache also bets on the future: when it fetches one value it grabs the neighbours too (a whole block of adjacent memory called a cache line, commonly 64 bytes), guessing you will want them next. Often you do.
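The neighbour-fetching bet can be sketched with a toy model. This is not how real hardware is built: the function name is hypothetical, the block of 8 values fetched together is an assumed figure, and the toy cache never evicts anything, so it isolates just the one effect described above.

```python
LINE_SIZE = 8  # values fetched together on a miss (assumed, illustrative)


def count_hits_and_misses(indices, line_size=LINE_SIZE):
    """Replay a sequence of array indices against a toy cache.

    The toy cache never evicts, so the only effect modelled is that a miss
    on one value also brings in the rest of that value's block.
    """
    cached_blocks = set()
    hits = misses = 0
    for i in indices:
        block = i // line_size  # which block of neighbours this index lives in
        if block in cached_blocks:
            hits += 1
        else:
            misses += 1
            cached_blocks.add(block)  # fetch the value along with its neighbours
    return hits, misses


hits, misses = count_hits_and_misses(range(64))
print(hits, misses)  # 56 8
```

Reading 64 values in order costs only 8 misses here: each miss pays for itself by making the next 7 reads hits.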
Why your data layout matters
This is why the way you arrange data can make code several times faster without changing what it computes. Walking through an array in order rides those neighbour-fetches and gets hit after hit. Jumping around memory randomly causes miss after miss, leaving the fast CPU waiting on slow RAM.
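As a rough illustration, the sketch below visits the same grid in two orders and counts misses in a simulated cache. Everything about the cache is an assumption made for the example: 8 values per fetched block, room for 16 blocks, and least-recently-used eviction. Real caches are more complicated, and the names are hypothetical.

```python
from collections import OrderedDict

ROWS, COLS = 64, 64
LINE_SIZE = 8     # values per fetched block (assumed)
CACHE_LINES = 16  # blocks the toy cache can hold (assumed)


def count_misses(addresses, line_size=LINE_SIZE, capacity=CACHE_LINES):
    """Count misses in a toy cache that evicts the least recently used block."""
    cache = OrderedDict()
    misses = 0
    for addr in addresses:
        line = addr // line_size
        if line in cache:
            cache.move_to_end(line)        # mark as recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used block
    return misses


# The grid is stored row by row, so element (r, c) lives at address r * COLS + c.
row_order = [r * COLS + c for r in range(ROWS) for c in range(COLS)]
col_order = [r * COLS + c for c in range(COLS) for r in range(ROWS)]

# Both orders visit every element once, so they compute the same total.
data = list(range(ROWS * COLS))
assert sum(data[a] for a in row_order) == sum(data[a] for a in col_order)

print("row by row:      ", count_misses(row_order), "misses")
print("column by column:", count_misses(col_order), "misses")
```

In this model the row-by-row walk misses 512 times and the column-by-column walk misses 4096 times, once per access, even though both loops produce the same sum. The column walk jumps a full row ahead on every step, so each fetched block is evicted before its neighbours are ever used.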
Many "why is this loop slow?" mysteries come down to cache behaviour.