Exponential backoff.
When a call fails, wait a little, then twice as long, then twice again — instead of hammering away.
When a request to another service fails, retrying is reasonable — the blip might be temporary. But retrying immediately and forever is a great way to make things worse: a struggling service gets pounded by a flood of instant retries and never gets room to recover.
Exponential backoff is the polite way to retry. Wait a short moment, try again; if it still fails, wait twice as long; then twice as long again. Each failure backs you off further, so a brief glitch is handled quickly while a real outage is met with patience instead of a stampede.
1. You knock on a door and nobody answers: the call to another service just failed.
2. So you wait a short moment, then try again. A quick first retry handles a blip.
3. Still failing? Each wait doubles (1s, 2s, 4s, 8s), backing off further every time.
4. This matters because everyone retrying instantly is a stampede that keeps the service down.
5. Adding jitter, a little randomness, scatters the retries so they do not arrive in sync.
6. Finally, cap how long you wait and stop after a limit, so you never retry forever.
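The doubling schedule from the steps above can be sketched in a few lines of Python. This is only an illustration: the function name `backoff_delays`, its parameters, and the 30-second cap are assumptions chosen for the example, not values the text prescribes. Jitter is left out here to keep the arithmetic visible.

```python
def backoff_delays(base=1.0, factor=2.0, max_delay=30.0, max_retries=6):
    """Return the wait in seconds before each retry (no jitter).

    All names and defaults are hypothetical, picked for illustration.
    """
    # The wait before retry n is base * factor**n, never above max_delay.
    return [min(max_delay, base * factor ** n) for n in range(max_retries)]


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

The last value would be 32 seconds by pure doubling, but the cap holds it at 30.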
Why doubling the wait helps
If a dependency is briefly overloaded, the worst thing every client can do is retry instantly and in unison — that "retry storm" keeps the service flat on its back. Backing off exponentially means clients quickly thin out their attempts, giving the struggling service breathing space to drain its backlog and recover. A short glitch still gets a fast first retry, so you do not pay much for the common case.
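To put a rough number on how quickly attempts thin out, the sketch below counts how many retries a single client fires during a 60-second outage. The helper `retries_within`, the 60-second window, and the fixed one-second interval used as a stand-in for aggressive retrying are all assumptions made for the comparison.

```python
import itertools


def retries_within(window, delays):
    """Count the retries that fire within `window` seconds.

    `delays` is an iterable of successive waits in seconds; it may be
    infinite, because the loop stops once the window is exceeded.
    """
    elapsed = 0.0
    count = 0
    for delay in delays:
        elapsed += delay
        if elapsed > window:
            break
        count += 1
    return count


fixed = itertools.repeat(1.0)                       # retry every second
doubling = (2.0 ** n for n in itertools.count())    # 1s, 2s, 4s, 8s, ...

print(retries_within(60, fixed))     # 60
print(retries_within(60, doubling))  # 5
```

Under these assumptions each client sends 5 retries instead of 60 over the same minute, and a client that retried with no delay at all would send far more.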
Jitter and limits make it work
Pure doubling has a flaw: if many clients failed at the same instant, they all retry at the same instants too, producing synchronised waves of load. Adding jitter (a random spread to each wait) scatters the retries so they no longer arrive together. Just as important, you cap the maximum delay and stop retrying after a set number of attempts or a deadline, so a permanently down dependency does not leave callers retrying forever.
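Putting the pieces together, a retry helper with jitter, a delay cap, and an attempt limit might look like the following. Treat it as a minimal sketch rather than a definitive implementation: `retry_with_backoff` and its parameters are hypothetical names, the defaults are arbitrary, and the jitter shown is one common variant (a uniform random wait between zero and the capped exponential delay). A deadline check could be added in the same way as the attempt limit.

```python
import random
import time


def retry_with_backoff(call, *, base=1.0, factor=2.0, max_delay=30.0,
                       max_attempts=5, sleep=time.sleep, rng=random.random):
    """Run `call()` until it succeeds or `max_attempts` is used up.

    `sleep` and `rng` are injectable so the behaviour can be tested
    without real waiting. All names here are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            # Assumption: every exception is treated as transient. Real
            # code would usually retry only on specific error types.
            if attempt == max_attempts - 1:
                raise  # limit reached: surface the last error
            # Doubling delay, capped at max_delay.
            ceiling = min(max_delay, base * factor ** attempt)
            # Jitter: wait a random fraction of the ceiling so that
            # clients that failed together do not retry together.
            sleep(rng() * ceiling)
```

A caller would wrap the failing operation in a function or lambda, for example `retry_with_backoff(lambda: fetch_report(), max_attempts=4)`, where `fetch_report` stands for whatever hypothetical request is being retried.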