IV · Queueing theory

Tail latency

What it is

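For fan-out queries, system latency is set by the slowest of N parallel requests. Assuming independent requests, about 63% of 100-way queries include at least one response slower than the single-request p99 (1 − 0.99^100 ≈ 0.63), so the p99 of one sits below the p50 of 100.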
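A quick way to check the fan-out arithmetic, as a minimal sketch. It assumes the N requests are independent and identically distributed, which real backends only approximate; the function names are illustrative, not from any library.

```python
def p_any_slow(n: int, q: float = 0.99) -> float:
    """Probability that at least one of n independent requests
    is slower than the per-request q-quantile (assumption: i.i.d. latencies)."""
    return 1.0 - q ** n


def per_request_quantile_for_median(n: int) -> float:
    """Per-request quantile that every one of n requests must beat
    for the fan-out query to finish within its own median latency."""
    return 0.5 ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(n, round(p_any_slow(n), 3), round(per_request_quantile_for_median(n), 4))
```

Under these assumptions the median of a 100-way fan-out corresponds to roughly the p99.3 of a single request, so per-request percentiles that look rare in isolation decide the typical user-visible latency.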

Where it lives

Microservice composition, sharded search indexes, MapReduce stragglers.

The key insight

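Hedged requests (send a backup copy to a second replica, take whichever answers first) cut tail latency at modest extra cost. Holding the backup until the first request has been outstanding longer than the expected p95 latency limits the added load to roughly 5% (Dean & Barroso, "The Tail at Scale", 2013).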
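A hedged request can be sketched with asyncio as below. This is an illustration under stated assumptions, not the implementation from the paper: `hedged`, `hedge_delay`, and `fake_replica_call` are hypothetical names, the request is assumed to be idempotent and safe to send twice, and error handling is left out (an exception from whichever request finishes first simply propagates).

```python
import asyncio


async def hedged(make_request, hedge_delay: float):
    """Start one request; if it has not finished after hedge_delay seconds,
    start a backup and return whichever completes first.

    make_request: zero-argument callable returning a new coroutine per call
    (hypothetical; in practice each call would target a different replica).
    """
    primary = asyncio.create_task(make_request())
    done, _ = await asyncio.wait({primary}, timeout=hedge_delay)
    if done:
        # Fast path: no backup was needed, so no extra load was generated.
        return primary.result()

    backup = asyncio.create_task(make_request())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # drop the loser so it stops consuming client resources
    return done.pop().result()


async def demo():
    # Simulated replicas: the first call is a straggler, the backup is fast.
    delays = iter([1.0, 0.01])

    async def fake_replica_call():
        d = next(delays)
        await asyncio.sleep(d)
        return d

    return await hedged(fake_replica_call, hedge_delay=0.05)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Setting `hedge_delay` near the expected p95 latency means only about one request in twenty triggers a backup, which is where the modest extra cost comes from. Cancelling the losing task here only stops the client side; reclaiming server work would need separate cancellation support.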