IV · Queueing theory
Tail latency
What it is
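A fan-out query finishes only when the slowest of its N parallel requests returns, so rare per-request delays become common at the query level. With N = 100 independent requests, about 63% of queries (1 - 0.99^100) wait on at least one response slower than the single-request p99: the p99 of one request is roughly the typical case for 100.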
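The arithmetic behind that claim can be checked with a short sketch. It assumes request latencies are independent and identically distributed, which real systems only approximate. The function names are illustrative and do not come from any library.

```python
def p_any_slow(n: int, q: float = 0.99) -> float:
    """Probability that at least one of n independent requests is slower
    than the single-request quantile q (q=0.99 means the p99 latency)."""
    return 1.0 - q ** n


def per_request_quantile(n: int, target: float = 0.5) -> float:
    """Single-request quantile whose latency equals the `target` quantile
    of the slowest of n independent requests."""
    return target ** (1.0 / n)


if __name__ == "__main__":
    # Share of 100-way fan-out queries that hit at least one p99-slow request.
    print(round(p_any_slow(100), 3))            # 0.634
    # The median of a 100-way fan-out sits at about the p99.3 of one request.
    print(round(per_request_quantile(100), 4))  # 0.9931
```

Under these assumptions the median fan-out latency sits at roughly the single-request p99.3, so the rule of thumb is, if anything, slightly conservative.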
Where it lives
Microservice composition, sharded search indexes, MapReduce stragglers.
The key insight
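Hedged requests cut tail latency at modest extra cost: send the same request to two replicas and take whichever answers first. Delaying the second copy until the first has been outstanding longer than the expected p95 latency keeps the extra load to about 5% (Dean & Barroso, "The Tail at Scale", 2013).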
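A minimal sketch of a hedged request using Python's asyncio, assuming the call is idempotent and safe to issue twice. The names `hedged`, `hedge_after`, and `flaky_backend` are hypothetical and chosen for illustration. The backup is fired only after a delay, the deferred variant of the technique. Error handling is deliberately omitted, so an exception from whichever attempt finishes first propagates to the caller.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def hedged(call: Callable[[], Awaitable[T]], hedge_after: float) -> T:
    """Start call(); if it has not finished within hedge_after seconds,
    start a second copy and return the result of whichever finishes first."""
    primary = asyncio.ensure_future(call())
    done, _ = await asyncio.wait({primary}, timeout=hedge_after)
    if done:
        # Fast path: the primary answered in time, so no extra load is sent.
        return primary.result()

    backup = asyncio.ensure_future(call())
    done, pending = await asyncio.wait(
        {primary, backup}, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # Drop the slower attempt.
    return done.pop().result()


async def demo() -> str:
    calls = 0

    async def flaky_backend() -> str:
        # Hypothetical backend: the first attempt straggles, later ones are fast.
        nonlocal calls
        calls += 1
        attempt = calls
        await asyncio.sleep(1.0 if attempt == 1 else 0.01)
        return f"reply from attempt {attempt}"

    return await hedged(flaky_backend, hedge_after=0.05)


if __name__ == "__main__":
    print(asyncio.run(demo()))  # reply from attempt 2
```

In practice `hedge_after` would be set from measured latency, for example the observed p95 for that class of request, so that only the slowest few percent of calls trigger a second copy.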