t-digest
Quantiles, fairly accurate, fully streaming.
Computing exact percentiles requires keeping all the data, which is impractical for an unbounded stream. Approximate methods (GK summary, Q-digest) bound the error uniformly in rank, so their relative accuracy is worst at the tails, exactly where p99 and p99.9 live. Dunning's t-digest allocates more resolution to the tails by keeping clusters small near the extremes and letting them grow large near the median.
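Maintain a sorted list of centroids, each a (mean, weight) pair, whose sizes are capped by a scale function that allows only small centroids near the 0 and 1 quantiles and larger ones near the median. An insert merges into the nearest centroid if the size cap allows it; otherwise it creates a new centroid. Quantile queries interpolate between neighboring centroids.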
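A minimal sketch of the idea in Python, under some stated assumptions. It uses the buffer-and-merge variant (collect points, then re-cluster in one sorted pass) rather than nearest-centroid insertion, because it is shorter to write down. The size cap comes from the scale function k(q) = delta / (2*pi) * asin(2q - 1), and a centroid may span at most one unit of k. The names TDigest, delta, and buffer_size are illustrative and not taken from any particular library.

```python
import math
import random


class TDigest:
    """Toy merging t-digest: illustrative only, not a production implementation."""

    def __init__(self, delta=100.0, buffer_size=500):
        self.delta = delta              # compression: larger means more centroids
        self.buffer_size = buffer_size  # points held before re-clustering
        self.centroids = []             # sorted list of (mean, weight)
        self.buffer = []
        self.total = 0.0

    def _k(self, q):
        # Scale function: steep near q=0 and q=1, so tail centroids stay small.
        q = min(1.0, max(0.0, q))
        return self.delta / (2 * math.pi) * math.asin(2 * q - 1)

    def add(self, x, w=1.0):
        self.buffer.append((x, w))
        if len(self.buffer) >= self.buffer_size:
            self._compress()

    def _compress(self):
        points = sorted(self.centroids + self.buffer)
        self.buffer = []
        if not points:
            return
        total = sum(w for _, w in points)
        merged = []
        mean, weight = points[0]
        left = 0.0  # total weight strictly to the left of the current centroid
        for x, w in points[1:]:
            # Absorb the next point only if the centroid stays within one k-unit.
            if self._k((left + weight + w) / total) - self._k(left / total) <= 1.0:
                weight += w
                mean += (x - mean) * w / weight
            else:
                merged.append((mean, weight))
                left += weight
                mean, weight = x, w
        merged.append((mean, weight))
        self.centroids = merged
        self.total = total

    def quantile(self, q):
        if self.buffer:
            self._compress()
        if not self.centroids:
            raise ValueError("empty digest")
        target = q * self.total
        cum = 0.0
        prev = None  # (cumulative weight at midpoint, mean) of previous centroid
        for mean, w in self.centroids:
            mid = cum + w / 2  # treat the mean as sitting at the centroid's midpoint
            if target <= mid:
                if prev is None:
                    return mean
                prev_mid, prev_mean = prev
                t = (target - prev_mid) / (mid - prev_mid)
                return prev_mean + t * (mean - prev_mean)
            prev = (mid, mean)
            cum += w
        return self.centroids[-1][0]


rng = random.Random(42)
digest = TDigest(delta=100.0)
for _ in range(20000):
    digest.add(rng.random())  # uniform values in [0, 1)

p50, p99 = digest.quantile(0.5), digest.quantile(0.99)
print(len(digest.centroids), p50, p99)
```

The 20,000 inputs collapse to roughly delta centroids or fewer, and the estimates should land close to 0.5 and 0.99. Real implementations add refinements this sketch omits, such as tracking the exact minimum and maximum and alternating the merge direction.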
Elasticsearch percentiles. Cassandra latency aggregates. Apache Druid. Many systems that ship pre-aggregated p99s. Prometheus is a notable exception: histogram_quantile interpolates within fixed histogram buckets rather than using a t-digest.
t-digests are mergeable. Each shard or pod can compute its own; a central server merges them for the cluster-wide p99. Without a mergeable summary, per-shard percentiles cannot be combined into a cluster-wide one: an average of p99s is not a p99.
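To make the merge point concrete, here is a small self-contained sketch, again using hypothetical helper names (compress, merge, quantile) rather than a real library API. It treats a digest as a plain list of (mean, weight) pairs, so merging is just re-clustering the concatenated centroids under the same size cap. The two simulated shards and their latency distributions are made up for illustration.

```python
import math
import random


def compress(points, delta=100.0):
    """Greedily cluster (mean, weight) pairs under the asin scale-function cap."""
    points = sorted(points)
    total = sum(w for _, w in points)

    def k(q):
        q = min(1.0, max(0.0, q))
        return delta / (2 * math.pi) * math.asin(2 * q - 1)

    out = []
    mean, weight = points[0]
    left = 0.0  # weight strictly to the left of the current centroid
    for x, w in points[1:]:
        if k((left + weight + w) / total) - k(left / total) <= 1.0:
            weight += w
            mean += (x - mean) * w / weight
        else:
            out.append((mean, weight))
            left += weight
            mean, weight = x, w
    out.append((mean, weight))
    return out


def merge(*digests, delta=100.0):
    # Merging is re-clustering the union of all centroids.
    return compress([c for d in digests for c in d], delta)


def quantile(digest, q):
    target = q * sum(w for _, w in digest)
    cum, prev = 0.0, None
    for mean, w in digest:
        mid = cum + w / 2
        if target <= mid:
            if prev is None:
                return mean
            prev_mid, prev_mean = prev
            return prev_mean + (target - prev_mid) / (mid - prev_mid) * (mean - prev_mean)
        prev = (mid, mean)
        cum += w
    return digest[-1][0]


# Two simulated shards: one busy and fast, one quiet and slow (latencies in ms).
rng = random.Random(7)
fast = [rng.gauss(10.0, 2.0) for _ in range(9000)]
slow = [rng.gauss(100.0, 10.0) for _ in range(1000)]

d_fast = compress([(x, 1.0) for x in fast])
d_slow = compress([(x, 1.0) for x in slow])
merged = merge(d_fast, d_slow)

exact = sorted(fast + slow)[int(0.99 * 10000)]  # ground truth from raw data
merged_p99 = quantile(merged, 0.99)
avg_p99 = (quantile(d_fast, 0.99) + quantile(d_slow, 0.99)) / 2
print(exact, merged_p99, avg_p99)
```

In this setup the merged digest should track the exact cluster-wide p99 closely, while the average of the two per-shard p99s lands far below it, because the slow shard carries only a tenth of the traffic yet owns the entire top 10% of latencies.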