HyperLogLog
Count distinct, in 12 KB.
Counting unique elements in a stream of billions with sub-percent error in 12 KB of memory. The trick: hash each element; for each, count the leading zeros in the hash. The maximum leading-zero count seen is a stochastic estimator of the cardinality.
Partition the hash output into m buckets. For each bucket, track the maximum leading-zero count of all hashes that landed in it. The harmonic mean of 2^count across buckets, with bias correction, estimates the cardinality. Standard error ≈ 1.04/√m.
Redis (PFCOUNT, PFADD). BigQuery APPROX_COUNT_DISTINCT. Druid uniques. Snowflake APPROX_COUNT_DISTINCT. Every "unique users" metric at internet scale.
Mergeable: HLL of region A union HLL of region B = HLL of A∪B, with no recomputation. That property is why HLLs power dashboards across sharded data — each shard returns its sketch, the gateway merges.