Bloom filters.
A bouncer’s “definitely not on the list” memory: tiny, fast, and never wrongly says no.
A bouncer with a thousand-name guest list cannot carry the whole book. So imagine a tiny grid of light switches instead. Each name, run through a few stamps, flips a fixed set of switches on. To check a name, you stamp it and look: if any of its switches is off, that name was definitely never added.
A Bloom filter is exactly that trick. It answers “is this maybe in the set?” using a fraction of the memory a real list would need. It can occasionally say “maybe” for something that was never added, but it will never say “no” for something that was.
- All of it? No way.1
A full guest list is huge — too big to carry or check quickly.
- 2
Keep a tiny grid of on/off switches instead — a few bits, not the whole book.
- 3
To add a name, run it through a few stamps and flip those switches on.
- Not on the list.4
To check: stamp the name. Any switch off? It was definitely never added.
- 5
All switches on? It is probably in — but switches can collide, so “maybe.”
- 6
Tiny and fast, never a false “no” — perfect for skipping expensive lookups.
No false negatives, some false positives
The guarantee is one-directional. Because adding a name only ever turns switches on, a name that was truly added will always find all its switches on — so “no” is always trustworthy. But two different names can happen to share switches, so an item that was never added might find all of its switches already on by coincidence. That is a false positive: a “maybe” that turns out wrong.
Why that is still useful
Most real questions are “is this worth a slow lookup?” A database can keep a Bloom filter in memory and check it first: if it says “definitely not here,” the database skips a costly disk read entirely. The occasional false “maybe” just means a wasted check now and then — a tiny price for avoiding the expensive lookup the vast majority of the time.