ELI5 · Networking & the web

Load balancers.

The host at a busy restaurant, seating each new guest at the emptiest table.

One server can only handle so much traffic before it slows to a crawl. So you run several identical copies and put a load balancer in front of them.

Like a good restaurant host, it greets every arriving request and sends it to whichever server has the most room, so no single one gets mobbed while others sit idle.
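For the curious, here is a minimal sketch of the "most room" rule, often called least connections. The `Server` class, the `pick_server` function, and the station names are made up for illustration; real balancers may weigh other signals too.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    active: int = 0  # requests currently being handled (hypothetical counter)


def pick_server(servers):
    """Return the server with the fewest active requests (ties go to the first)."""
    return min(servers, key=lambda s: s.active)


servers = [Server("station-1", 4), Server("station-2", 1), Server("station-3", 3)]
chosen = pick_server(servers)
chosen.active += 1  # the new request now takes up a seat there
print(chosen.name)  # station-2
```

Once the request finishes, the counter would go back down, which frees the seat for the next guest.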

1. Clients learn a single address; how many servers stand behind it can change minute to minute.

2. Like a restaurant host, the balancer sends each request to a server with room to spare.
   "Station 3 has room, right this way."

3. Health checks run on a timer, not on complaints, so a sick server is usually found before a guest is sent to it.
   "Everyone alive back there?"

4. A server that stops answering is pulled from rotation, so nobody gets seated there.
   "Skip station 2."

5. Swamped? Add more identical servers and the load spreads across them all.

6. Scale out, route around failures, and diners never notice the drama in the kitchen.
   "I'm back!"

One front door, many kitchens — and nobody waits on a sick one.
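The health-check steps can be sketched in a few lines. This is only an illustration: `run_health_check`, the `is_up` table, and the station names are hypothetical, and a real probe would make a network request on a timer rather than read a dictionary.

```python
from typing import Callable, Dict, List


def run_health_check(servers: List[str], probe: Callable[[str], bool]) -> List[str]:
    """Return the servers that answered this round's probe."""
    return [s for s in servers if probe(s)]


# Pretend state of each server; station-2 has stopped answering.
is_up: Dict[str, bool] = {"station-1": True, "station-2": False, "station-3": True}
servers = list(is_up)

rotation = run_health_check(servers, lambda s: is_up[s])
print(rotation)  # ['station-1', 'station-3']

is_up["station-2"] = True  # the server recovers before the next check
rotation = run_health_check(servers, lambda s: is_up[s])
print(rotation)  # ['station-1', 'station-2', 'station-3']
```

Only servers in `rotation` would receive requests, so the sick one is skipped until a later check sees it answer again.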

Scaling out instead of up

Without a balancer, your only option when traffic grows is a bigger, more expensive server (scaling up), and there is always a ceiling on how big one machine can get. With one, you just add more ordinary servers behind it (scaling out), which is usually cheaper and has a much higher ceiling.

This is the backbone of how large sites survive a sudden spike: spin up more copies, and the balancer spreads the load across all of them.
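A rough way to see the effect, assuming the simplest possible strategy (round robin, where servers take turns) rather than the "most room" rule. The `spread` function and the request counts are invented for the example.

```python
from collections import Counter
from itertools import cycle


def spread(requests: int, server_count: int) -> Counter:
    """Hand out requests in round-robin order and count how many each server gets."""
    servers = cycle(f"server-{i}" for i in range(1, server_count + 1))
    return Counter(next(servers) for _ in range(requests))


# The same 1200 requests, before and after doubling the number of servers.
print(max(spread(1200, 3).values()))  # 400
print(max(spread(1200, 6).values()))  # 200
```

Doubling the number of copies halves the busiest server's share, which is the whole idea behind scaling out.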

It also makes failures boring

Because the balancer constantly checks health, a crashed server becomes a non-event: it is simply skipped until it recovers. The same mechanism lets you take servers down on purpose to deploy new code without any downtime, by draining traffic from one server at a time while the others keep serving.
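A one-at-a-time deploy might look roughly like this. It is a sketch under simple assumptions: `rolling_deploy`, the version labels, and the station names are hypothetical, and a real drain would also wait for requests already in progress to finish.

```python
def rolling_deploy(servers, new_version, versions):
    """Update one server at a time and report the fewest servers left serving."""
    in_rotation = list(servers)
    fewest_serving = len(in_rotation)
    for server in servers:
        in_rotation.remove(server)      # drain: stop sending it new requests
        fewest_serving = min(fewest_serving, len(in_rotation))
        versions[server] = new_version  # deploy while it receives no traffic
        in_rotation.append(server)      # put it back into rotation
    return fewest_serving


versions = {"station-1": "v1", "station-2": "v1", "station-3": "v1"}
fewest = rolling_deploy(list(versions), "v2", versions)
print(fewest)                         # 2
print(sorted(set(versions.values())))  # ['v2']
```

At every moment at least two of the three servers were taking requests, so guests never saw a closed door.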

