Should you use a load balancer?

Q: Should you use a load balancer?

Yes, the moment you run more than one copy of a service. It spreads traffic, routes around dead instances, and lets you deploy with no downtime. Table stakes, not an optimisation.

The real question is whether you run more than one instance

Unlike most of these guides, this one has a near-automatic answer: the moment you run more than one copy of a service, you need something in front to spread traffic across them. It is not an optimisation you graduate into. It is the basic plumbing of running more than one instance of anything, and the only real question is which kind, not whether.

It feels like a bigger decision than it is because people treat "do I need a load balancer" and "do I need to scale out" as separate questions. They are the same decision. If you have decided to run two instances, whether for capacity or for redundancy, you have already decided you need a load balancer, because something has to answer "which instance gets this request?"

When you need a load balancer

The instant you have two or more identical instances. The obvious job is distribution: spreading requests across the pool so no instance gets buried while others sit idle. The two jobs that matter more in practice come along with it.

Health checking and failover. The balancer continuously probes each instance and stops routing to one as soon as it fails its checks. When an instance crashes at 3am, traffic quietly flows to the healthy ones and users never notice. Without that, in a pool of three, a third of requests would return errors until someone wakes up. That turns "an instance died" from an incident into a non-event.
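To make distribution and failover concrete, here is a minimal sketch of a health-aware round-robin pool. It is a toy, not how any particular balancer is implemented: the Pool class, the probe callback, and the instance names are all hypothetical, and a real balancer runs its probes on a timer rather than by hand.

```python
from itertools import count
from typing import Callable, List


class Pool:
    """Toy balancer: round-robin over instances that pass a health probe."""

    def __init__(self, instances: List[str], probe: Callable[[str], bool]):
        self.instances = list(instances)
        self.probe = probe  # hypothetical: returns True if the instance answers
        self.healthy = set(instances)
        self._counter = count()

    def run_health_checks(self) -> None:
        # A real balancer does this continuously; here it is called by hand.
        self.healthy = {i for i in self.instances if self.probe(i)}

    def pick(self) -> str:
        # Only instances that passed the last round of probes are candidates.
        candidates = [i for i in self.instances if i in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy instances")
        return candidates[next(self._counter) % len(candidates)]
```

Once run_health_checks sees an instance fail its probe, pick simply never returns it, which is the whole failover mechanism in miniature.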

And zero-downtime deploys. You roll out by draining one instance at a time: take it out of rotation, update it, health-check it, put it back, repeat. Users keep being served by the others throughout, so you ship in the middle of the day without a maintenance window. A single stable address in front of a pool that grows, shrinks, and gets replaced is what makes all of it possible.
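The drain-update-check-restore loop can be sketched as below. This assumes a set of instance names stands in for "in rotation", and that update and is_healthy are hypothetical callbacks you would supply; a real drain would also wait for in-flight requests to finish before updating.

```python
from typing import Callable, List, Set


def rolling_deploy(
    instances: List[str],
    in_rotation: Set[str],
    update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> None:
    """Update instances one at a time, keeping the rest in rotation."""
    for instance in instances:
        # Drain: stop sending new requests to this instance.
        # (A real balancer also lets in-flight requests complete first.)
        in_rotation.discard(instance)
        update(instance)  # hypothetical: install the new version
        if not is_healthy(instance):
            # Halt the rollout; instances not yet touched keep serving.
            raise RuntimeError(f"{instance} failed its health check after update")
        in_rotation.add(instance)  # back into the pool
```

Stopping on the first failed health check is a design choice in this sketch: a bad release takes out at most one instance instead of the whole pool.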

When you can skip it

If you run exactly one instance with no redundancy plan, there is nothing to balance yet — though a single crash is then downtime, so the honest move is usually to add a second instance and a balancer rather than stay single. If your managed platform already fronts your service with a load balancer, as most PaaS and container platforms do, adding your own layer is redundant and just another hop. And if traffic is trivially low and one small box absorbs it comfortably, you can wait.

None of these are "load balancers are not worth it." They are "you have not yet reached the point where you run more than one instance." The decision is still mechanical; you are just early.

The traps: bad health checks and hidden state

The first trap is a health check that does not check anything real. If it just pings the port or hits a route that returns 200 without touching the database, the balancer keeps sending traffic to an instance that is up but broken — connected to a dead database, out of memory, half-started. A good health check exercises the dependencies the service actually needs, so "healthy" means "can serve a request," not "the process is running."
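As a rough illustration of a check that means "can serve a request", the sketch below runs one small probe per dependency and reports 503 if any fails. The health_status function and the shape of the checks mapping are assumptions for this example; each check would be something cheap like a trivial query against the database.

```python
from typing import Callable, Dict, Tuple


def health_status(checks: Dict[str, Callable[[], None]]) -> Tuple[int, Dict[str, str]]:
    """Run each dependency check; return (HTTP status, per-dependency detail)."""
    detail: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            check()  # hypothetical probe; raises if the dependency is unusable
            detail[name] = "ok"
        except Exception as exc:
            detail[name] = f"failed: {exc}"
    # Healthy only if every dependency the service needs is reachable.
    status = 200 if all(v == "ok" for v in detail.values()) else 503
    return status, detail
```

Wired to the route the balancer probes, this makes an instance with a dead database connection fall out of rotation instead of answering 200.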

The second is sticky sessions papering over broken state-sharing. If your app keeps session data in each instance's local memory, the balancer has to pin each user to one instance so they keep seeing their own data. That works until that instance dies or gets drained for a deploy, and those users lose their session mid-flow. Sticky sessions are usually a smell that state should live somewhere shared, like Redis or the database, so any instance can serve any user.
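The difference is easy to see in a toy model. In the sketch below, a plain dict stands in for Redis or the database, and the Instance class and session layout are hypothetical; the only point is where the session data lives.

```python
from typing import Dict, List, MutableMapping


class Instance:
    """One app instance; `sessions` is wherever it keeps session data."""

    def __init__(self, name: str, sessions: MutableMapping[str, Dict]):
        self.name = name
        self.sessions = sessions

    def handle(self, session_id: str, item: str = "") -> List[str]:
        # Look up (or create) the user's cart and optionally add an item.
        cart = self.sessions.setdefault(session_id, {"cart": []})["cart"]
        if item:
            cart.append(item)
        return list(cart)


# Local state: each instance has its own dict, so a user must stay pinned.
a_local, b_local = Instance("a", {}), Instance("b", {})

# Shared state: both instances use one store (a dict standing in for
# Redis or the database), so either instance can serve the user.
shared: Dict[str, Dict] = {}
a_shared, b_shared = Instance("a", shared), Instance("b", shared)
```

With local state, a cart built on instance a is invisible to instance b; with the shared store, the user can land on either one and see the same cart.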

The trap: making the balancer the single point of failure

You add a load balancer to remove single points of failure, and if you are not careful, the balancer becomes one. A single balancer in front of a healthy pool means the pool is only as available as that one box. The whole point evaporates when it goes down.

The fix is the same logic one level up: run the balancer redundantly, or use a managed load balancer from your cloud provider that is already built to be highly available behind a single address. For most teams the managed option is the right call — cheap, someone else's job to keep alive, and it solves the redundancy problem for you.

A load balancer vs DNS round-robin

DNS round-robin is the poor cousin: hand out several IPs for one hostname and let clients rotate through them. It spreads traffic with zero infrastructure, but it is blind. DNS does not know an instance is dead, so it keeps handing out the address of a crashed box, and clients cache DNS records for minutes to hours, so removing a bad instance does not take effect for a long time. There is no failover and no clean way to drain for a deploy.
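The caching problem can be shown with a toy client-side resolver. This is only a sketch of TTL caching under simplified assumptions: the CachingResolver class, the TTL value, and the IP addresses are made up, and real resolvers and operating systems layer several caches on top of each other.

```python
from typing import List, Optional


class CachingResolver:
    """Toy client-side DNS cache: reuses an answer until its TTL expires."""

    def __init__(self, ttl_seconds: int):
        self.ttl = ttl_seconds
        self._ip: Optional[str] = None
        self._expires_at = 0.0

    def resolve(self, now: float, published: List[str]) -> str:
        # `published` is what the DNS zone would answer with right now.
        if self._ip is None or now >= self._expires_at:
            self._ip = published[0]
            self._expires_at = now + self.ttl
        return self._ip


resolver = CachingResolver(ttl_seconds=300)
first = resolver.resolve(0, ["10.0.0.1", "10.0.0.2"])
# 10.0.0.1 crashes and is removed from DNS, but the cached answer persists.
still_cached = resolver.resolve(100, ["10.0.0.2"])
after_expiry = resolver.resolve(300, ["10.0.0.2"])
```

Here the client keeps connecting to the removed address until its cached record expires, no matter how quickly the DNS zone was updated.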

A load balancer sits in the request path, so it knows instance health in real time, fails over in seconds, and drains instances cleanly for zero-downtime releases. Round-robin DNS can be a crude layer for spreading load across regions or balancers, but it is not a substitute for a real load balancer in front of a pool. When health and fast failover matter — which is almost always — you want the balancer.

How to set one up without regret

Reach for a managed load balancer from your cloud provider. It comes highly available behind a single address, so you sidestep the single-point-of-failure problem entirely, and keeping it alive is not your job.

Set up health checks that exercise real dependencies, and keep session state out of instance memory so any instance can serve any user. Do those two things and a load balancer stops being a decision and becomes the quiet foundation everything else stands on.