Errors, decoded.

The strings you paste into a search engine at the worst possible moment. Each page takes one of them apart the way a senior engineer would at the keyboard: the exact symptom with real output, the commands that narrow it down and what each branch of the output means, the causes ranked by how often they turn out to be the answer, and the fix for each one. No "have you tried restarting" — the actual investigation, written down.

How these pages work

An error string is a symptom wearing a name badge. CrashLoopBackOff doesn't tell you what crashed, exit 137 doesn't tell you who sent the kill, and "connection reset by peer" doesn't tell you which peer — or whether it was a peer at all. So every page here has the same spine: the symptom as you actually see it, a short diagnosis where each command's output routes you to the next step, the causes ranked by how often they're the answer, and a concrete fix per cause. Plus the two or three things people reliably get wrong, because those cost more hours than the errors themselves.
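
As a small illustration of reading the badge without mistaking it for the diagnosis, here is a minimal Python sketch of the usual shell convention that an exit code above 128 means 128 plus a signal number. The helper name decode_exit_code is hypothetical, and the signal names assume a Linux or macOS host. It names the signal and nothing more: whether the kernel OOM killer, the container runtime, or a person sent it is still yours to find out.

```python
import signal


def decode_exit_code(code: int) -> str:
    """Describe a shell-style exit code (hypothetical helper, for illustration).

    Assumes the common convention that codes above 128 mean 128 + signal number.
    """
    if code > 128:
        signum = code - 128
        try:
            # Look up the platform's name for this signal number.
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number this platform knows about.
            name = f"signal {signum}"
        return f"killed by {name}"
    if code == 0:
        return "exited normally"
    return f"exited with error status {code}"


if __name__ == "__main__":
    for code in (0, 1, 137, 143):
        print(code, "->", decode_exit_code(code))
```

On a Linux host, 137 decodes to SIGKILL (128 + 9) and 143 to SIGTERM (128 + 15). That is where the investigation starts, not where it ends.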

If you're working through the apprenticeship, this shelf is where the build task sends you when a project breaks — which it will, on schedule. ECONNRESET and EADDRINUSE are practically part of that task's syllabus.

The errors

CrashLoopBackOff

OOMKilled

ImagePullBackOff

connection reset by peer

context deadline exceeded

too many open files

address already in use

exit code 137

Where to go deeper

These pages stop where the specialist material starts. The Linux codex carries the full investigations behind most of them (a held port, a growing process, a suspect network), and the Kubernetes codex covers the machinery (kubelet, probes, the pod lifecycle) that produces half of these statuses in the first place.