ss

A service is "down" but its process is alive. A load test ends and the box is carrying twelve thousand sockets in a state you have never thought about. A health check works from the host and fails from everywhere else. All of these resolve to the same question: what sockets exist on this machine, in what state, and who owns them? That is what ss answers, faster than anything else, with TCP internals no other listing tool shows. This page covers the five invocations worth memorising, decodes the output column by column, walks three production incidents, and ends with a drill that is safe on any machine.

The question it answers

ss stands for socket statistics, and it does one thing: it asks the kernel for the list of sockets on the machine and prints them, with their state, their addresses, the depth of their queues, and — when you ask — the process that owns each one. It ships in the iproute2 package alongside ip, which means it is present on essentially every Linux box you will ever ssh into, including minimal container images where netstat was dropped years ago.

It is the replacement for netstat, and the replacement is not cosmetic. The two tools get their data in different ways. netstat reads /proc/net/tcp and its siblings: text files where the kernel formats every socket on the system into a hex-encoded line, every time, whether you wanted that socket or not, and the tool then parses all of that text back into structures. ss talks to the kernel over a netlink socket using the sock_diag interface. The kernel hands back binary records, no text round-trip, and — this is the part that matters at scale — the filtering happens inside the kernel. Ask ss for only the sockets in one state, or only the ones matching a port, and the kernel never serialises the rest. On a laptop with two hundred sockets the difference is invisible. On a proxy box carrying half a million connections, netstat grinds for the better part of a minute while ss answers in a second or two, and during an incident that gap is the difference between a tool you use and a tool you abandon.
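To make the text round-trip concrete, here is a minimal Python sketch of the decoding netstat has to repeat for every line of /proc/net/tcp. The sample line is illustrative, not captured from a real host, the function names are hypothetical, and the byte-order handling assumes IPv4 on a little-endian machine such as x86.

```python
import socket
import struct

# Subset of the kernel's TCP state codes as they appear in the "st" column.
TCP_STATES = {0x01: "ESTAB", 0x06: "TIME-WAIT", 0x08: "CLOSE-WAIT", 0x0A: "LISTEN"}

# Illustrative line in /proc/net/tcp layout (an assumption, not real host data).
SAMPLE = ("   0: 0100007F:18EB 00000000:0000 0A 00000000:00000000 "
          "00:00000000 00000000   999        0 20513")


def decode_addr(field):
    """Turn 'HEXIP:HEXPORT' into (dotted_quad, port). IPv4, little-endian host."""
    hex_ip, hex_port = field.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def parse_proc_net_tcp_line(line):
    """Decode the address, state, and queue columns of one /proc/net/tcp line."""
    cols = line.split()
    tx_queue, rx_queue = (int(q, 16) for q in cols[4].split(":"))
    return {
        "local": decode_addr(cols[1]),
        "peer": decode_addr(cols[2]),
        "state": TCP_STATES.get(int(cols[3], 16), cols[3]),
        "tx_queue": tx_queue,
        "rx_queue": rx_queue,
    }


print(parse_proc_net_tcp_line(SAMPLE))
```

Every socket on the box costs one such format-then-parse cycle on the netstat path, whether or not the caller wanted that socket. That is the work the netlink interface avoids.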

The netlink path also unlocks data the text files never carried. The kernel keeps live TCP state per connection — round-trip time, congestion window, retransmit counts, pacing rate — and ss can dump all of it per socket. netstat never could. lsof can tell you a process holds a socket, but it cannot tell you the socket's queues are backing up; tcpdump can show you packets on the wire, but not how the kernel is accounting for them. ss sits in between: the kernel's own bookkeeping for every connection, on demand.

The five invocations that matter

Like every tool in this series, ss has a long manual and a short working set. One invocation is muscle memory; the other four are variations you reach for when the first one raises a question.

Invocation	What it shows	When you reach for it
`ss -tlnp`	TCP sockets, listening only, numeric addresses, owning process	First command on any "is the service even up" question. Type it as one word.
`ss -s`	A summary: socket counts per protocol and per state	Orientation. Is this box carrying 200 connections or 200,000? How many in TIME-WAIT?
`ss state time-wait`	Only sockets in one TCP state	Counting a TIME-WAIT pile-up, finding stuck CLOSE-WAIT sockets, isolating ESTABLISHED
`ss -ti`	Per-socket TCP internals: rtt, cwnd, retrans, pacing rate	"The connection is up but slow" — congestion and loss become visible per connection
`ss -tn dst 10.0.9.55`	Sockets filtered by peer address or port	"What do we have open to that database?" — also `sport`, `dport`, `src`

The first one deserves unpacking because you will type it more than the rest combined. -t selects TCP (use -u for UDP, or both together). -l restricts to listening sockets, which is usually the question. -n turns off DNS and service-name resolution so the tool prints numbers immediately instead of stalling on a resolver — the same reflex as -nP on lsof, for the same reason. -p attaches the owning process name, PID, and descriptor number to each line, and needs root to see processes that are not yours. Drop the -l and you get established connections instead of listeners. That one toggle covers most of what netstat -tlnp and netstat -tnp used to do, faster.

The state filter reads like a sentence and is worth knowing as a family: ss state established, ss state time-wait, ss state syn-sent, and the keyword bundles ss state connected (everything except listeners and closed) and ss state bucket (the minisockets: TIME-WAIT and SYN-RECV). Address filters compose with it: ss -tn state established dport = :5432 is "every established TCP connection to port 5432," which is the question "how many connections do we actually hold to the database" answered exactly, with no grep.

Why the filters beat grep. Piping ss through grep works until it lies to you: a port number that matches an ephemeral port on the wrong side, an address that substring-matches another. The filter language matches on the parsed socket fields, not on text, and the kernel applies the state part before the data even reaches userspace. On a busy box, ss state time-wait is not just cleaner than ss -ta | grep TIME — it is doing strictly less work.
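The failure mode is easy to reproduce in a few lines of Python. This is only a userspace illustration with hypothetical helper names; the real ss filter works on the socket's parsed fields, and the rows are written in the layout ss -tln prints.

```python
# Two rows in `ss -tln` layout: one listener on port 80, one on port 8080.
ROWS = [
    "LISTEN   0        511      0.0.0.0:80           0.0.0.0:*",
    "LISTEN   0        4096     [::]:8080            [::]:*",
]


def local_port(row):
    """Parse the port out of the Local Address:Port column (fourth field)."""
    local = row.split()[3]
    return int(local.rsplit(":", 1)[1])


def grep_like(rows, needle):
    """Substring match on the raw text, which is all grep can do."""
    return [r for r in rows if needle in r]


def field_match(rows, port):
    """Match on the parsed port number, in the spirit of `sport = :80`."""
    return [r for r in rows if local_port(r) == port]


print(len(grep_like(ROWS, ":80")))   # substring ":80" also hits ":8080"
print(len(field_match(ROWS, 80)))    # only the real port-80 listener
```

The substring search returns both rows because ":8080" contains ":80"; the field comparison returns one.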

Reading the output

Here is the muscle-memory invocation on a box running an nginx front end and a Java service, with redis on loopback. Run it with sudo so the Process column is complete.

$ sudo ss -tlnp
State    Recv-Q   Send-Q   Local Address:Port   Peer Address:Port   Process
LISTEN   0        511      0.0.0.0:80           0.0.0.0:*           users:(("nginx",pid=1290,fd=8))
LISTEN   0        4096     127.0.0.1:6379       0.0.0.0:*           users:(("redis-server",pid=912,fd=6))
LISTEN   0        4096     [::]:8080            [::]:*              users:(("java",pid=41327,fd=89))

One row of ss -tlnp, every column labelled. The Recv-Q/Send-Q pair changes meaning with the state — that is the next thing to learn.

All three rows carry an operational lesson in the Local Address column alone. nginx is bound to 0.0.0.0:80: every IPv4 interface, reachable from outside. redis is bound to 127.0.0.1:6379: loopback only, invisible to the network no matter what the firewall says, which is exactly right for a local cache and exactly wrong for a service other hosts need. The Java service is on [::]:8080: the IPv6 wildcard, which on a default Linux config (with bindv6only off) accepts IPv4 connections too, so it behaves like "everything." A surprising number of "the service is unreachable" incidents end at this column, and we will walk one below.
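The same reading of the Local Address column can be written down as a small Python sketch. The function name and the three labels are hypothetical, and treating [::] as "all interfaces" assumes the default bindv6only setting described above.

```python
import ipaddress


def bind_scope(local):
    """Classify a Local Address:Port value as printed by ss -ln."""
    host, _, _port = local.rpartition(":")
    host = host.strip("[]")          # ss wraps IPv6 addresses in brackets
    host = host.split("%", 1)[0]     # drop an interface suffix such as %lo
    if host == "*":                  # ss can also print the wildcard as "*"
        return "all interfaces"
    addr = ipaddress.ip_address(host)
    if addr.is_unspecified:          # 0.0.0.0 or ::
        return "all interfaces"
    if addr.is_loopback:             # 127.0.0.0/8 or ::1
        return "loopback only"
    return "one address"


for value in ("0.0.0.0:80", "127.0.0.1:6379", "[::]:8080"):
    print(value, "->", bind_scope(value))
```

A check like this is useful in a deploy script: a service that other hosts must reach should never come back as "loopback only".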

Now the pair everyone misreads. Recv-Q and Send-Q mean different things depending on the socket's state, and the LISTEN meaning is the one nobody teaches. On a listening socket, Recv-Q is the number of connections the kernel has fully established on the application's behalf but the application has not yet picked up with accept() — the current depth of the accept queue. Send-Q is that queue's capacity: the backlog the application asked for, capped by net.core.somaxconn. The Java row above reads "zero connections waiting, room for 4096" — healthy. A LISTEN row where Recv-Q hovers near Send-Q is an application that is not accepting fast enough, and that is a real incident pattern covered below.

On an established socket, the pair becomes byte counts. Recv-Q is bytes the kernel has received and ACKed that the application has not yet read() — data sitting in the receive buffer waiting for the app. Send-Q is bytes the application has written that the remote end has not yet acknowledged — in flight, or queued waiting for the congestion window to open. Both zero means everyone is keeping up. Recv-Q growing means this process is slow to read. Send-Q growing means the path or the peer cannot drain what we are sending. Two numbers, and you can already tell which side of a slow connection is at fault.

On an established socket the queues are byte counts in the kernel's two buffers. Each one points at a different culprit when it grows.
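As a rough sketch, the rule for established sockets fits in one Python function. The name and the 64 KiB threshold are assumptions chosen for illustration, not values ss or the kernel define, and a single reading is still only one sample.

```python
def diagnose_established(recv_q, send_q, threshold=64 * 1024):
    """Point at the likely culprit from one Recv-Q/Send-Q sample (byte counts)."""
    findings = []
    if recv_q >= threshold:
        # Bytes received and ACKed by the kernel, not yet read by the application.
        findings.append("local application is slow to read")
    if send_q >= threshold:
        # Bytes written by the application, not yet acknowledged by the peer.
        findings.append("path or peer is not draining what we send")
    return findings or ["both sides keeping up"]


print(diagnose_established(0, 0))
print(diagnose_established(3145728, 0))
print(diagnose_established(0, 2602340))
```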

Add -i and each connection grows a second line of TCP internals — the data that used to require a packet capture and patience:

$ ss -tni dst 10.0.9.55
State  Recv-Q  Send-Q     Local Address:Port     Peer Address:Port
ESTAB  0       36720      10.0.4.12:8080         10.0.9.55:49210
	 cubic wscale:7,7 rto:204 rtt:1.5/0.75 mss:1448 cwnd:24 bytes_acked:8412392
	 segs_out:6244 segs_in:3107 send 185Mbps lastsnd:4 pacing_rate 222Mbps retrans:0/12 rcv_space:14480

Three of these repay attention. rtt:1.5/0.75 is the smoothed round-trip time and its mean deviation, both in milliseconds. It is the kernel's live estimate, per connection, which makes "is it the network or the app" answerable in one glance. cwnd:24 is the congestion window in segments: multiply by mss and divide by the rtt and you have the connection's current ceiling on throughput, which is the figure ss prints as send 185Mbps. retrans:0/12 is retransmissions, current and lifetime. A number that climbs while you watch is packet loss happening to this specific connection, no capture needed. The mechanics behind these numbers, and why the congestion window moves the way it does, are the subject of the TCP page in the networking stack.
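The throughput arithmetic is short enough to check by hand. A minimal Python sketch, with a hypothetical function name, using the cwnd, mss, and rtt values from the sample output above:

```python
def cwnd_throughput_bps(cwnd_segments, mss_bytes, rtt_ms):
    """Bits per second the sender can have in flight: one window per round trip."""
    window_bits = cwnd_segments * mss_bytes * 8
    return window_bits / (rtt_ms / 1000.0)


ceiling = cwnd_throughput_bps(cwnd_segments=24, mss_bytes=1448, rtt_ms=1.5)
print(f"{ceiling / 1e6:.1f} Mbps")
```

With cwnd 24, mss 1448, and rtt 1.5 ms this works out to roughly 185 Mbps, in line with the send figure on the same socket.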

Three production scenarios

"The service is unreachable" — is it even listening, and where?

Health checks fail from the load balancer, but the process is running and its own logs look calm. Before anyone opens a firewall ticket, ask the cheaper question first: is anything listening on that port at all, and on which address?

$ sudo ss -tlnp sport = :9090
State    Recv-Q   Send-Q   Local Address:Port    Peer Address:Port   Process
LISTEN   0        128      127.0.0.1:9090        0.0.0.0:*           users:(("metrics-svc",pid=2204,fd=11))

There is the incident, in one column. The service is up and listening, but only on loopback. Every connection from the host itself succeeds, which is why "I curled it and it works" was true, and every connection from the load balancer is refused before any firewall is consulted, because no socket is bound on an address the outside world can reach. The usual causes are a default config that binds 127.0.0.1 for safety, or a container publishing a port that the process inside only opened on loopback. The fix is the service's bind address, not the network. The inverse failure exists too: nothing listed at all means the process never bound the port, because it crashed during startup or bound a different port than the config claims. And if the port is taken by something unexpected, that is a different walk, documented in "what's holding this port?". When you need the same answer keyed by process rather than by socket, lsof covers it from the other direction.

Twelve thousand TIME-WAIT sockets after a load test

The load test ends, someone runs ss -s, and the summary line says timewait 12414. Alarm follows. Most of the time the alarm is wrong, and knowing why is the difference between a calm shrug and an afternoon of cargo-cult sysctl tuning.

$ ss -s
Total: 1289
TCP:   13703 (estab 214, closed 13066, orphaned 0, timewait 12414)
$ ss -tn state time-wait | head -3
Recv-Q  Send-Q     Local Address:Port      Peer Address:Port
0       0          10.0.4.12:8080          10.0.9.40:51202
0       0          10.0.4.12:8080          10.0.9.40:51209

TIME-WAIT is not a leak and not an error. When a TCP connection closes, the side that sent the first FIN holds the dead connection's identity for a fixed period — sixty seconds on Linux — so that a delayed packet from the old connection cannot be mistaken for traffic on a new connection that reuses the same address pair, and so the final ACK can be resent if the other side never got it. A load test opens and closes thousands of short connections, each one parks in TIME-WAIT for a minute on whichever side closed first, and the count is simply the closing rate times sixty. The sockets are tiny kernel records, not file descriptors, and they expire on their own. Wait two minutes and run ss -s again; the pile melts.
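The "closing rate times sixty" estimate runs in both directions. A small Python sketch, with hypothetical function names and the sixty-second hold time stated above:

```python
TIME_WAIT_SECONDS = 60  # hold time stated in the text for Linux


def expected_time_wait(closes_per_second, hold_seconds=TIME_WAIT_SECONDS):
    """Steady-state TIME-WAIT count for a given rate of locally closed connections."""
    return closes_per_second * hold_seconds


def implied_close_rate(time_wait_count, hold_seconds=TIME_WAIT_SECONDS):
    """Closing rate that would sustain an observed TIME-WAIT count."""
    return time_wait_count / hold_seconds


print(expected_time_wait(200))
print(round(implied_close_rate(12414), 1))
```

Read backwards, the 12,414 entries from the summary correspond to about 207 connections closed per second on this side, which is an unremarkable rate for a load test.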

When does it actually matter? On the client side of high-churn traffic to a single destination. Every outbound connection needs a local ephemeral port, the pool is around 28,000 ports by default, and a port sitting in TIME-WAIT toward a given destination is not reusable for a new connection to that same destination until it expires. A service that opens a fresh connection per request to one upstream can exhaust the pool and start failing to connect at all — and the durable fix is connection pooling or keep-alive, not a sysctl. (net.ipv4.tcp_tw_reuse exists and helps for outbound connections; the once-popular tcp_tw_recycle was so unsafe behind NAT that the kernel removed it.) On the server side, where the rows above live, twelve thousand TIME-WAITs are bookkeeping, nothing more.
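The client-side budget can be estimated the same way. This Python sketch assumes the common default ip_local_port_range of 32768 to 60999, a single destination address and port, the client closing first, and no tcp_tw_reuse; the function names are hypothetical.

```python
# Content of /proc/sys/net/ipv4/ip_local_port_range on a default config (assumed).
DEFAULT_RANGE = "32768\t60999"


def port_pool_size(port_range_text):
    """Number of ephemeral ports in an inclusive 'low high' range."""
    low, high = (int(p) for p in port_range_text.split())
    return high - low + 1


def max_new_connections_per_second(pool_size, time_wait_seconds=60):
    """Sustained connect rate to one destination before the pool is all TIME-WAIT."""
    return pool_size / time_wait_seconds


pool = port_pool_size(DEFAULT_RANGE)
print(pool, round(max_new_connections_per_second(pool), 1))
```

Under those assumptions a connection-per-request client tops out at a few hundred new connections per second to one upstream, which is why pooling is the durable fix.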

Recv-Q climbing on a LISTEN socket

A service starts timing out for a fraction of clients while the rest sail through. CPU looks fine. The listener row tells the story:

$ sudo ss -tlnp sport = :8080
State    Recv-Q   Send-Q   Local Address:Port   Peer Address:Port   Process
LISTEN   129      128      0.0.0.0:8080         0.0.0.0:*           users:(("app",pid=3110,fd=4))

Remember the LISTEN decoding: Recv-Q is connections the kernel has completed that the application has not yet accepted, Send-Q is the limit. 129 against 128 means the accept queue is full. The kernel is finishing handshakes on the app's behalf, parking the connections, and the app is not calling accept() fast enough to drain them — its accept loop is blocked on something, or its event loop is starved, or it genuinely cannot keep up. Once the queue is full, the kernel starts dropping or refusing new connection attempts, which is why some clients time out while accepted ones are served normally. You can confirm the overflow history with nstat -az TcpExtListenOverflows, a counter that only ever goes up.

This is back-pressure in its rawest form. The queue between the kernel and the application filled, and the excess load fell on whoever was upstream — the clients. Raising the backlog (the app's listen() argument, capped by net.core.somaxconn) buys burst absorption; it does not add capacity, it adds queueing delay before failure. The real fix is whatever stalled the accept path. The same reasoning, one layer up, is why every queue in a system needs a bounded size and a story for what happens when it fills.
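A monitoring check for this pattern only needs the second and third columns of the LISTEN row. A minimal Python sketch with a hypothetical function name; treating "depth at or above the limit" as saturated is a conservative choice of this example.

```python
def accept_queue_status(listen_row):
    """Read accept-queue depth and limit from one LISTEN row of ss -tln output."""
    fields = listen_row.split()
    if fields[0] != "LISTEN":
        raise ValueError("Recv-Q/Send-Q only mean depth/limit on LISTEN rows")
    depth, limit = int(fields[1]), int(fields[2])
    return {"depth": depth, "limit": limit, "saturated": depth >= limit}


incident = 'LISTEN   129      128      0.0.0.0:8080         0.0.0.0:*           users:(("app",pid=3110,fd=4))'
healthy = 'LISTEN   0        4096     [::]:8080            [::]:*              users:(("java",pid=41327,fd=89))'

print(accept_queue_status(incident))
print(accept_queue_status(healthy))
```

Because a single reading is a snapshot, a check like this is most useful when sampled repeatedly and paired with the ListenOverflows counter.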

What a socket is, underneath

Everything ss prints hangs off three ideas, and they are the same three you need for any networking interview question, so they are worth pinning down here.

First, a socket is a file descriptor. When a process calls socket() it gets back a small integer, the same kind of integer open() returns, indexing into the same per-process descriptor table that the lsof page dissects. That is why ss -p can print fd=89, why socket leaks exhaust the same "too many open files" limit as file leaks, and why a forked child can hold its parent's listening socket. The full anatomy — what the kernel object behind the descriptor contains, and how bind, listen, accept, and connect move it through its life — is on the sockets page.

Second, a connection is named by its five-tuple: protocol, local address, local port, peer address, peer port. That tuple is exactly what the Local and Peer columns print, and it is why one server port can carry thousands of simultaneous connections — each one differs in the peer's address or port, so each tuple is unique. It is also why TIME-WAIT pins a tuple rather than a port: the ephemeral port is only unavailable toward the destination that the dead connection pointed at.
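The five-tuple claim can be watched directly with a few loopback sockets. This Python sketch (hypothetical function name, throwaway port chosen by the kernel) opens one listener, connects several clients, and returns the tuple of each accepted connection.

```python
import socket


def demo_five_tuples(n_clients=3):
    """Return (server_port, list of five-tuples) for n connections to one listener."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))        # port 0: let the kernel pick a free port
    server.listen(n_clients)
    port = server.getsockname()[1]
    clients, accepted = [], []
    try:
        for _ in range(n_clients):
            clients.append(socket.create_connection(("127.0.0.1", port)))
            accepted.append(server.accept()[0])
        tuples = [("tcp",) + conn.getsockname() + conn.getpeername()
                  for conn in accepted]
        return port, tuples
    finally:
        for sock in clients + accepted + [server]:
            sock.close()


port, tuples = demo_five_tuples()
for t in tuples:
    print(t)
```

Every printed tuple shares the same local address and port; only the peer port differs, and that difference alone keeps the connections distinct.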

Third, every TCP socket is a state machine, and the State column is its current node. A server socket starts in LISTEN. An arriving SYN spawns an embryonic connection in SYN-RECV; the handshake's final ACK promotes it to ESTABLISHED (and parks it in the accept queue from the previous section). Teardown is where the asymmetry lives: the side that closes first walks FIN-WAIT-1, then FIN-WAIT-2, then TIME-WAIT; the side that closes second sits in CLOSE-WAIT until the application gets around to calling close(), then passes through LAST-ACK. That detail is diagnostic gold: a pile of TIME-WAIT is normal churn, but a pile of CLOSE-WAIT is an application bug. The peer hung up and your code never closed its end, and those sockets are real descriptors that keep leaking until the process hits its open-files limit. The full machine with every edge, and the handshake you can step through packet by packet, are on the TCP page and in the handshake simulator.

The server-side path through the TCP state machine, simplified. Close first and you earn TIME-WAIT; get closed on and you owe a close() to leave CLOSE-WAIT.
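The triage rule for teardown states can be sketched as a tally over the State column. The rows below are hypothetical sample data in ss -tan layout with the header removed, and the function names are made up for this example.

```python
from collections import Counter

# Hypothetical rows in `ss -tan` layout (header removed); not real host data.
ROWS = [
    "ESTAB       0  0  10.0.4.12:8080  10.0.9.40:51202",
    "CLOSE-WAIT  1  0  10.0.4.12:8080  10.0.9.41:40110",
    "CLOSE-WAIT  1  0  10.0.4.12:8080  10.0.9.41:40112",
    "TIME-WAIT   0  0  10.0.4.12:8080  10.0.9.40:51209",
]


def count_states(rows):
    """Tally the State column, which is the first field of each row."""
    return Counter(row.split()[0] for row in rows if row.strip())


def triage(counts):
    """Translate teardown-state counts into the reading given in the text."""
    notes = []
    if counts.get("CLOSE-WAIT", 0):
        notes.append("CLOSE-WAIT: peer closed, local application has not called close()")
    if counts.get("TIME-WAIT", 0):
        notes.append("TIME-WAIT: this side closed first, entries expire on their own")
    return notes


counts = count_states(ROWS)
print(dict(counts))
for note in triage(counts):
    print(note)
```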

Pitfalls

Trusting a blank Process column. -p works by matching socket inodes against /proc/PID/fd, and you can only read other users' descriptor tables as root. Run ss -tlnp unprivileged and root-owned services show up with the Process column simply empty — the socket is listed, the owner is not, and it is easy to misread the silence as "no process attached." If you need the owner and the column is blank, rerun under sudo before drawing any conclusion.
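The inode matching described here is simple enough to sketch in Python, and the sketch shows where the blank column comes from: descriptor tables that cannot be read are skipped silently. This is an illustration of the idea with hypothetical function names, not the code ss itself runs.

```python
import os
import re

# A socket descriptor in /proc/PID/fd is a symlink whose target looks like this.
SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode(link_target):
    """Return the inode number from a 'socket:[N]' link target, else None."""
    match = SOCKET_LINK.match(link_target)
    return int(match.group(1)) if match else None


def socket_owners(proc_root="/proc"):
    """Map socket inode -> [(pid, fd)] for every descriptor table we may read."""
    owners = {}
    for pid in filter(str.isdigit, os.listdir(proc_root)):
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            # Permission denied for another user's process, or it already exited.
            continue
        for fd in fds:
            try:
                inode = socket_inode(os.readlink(os.path.join(fd_dir, fd)))
            except OSError:
                continue
            if inode is not None:
                owners.setdefault(inode, []).append((int(pid), int(fd)))
    return owners
```

Run unprivileged, socket_owners() simply has no entries for sockets held by other users' processes, which is the empty Process column in another form.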

netstat habits transferred literally. The flag letters mostly carry over — netstat -tlnp and ss -tlnp ask the same question — but the output does not. ss leads with State where netstat led with Proto, the queue columns sit in different positions, and the process annotation has a different shape (users:(("java",pid=41327,fd=89)) versus netstat's 41327/java). Any script or eyeball habit that grabs column five will silently grab the wrong field. Re-learn the layout once, on a calm day, rather than mid-incident.
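For scripts, parsing the process annotation by shape is safer than counting columns. A minimal Python sketch with a hypothetical function name; the second sample string, with two processes sharing one socket, is made up for the example.

```python
import re

# One entry inside users:(...), for example ("java",pid=41327,fd=89)
USER_ENTRY = re.compile(r'\("(?P<name>[^"]*)",pid=(?P<pid>\d+),fd=(?P<fd>\d+)\)')


def parse_users(process_column):
    """Extract (name, pid, fd) triples from an ss Process column value."""
    return [(m["name"], int(m["pid"]), int(m["fd"]))
            for m in USER_ENTRY.finditer(process_column)]


print(parse_users('users:(("java",pid=41327,fd=89))'))
print(parse_users('users:(("nginx",pid=1290,fd=8),("nginx",pid=1291,fd=8))'))
print(parse_users(""))  # blank column, for example when run without sudo
```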

Expecting UDP to have states. ss -tl shows TCP only; your DNS resolver, your StatsD client, and WireGuard are all invisible until you add -u. And UDP has no connections, so there is no LISTEN: a bound UDP socket shows as UNCONN, and only a UDP socket that called connect() (a resolver pinning its upstream, say) shows as ESTAB. The states you spent this page learning are TCP's; for UDP the interesting columns are just the queues and the addresses.

The shell eating your filters. The filter language uses comparison operators and parentheses — ss -tn dport = :443, or compound expressions grouped with parens — and the shell has opinions about both. Unquoted parentheses are a syntax error in your shell before ss ever runs; an unquoted ! triggers history expansion in interactive bash. Quote any filter more complex than a single comparison: ss -tn 'dport = :443 or dport = :80'.
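When ss is driven from a script, one way to sidestep shell quoting entirely is to build an argument list and never hand the filter to a shell. A short Python sketch with a hypothetical helper name; shlex.join is used only to show how the same command would have to be quoted at a prompt.

```python
import shlex


def ss_argv(ports):
    """Build an argv list for 'established TCP to any of these destination ports'."""
    expr = " or ".join(f"dport = :{p}" for p in ports)
    # The whole filter travels as one argument, exactly like the quoted form.
    return ["ss", "-tn", expr]


argv = ss_argv([443, 80])
print(argv)
print(shlex.join(argv))
```

Passing a list like this to subprocess.run(argv) without shell=True means no shell ever sees the operators or parentheses.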

Forgetting it is a snapshot. Like lsof, ss shows the instant it ran. A connection that opens and closes in fifty milliseconds will rarely be caught, and a queue depth is one sample, not a trend. For trends, sample in a loop (watch -n1 ss -s is often enough); for the packets themselves, that is tcpdump's territory.
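A sampling loop is a few lines in any language. The Python sketch below reads the TCP counters from text in the layout of /proc/net/sockstat instead of running ss -s; that substitution, the function names, and the sample text are all assumptions of this example.

```python
import time

# Illustrative text in /proc/net/sockstat layout (not captured from a real host).
SAMPLE_SOCKSTAT = "sockets: used 1289\nTCP: inuse 214 orphan 0 tw 12414 alloc 230 mem 64\n"


def parse_sockstat_tcp(text):
    """Return the key/value pairs on the 'TCP:' line as a dict of ints."""
    for line in text.splitlines():
        if line.startswith("TCP:"):
            fields = line.split()[1:]
            return {key: int(value) for key, value in zip(fields[::2], fields[1::2])}
    return {}


def sample(read_fn, count, interval_seconds):
    """Call read_fn count times, sleeping between calls, and return the series."""
    series = []
    for i in range(count):
        series.append(read_fn())
        if i < count - 1:
            time.sleep(interval_seconds)
    return series


print(parse_sockstat_tcp(SAMPLE_SOCKSTAT)["tw"])
```

On a live box the reader could be a small function that opens /proc/net/sockstat, parses it, and returns the "tw" value; sampling it once a second for a few minutes shows whether a TIME-WAIT pile is growing or melting.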

A drill you can run right now

Everything below is safe on any Linux machine: it starts one throwaway web server in your own session, makes a couple of requests to it, and watches the socket move through the state machine you just read about. Ten minutes, and LISTEN, ESTABLISHED, the queue columns, and TIME-WAIT stop being vocabulary and become things you have watched happen.

Step 1 — the baseline. Run ss -tlnp and read every row against the column decoder above: what is bound to loopback versus everything, what the backlog capacities are, which processes you can see without sudo. Then run ss -s and note the totals, so you have a feel for what this machine looks like at rest.

Step 2 — make a listener and find it. Python's built-in file server is the perfect lab animal: one command, no install, listens on all interfaces.

$ cd /tmp && mkdir -p ssdrill && cd ssdrill
$ head -c 100M /dev/urandom > big.bin
$ python3 -m http.server 8000 &
[1] 7012
$ ss -tlnp sport = :8000
State    Recv-Q   Send-Q   Local Address:Port   Peer Address:Port   Process
LISTEN   0        5        0.0.0.0:8000         0.0.0.0:*           users:(("python3",pid=7012,fd=3))

There is your listener: bound to 0.0.0.0, owned by your own python3 so -p works without sudo, and with a Send-Q of 5. That is the default request queue size of Python's socketserver, which http.server passes to listen() as the backlog. It is tiny compared to the 511s and 4096s from real servers, and now you know exactly what that number means on a LISTEN row.
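The 5 can be traced to its source without reading any documentation. A two-line Python check, assuming a standard CPython where http.server inherits the value from socketserver unchanged:

```python
import http.server
import socketserver

# socketserver passes request_queue_size to listen() as the backlog.
print(socketserver.TCPServer.request_queue_size)
print(http.server.ThreadingHTTPServer.request_queue_size)
```

A server class that sets a larger request_queue_size would show a correspondingly larger Send-Q on its LISTEN row, still capped by net.core.somaxconn.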

Step 3 — catch ESTABLISHED with its buffers full. A fast request finishes before you can look, so make a deliberately slow one: download the 100 MB file with curl throttled to a crawl, then inspect the connection while it is alive.

$ curl -s --limit-rate 500k -o /dev/null http://127.0.0.1:8000/big.bin &
[2] 7044
$ ss -tnpi sport = :8000 or dport = :8000
State  Recv-Q     Send-Q     Local Address:Port    Peer Address:Port     Process
ESTAB  0          2602340    127.0.0.1:8000        127.0.0.1:52114     users:(("python3",pid=7012,fd=4))
	 cubic wscale:7,7 rto:204 rtt:0.12/0.06 cwnd:10 send 9.6Gbps lastsnd:2 unacked:18
ESTAB  3145728    0          127.0.0.1:52114       127.0.0.1:8000      users:(("curl",pid=7044,fd=5))

Read the pair as the buffers diagram come to life. The server side has a swollen Send-Q: python3 has written megabytes that the other end has not acknowledged, because the other end will not take them any faster. The curl side has a swollen Recv-Q: the kernel has received and ACKed that data, but curl, rate-limited on purpose, is reading it slowly. Same connection, two sockets, and the queue columns point at exactly where the bytes are stuck. This is what a slow consumer looks like in production, manufactured on loopback. While it runs, glance at the -i line too — a sub-millisecond rtt, because the "network" here is a memory copy.

Step 4 — watch the corpse. Let the download finish (or kill %2), then look for what the connection left behind:

$ ss -tn state time-wait
Recv-Q  Send-Q     Local Address:Port     Peer Address:Port
0       0          127.0.0.1:8000         127.0.0.1:52114
$ sleep 60; ss -tn state time-wait | grep 8000
$ kill %1

One side of the finished connection is parked in TIME-WAIT: whichever side sent the first FIN. For this server that is usually the python3 side, because http.server speaks HTTP/1.0 by default and closes the connection once the response is sent, so the row shows local port 8000. It is consuming no descriptor and no attention; it is the kernel guarding the five-tuple against ghosts of the old connection. Sixty seconds later it is gone, no action taken, which is precisely the argument for not panicking at the load-test pile. Kill the server, run ss -tlnp sport = :8000 one last time, and confirm the listener vanished with it.

If you remember one line. sudo ss -tlnp for "what is listening and as whom," ss -s for the shape of the box, and the rule that on LISTEN rows the queue columns mean accept-queue depth and limit — Recv-Q pressing against Send-Q there is an incident, not trivia.

