lsof

Something is squatting on port 8080. The disk is full but du can only account for half of it. A service keeps dying with "too many open files." All three are the same question wearing different clothes: who has this thing open right now? That is the one question lsof answers, and because Linux treats sockets, pipes, and devices as files too, it answers it for all of them. This page covers the five flags worth memorising, decodes every column of the output, walks three production incidents, and ends with a drill you can run on any machine without breaking anything.

The question it answers

The name is literal: lsof lists open files. That sounds narrow until you remember what "file" means on Linux. A TCP socket is a file. A unix domain socket is a file. A pipe between two processes is a file. The terminal you are typing into, the shared libraries a process mapped into memory, the directory it is sitting in, the device node for your disk — files, all of them. Every one of these is reached through a file descriptor, and lsof is the tool that walks every process on the machine and reports every descriptor each one holds.

That single capability turns out to be the answer to a whole family of operational questions. Who is listening on port 8080? That is a process holding a socket open. Why can I not unmount this volume? Some process has a file or a working directory on it. Where did 80 GB of disk go that du cannot find? A process is holding a deleted file open, and the kernel will not free the blocks until it lets go. Why does this service crash with "too many open files"? Its descriptor count crept up to the limit, one leaked socket at a time. Different symptoms, same diagnostic: list the open files and look.

It helps to know what lsof is not. It is not a network sniffer; it shows you which process owns a connection, not what travels over it. It is not a snapshot of history; it shows the state of the machine at the instant it ran, and a short-lived process can open and close a file between two invocations without ever appearing. And it is not free: on a busy box it does a surprising amount of work, which matters later. But when the question is "who has this open, right now," nothing else gives you the same direct answer with the process name and PID attached. The companion tools each cover a slice — ss is faster for sockets, fuser is terser for a single file — but lsof is the one that covers everything with one mental model.

The five flags that matter

The man page for lsof is enormous, and nearly all of it is ignorable. Five flags cover the daily work, and one of them you should type by reflex every single time.

Flag	What it selects	When you reach for it
`-i :8080`	Network files, filtered by port, host, or protocol	Port conflicts, "what is listening," tracing a connection to its process
`-p 41327`	Everything one process has open	Descriptor leaks, auditing what a service touches
`-u deploy`	Everything one user's processes hold	Shared boxes, runaway cron jobs, "what is this account doing"
`+L1`	Files with a link count below one: deleted, but still open	Disk space that `df` sees and `du` cannot
`-nP`	Nothing — it skips DNS (`-n`) and port-name (`-P`) lookups	Always. Every invocation. See below.

The -i selector takes a small grammar: -i :8080 matches the port on any address, -i TCP:8080 narrows to one protocol, -i @10.0.4.12 matches a host, and -i TCP:8080 -sTCP:LISTEN narrows to sockets actually in the listening state, which is usually what you mean when you ask who owns a port. One subtlety worth knowing before it bites you: when you stack several selectors, lsof ORs them together by default. lsof -u deploy -i :8080 means "deploy's files, plus anything on port 8080," not the intersection. Add -a to switch the logic to AND: lsof -a -u deploy -i :8080 is "deploy's files that are on port 8080." Almost everyone learns this by staring at output that is mysteriously too long.

Why -nP is non-negotiable. Without -n, lsof does a reverse DNS lookup for every remote address it prints, and without -P it resolves every port number to a service name. On a machine with hundreds of connections and a slow or absent DNS resolver, those lookups serialise into what feels like a hang — the tool sits silent for thirty seconds while you wonder if the box is dying. It is not dying. It is resolving hostnames you did not ask for. lsof -nP prints raw numbers immediately, and raw numbers are what you want during an incident anyway.

Reading the output

Here is a realistic answer to "who is on port 8080," taken from the kind of box where a Java service sits behind an nginx proxy. Run it with sudo, because without root you only see your own processes — more on that in the pitfalls.

$ sudo lsof -nP -i :8080
COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    41327 deploy  89u  IPv6 812644      0t0  TCP  *:8080 (LISTEN)
java    41327 deploy  92u  IPv6 815091      0t0  TCP  10.0.4.12:8080->10.0.9.55:49210 (ESTABLISHED)
nginx    1290   root  12u  IPv4 433190      0t0  TCP  127.0.0.1:46214->127.0.0.1:8080 (ESTABLISHED)

One row, every column. The FD column is the one that repays study; the rest you can read at a glance once you have seen them labelled.

Most of the columns explain themselves after a single pass. COMMAND is the process name, truncated to nine characters by default (widen it with +c 0 if the truncation hides what you need). PID and USER are the process id and the account the process runs as: the account that owns the process, which is not necessarily the account that owns the file. TYPE tells you what kind of file this is: REG for a regular file, DIR for a directory, CHR for a character device, FIFO for a pipe, unix for a unix domain socket, IPv4 and IPv6 for network sockets. DEVICE is the major,minor device number pair for filesystem objects (259,1 in the disk examples later on); for the network sockets shown here it is the socket's inode number, which is why 815091 above reappears later as socket:[815091] under /proc. SIZE/OFF is the file's size or the descriptor's current offset; sockets show 0t0, a zero offset in decimal (the 0t prefix means decimal), because the concept does not apply. NODE is the inode number for filesystem objects and the protocol name for sockets. NAME is the payoff: the path for files, and for network sockets the full local->remote address pair with the connection state in parentheses.

The FD column is the one nobody teaches, and it is where the real information lives. It is not always a number. lsof also uses it to report things a process has open that are not descriptors at all, and when it is a number, the letter glued to the end tells you the access mode.

FD entry	What it means
`cwd`	The process's current working directory. This alone can pin a filesystem and block an unmount.
`rtd`	The process's root directory (interesting for chrooted or containerised processes)
`txt`	Program text: the executable file itself
`mem`	A memory-mapped file, most often a shared library
`89r`	Descriptor number 89, open for reading only
`89w`	Descriptor 89, open for writing only — log files usually look like this
`89u`	Descriptor 89, open for both read and write — sockets usually look like this

Two practical reads fall out of this decoder. First, when you are counting descriptors for a leak investigation, only the numeric rows are actual descriptors; cwd, txt, and the mem rows are not, so piping lsof -p into wc -l overcounts. Second, the mode letter is a clue about intent: a process holding a log file with 4w is appending to it, and a process holding a socket with 92u is talking on it. Occasionally you will also see a lock indicator after the mode letter, such as 4wW for a held write lock — useful when two processes are fighting over a lock file and you want to know who won.
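The first of those reads can be sketched as a small filter. This is a minimal sketch, assuming the default column layout shown above, with FD as the fourth whitespace-separated column and no extra TID column; the helper name count_fds is made up for this page and is not part of lsof.

```shell
# count_fds: hypothetical helper, not an lsof feature.
# Reads default lsof output on stdin, skips the header line, and counts only
# rows whose FD column starts with a digit, i.e. real descriptors.
# Rows such as cwd, rtd, txt, and mem are ignored.
count_fds() {
    awk 'NR > 1 && $4 ~ /^[0-9]/ { n++ } END { print n + 0 }'
}

# Usage, with the example PID from this page:
#   sudo lsof -nP -p 41327 | count_fds
```

Matching on a leading digit rather than on the mode letter keeps rows where lsof could not determine the access mode.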

Three production scenarios

"Address already in use" on deploy

The deploy fails, the service will not start, and the log says bind: address already in use. Something already owns the port. Maybe the old instance never died, maybe a debug process from last week is still attached, maybe an orphaned child survived a restart because children inherit their parent's descriptors across fork(). You do not need to guess:

$ sudo lsof -nP -iTCP:8080 -sTCP:LISTEN
COMMAND   PID   USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
java    38104 deploy   89u  IPv6 798112      0t0  TCP  *:8080 (LISTEN)

One line, and you have the culprit's name, PID, and owner. From there it is judgement, not tooling: is PID 38104 the previous release that the supervisor failed to reap, or something that legitimately holds the port? Check with ps -fp 38104 before you reach for kill. The narrower -sTCP:LISTEN filter matters here because without it you also get every established connection touching port 8080, and during an incident the extra rows are noise. The full decision tree for this incident, including the cases where nothing appears to be listening and yet the bind still fails, lives in what's holding this port? — and if you only need the socket-side view on a box where every second counts, ss gets the same answer faster.

Disk full, but du disagrees

df says the volume is at 96%. You run du on every directory and the sum comes nowhere close. This is the classic deleted-but-open file: a process opened a log, something (often logrotate, sometimes a tidy-minded human) deleted the file, and the process kept writing to it. Deleting a file removes its name from the directory. The inode and its data blocks stay allocated until the last open descriptor closes. du walks names, so it cannot see the space; df asks the filesystem for allocated blocks, so it can. The gap between them is your missing disk.

$ sudo lsof -nP +L1
COMMAND   PID   USER   FD   TYPE DEVICE   SIZE/OFF NLINK   NODE NAME
java    41327 deploy   4w   REG  259,1 84817930240     0 524291 /var/log/app/server.log (deleted)

+L1 means "files with a link count less than one": zero names left, but still open. The NLINK column appears just for this query, the size column shows the 84 GB you were hunting, and NAME ends with (deleted). The fix is rarely "kill the process." You can truncate the file through the descriptor without restarting anything: : > /proc/41327/fd/4, run from a shell that is allowed to write to the file (the owning user or root; a plain sudo in front of the command does not cover the redirection), empties it in place and the space comes back immediately. Then fix the rotation config so it signals the process instead of deleting files out from under it. The wider investigation, including the other ways a disk fills invisibly, is the subject of why is the disk full?
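When several deleted files are in play, it helps to total them before deciding what to truncate. The following is a rough sketch, assuming the +L1 column layout shown above (DEVICE, SIZE/OFF, and NODE in columns 6, 7, and 9); deleted_bytes is a hypothetical helper name, and rows where lsof reports an offset instead of a size will count as zero.

```shell
# deleted_bytes: hypothetical helper, not an lsof feature.
# Reads `lsof +L1` output on stdin and sums the SIZE/OFF column, counting each
# DEVICE+NODE pair once so a file held by several descriptors or processes
# is not counted twice. %.0f avoids integer overflow in older awk builds.
deleted_bytes() {
    awk 'NR > 1 && !seen[$6 "," $9]++ { sum += $7 }
         END { printf "%.0f\n", sum }'
}

# Usage:
#   sudo lsof -nP +L1 | deleted_bytes
```

If the total is close to the gap between df and du, the deleted-but-open explanation accounts for the missing space; if it is far smaller, something else is also eating the disk.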

unlink removes the name; the inode and its blocks survive as long as any descriptor holds them. The gap between df and du lives in panel 2.

The slow descriptor leak

A service that has run fine for weeks starts throwing EMFILE: too many open files. Each process has a descriptor limit (ulimit -n, commonly 1024 or 65536), and something in the code path opens sockets or files without closing them — an HTTP client that never releases connections on the error path is the usual suspect. The diagnostic is to watch the count grow:

$ sudo lsof -nP -p 41327 | wc -l
2741
$ sleep 300; sudo lsof -nP -p 41327 | wc -l
3088

A count that climbs steadily under constant load is a leak. Remember the overcount caveat from the FD decoder: lsof -p includes a header line plus mem and cwd rows that are not descriptors, so for a precise number ask the kernel directly with ls /proc/41327/fd | wc -l (as root or as the user that owns the process). Then look at what is leaking rather than how much: ls -l /proc/41327/fd prints every descriptor as a symlink to its target, and the pattern jumps out. Hundreds of socket:[inode] links, each with its own inode number, means leaked connections; hundreds of links to the same file path means a missing close() in a retry loop. Group the lsof output by TYPE and NAME and the offending code path usually names itself. Raising ulimit -n buys time; it does not fix a leak, it reschedules the outage.
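The grouping step can also be done straight from /proc. This is a minimal sketch under a few assumptions: a Linux /proc, readlink available, and permission to read the target's fd directory. The function name fd_kinds is hypothetical.

```shell
# fd_kinds: hypothetical helper. Groups the descriptors of one process by what
# they point at. Sockets, pipes, and anonymous inodes are collapsed to their
# kind because each carries a unique inode number; file paths are kept whole
# so repeated opens of the same path stand out.
fd_kinds() {
    for fd in /proc/"$1"/fd/*; do
        # A descriptor can close between the glob and the readlink; skip it.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            socket:*)     echo "socket" ;;
            pipe:*)       echo "pipe" ;;
            anon_inode:*) echo "anon_inode" ;;
            *)            echo "$target" ;;
        esac
    done | sort | uniq -c | sort -rn
}

# Usage: fd_kinds $$        (your own shell)
#        fd_kinds 41327     (from a root shell, or as the owning user)
```

Run it twice a few minutes apart: the line whose count grows between runs is the one that is leaking.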

What lsof actually reads

There is no magic in lsof, and knowing where its data comes from makes the output easier to trust. Every process on Linux owns a file descriptor table: a per-process array, indexed by small integers, where each slot points at an open file description in the kernel, which in turn points at an inode, a socket, a pipe, or a device. Descriptor 0 is standard input, 1 is standard output, 2 is standard error, and everything the process opens after that takes the lowest free slot. When your code calls open() or socket(), the integer it gets back is nothing more than an index into this table.

The per-process descriptor table. Small integers on the left, kernel objects on the right. lsof's job is rendering this mapping for every process at once.

The kernel exposes that table through the /proc filesystem. /proc/41327/fd/ is a directory containing one symlink per open descriptor, each pointing at its target: a path for regular files, socket:[815091] for sockets, pipe:[812001] for pipes. lsof is, to a first approximation, a program that walks /proc/*/fd for every process, reads /proc/PID/maps for the memory-mapped files, joins the socket inode numbers against the tables in /proc/net/tcp and friends to recover addresses and states, and formats the result. You can verify this yourself: ls -l /proc/$$/fd shows your own shell's table, no tooling required, and during a bad incident when lsof is not installed, raw /proc spelunking gets you most of the same answers.
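The join described above can be approximated by hand for the "who is listening" case. What follows is a sketch, not a description of lsof's actual code. It assumes the usual /proc/net/tcp layout (local address in field 2 with the port in hex, state in field 4 where 0A means LISTEN, inode in field 10), and both function names are hypothetical.

```shell
# listen_inodes PORT: reads /proc/net/tcp-format text on stdin and prints the
# socket inode of every entry in LISTEN state (st == 0A) on that port.
listen_inodes() {
    port_hex=$(printf '%04X' "$1")
    awk -v p=":$port_hex" \
        '$4 == "0A" && substr($2, length($2) - 4) == p { print $10 }'
}

# port_owner PORT: joins those inodes against every /proc/PID/fd this user
# can read, printing PID, command name, and descriptor number.
port_owner() {
    for inode in $(cat /proc/net/tcp /proc/net/tcp6 2>/dev/null | listen_inodes "$1"); do
        for fd in /proc/[0-9]*/fd/*; do
            [ "$(readlink "$fd" 2>/dev/null)" = "socket:[$inode]" ] || continue
            pid=${fd#/proc/}
            pid=${pid%%/*}
            echo "$pid $(cat "/proc/$pid/comm" 2>/dev/null) fd=${fd##*/}"
        done
    done
}

# Usage, from a root shell so other users' descriptors are visible:
#   port_owner 8080
```

Splitting the parsing from the /proc walk keeps the first half easy to check against a saved copy of /proc/net/tcp. The tables are per network namespace, so a listener inside a container may not appear when this runs on the host.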

This is also where descriptor inheritance comes from. fork() copies the parent's descriptor table into the child, which is why a child process can hold a listening socket its parent opened, and why "I killed the server but the port is still taken" usually means a forked worker survived. The deeper anatomy of processes and their tables is covered in processes, the /proc filesystem gets its own page at /proc, the inode-and-link-count machinery behind the deleted-file trick lives in file systems, and what actually happens when a process reads or writes through one of these descriptors is the subject of I/O.

Pitfalls

Forgetting -nP. Covered above, but it earns a second mention because it is the most common way the tool wastes your time. If lsof appears to hang, it is almost certainly resolving hostnames. Ctrl-C, add -nP, run it again.

Running it without root and trusting the silence. An unprivileged lsof can only inspect your own processes, because reading another user's /proc/PID/fd requires permission you do not have. The dangerous part is that the output is not an error; it is a shorter list. You ask who is on port 8080, get nothing back, and conclude the port is free while a root-owned process sits on it invisibly. If the question involves any process you do not own — and during an incident it nearly always does — run it under sudo, and treat an empty answer from an unprivileged run as "no answer," not "no."
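To see how much an unprivileged run is missing on a given box, one rough check is to count the processes whose descriptor directory the current user cannot read. This is a sketch assuming a Linux /proc; the function name is hypothetical, and a process that exits mid-scan will be counted as unreadable.

```shell
# unreadable_fd_tables: hypothetical helper. Reports how many processes have a
# /proc/PID/fd directory the current user cannot read. Each of those is a
# process whose open files an unprivileged lsof cannot list.
unreadable_fd_tables() {
    total=0
    unreadable=0
    for d in /proc/[0-9]*; do
        total=$((total + 1))
        [ -r "$d/fd" ] || unreadable=$((unreadable + 1))
    done
    echo "$unreadable of $total processes have an fd table this user cannot read"
}

# Usage:
#   unreadable_fd_tables
```

A non-zero first number is the size of the blind spot: any of those processes could be holding the port or file in question.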

Expecting it to be fast on a big box. A bare lsof with no selectors enumerates every descriptor of every process: on a host running thousands of processes with tens of thousands of descriptors each, that is real work and real time. Worse, stat-ing files on a hung NFS mount can block the whole run. Narrow the query with selectors (-p, -i, -u) so it reads only the slice of /proc you care about, and reach for -b to avoid blocking kernel calls if flaky network mounts are part of your life.

Forgetting fuser exists. For two narrow questions, fuser is quicker to type and quicker to run: fuser -v /var/log/app/server.log lists the PIDs holding one specific file, and fuser -vm /data lists everything keeping a mount point busy, which is exactly what you want when umount says the target is in use. It prints far less detail than lsof, and that is the point. Know both; use the small one when the question is small.

Treating the output as a recording. lsof is a snapshot. A process that opens, reads, and closes a file in fifty milliseconds will almost never be caught by it. If you need to know who touches a file over time rather than who holds it open right now, that is a tracing problem, not a listing problem.

A drill you can run right now

Everything below is safe on any Linux machine, including a shared one: it inspects state and creates one throwaway file in /tmp. Ten minutes, and the three big ideas — the port view, the descriptor table, and the deleted-but-open inode — stop being trivia and become things you have seen.

Step 1 — the network view. List every network file your account can see, with lookups off: lsof -nP -i. Pick one row and read it column by column against the decoder above: who owns it, which descriptor, what state. If you have sudo, run it again with sudo and notice how much longer the list gets — that difference is the unprivileged-silence pitfall made visible.

Step 2 — your own shell's table. Run lsof -p $$ (the shell expands $$ to its own PID). Find cwd (the directory you are sitting in), txt (the shell binary itself), the mem rows (libc and friends), and descriptors 0, 1, and 2 all pointing at your terminal device. Then cross-check against the kernel directly with ls -l /proc/$$/fd and confirm the numeric rows match.

Step 3 — make a ghost file and catch it. Create a file, hold it open with tail -f, delete it, and watch it live on:

$ cd /tmp && echo "hold me" > demo.txt
$ tail -f demo.txt &
[1] 7012
hold me
$ rm demo.txt
$ lsof -nP +L1 | grep demo
tail     7012 nilesh    3r   REG  259,1        8     0  524300 /tmp/demo.txt (deleted)
$ cat /proc/7012/fd/3
hold me
$ kill %1

Walk through what just happened. rm removed the name, so the file vanished from ls and from anything du would count. But tail still holds descriptor 3 on the inode, so lsof +L1 finds it, NLINK reads zero, and NAME says (deleted). Better still, cat /proc/7012/fd/3 reads the file's contents back after deletion — the same trick that lets you recover a log someone deleted from under a running service, and the same mechanism that hides 84 GB on a production volume. When you kill %1, the last descriptor closes and the kernel finally frees the inode. That is the entire deleted-but-open story, performed on a file eight bytes long instead of a pager at 3am.

If you remember three commands. sudo lsof -nP -i :PORT for "who owns this port," sudo lsof -nP +L1 for "where did the disk go," and ls -l /proc/PID/fd when you want the kernel's answer with no tool in between.
