ip
You are on a box you have never seen before and a service on it cannot reach something it
should. Before any packet capture, before any firewall theory, two questions come first:
what addresses does this machine actually have, and which way will its packets leave?
The ip command answers both, and it replaced four older tools to do it. This page
covers the five invocations worth knowing cold, reads the output line by line, walks three
production incidents, and ends with a drill that touches nothing.
The question it answers
Every network problem on a Linux box eventually reduces to a chain: does this machine have an
address, does it have a route to the destination, can it resolve the next hop's hardware
address, and is the interface actually up and passing frames. The ip command is
the front door to all four. It reads and edits the kernel's view of network identity: the
interfaces, the addresses bound to them, the routing table that decides where each packet
goes, and the neighbour table that maps next-hop IPs to MAC addresses.
Historically that view was scattered across four tools. ifconfig showed
interfaces and addresses, route (and netstat -r) showed the routing
table, and arp showed the neighbour cache. All of them came from the
net-tools package, all of them speak an old kernel interface, and all of them
were replaced by the iproute2 suite, of which ip is the main
binary. This is not a cosmetic renaming. The old tools predate features the kernel has had
for decades: multiple addresses on one interface shown properly, multiple routing tables,
address scopes, modern interface types. ifconfig on a current machine can show
you an incomplete picture and look perfectly confident doing it, which is worse than showing
nothing. The pitfalls section has the details; the short version is that on any box you care
about, ip is the tool and the others are history.
What ip does not do is also worth a sentence. It does not show you sockets or
which process owns a connection; that is ss. It
does not capture traffic; that is tcpdump. It tells you what the box would do with a
packet, not what it did do with one. In the standard incident sequence it comes
first: establish identity and routing with ip, then move up to sockets and
captures once you know the ground floor is sane. The wider decision tree for "is it the
network at all" lives in
is it the network?
The five invocations that matter
Like most of iproute2, ip has an enormous surface. It can build tunnels, manage
policy rules, and configure things you will never touch outside a network team. For
diagnosis, five invocations cover the daily work. All of them are read-only as written here,
so none of them needs root and none of them can break anything.
| Invocation | What it shows | When you reach for it |
|---|---|---|
| ip addr | Every interface with its addresses, CIDR masks, scopes, and state | First. Does this box have the address you think it has, on the subnet you think it is on? |
| ip route | The routing table: prefixes, next hops, interfaces, metrics | Second. Where do packets leave, and is there a default route at all? |
| ip route get 8.8.8.8 | The exact route, interface, and source address the kernel picks for one destination | The killer move. Stop inferring from the table; ask the kernel directly. |
| ip link / ip -s link | Layer-2 state: up or down, MTU, MAC, and with -s the packet and error counters | When addresses and routes look right but traffic still dies, or large payloads fail. |
| ip neigh | The neighbour table (the ARP cache): next-hop IP to MAC mappings and their state | When the next hop itself is in doubt: FAILED entries mean nobody answered for that address. |
One more belongs on the list as a habit rather than a diagnostic: ip -br addr
(and its sibling ip -br link). The -br flag means brief, and it
collapses the full output into one line per interface: name, state, addresses. It is the
view you want ninety percent of the time, and it is worth an alias.
    $ ip -br addr
    lo               UNKNOWN        127.0.0.1/8 ::1/128
    eth0             UP             10.0.4.12/24 fe80::858:aff:fe00:40c/64
    eth1             UP             192.168.50.7/24
    docker0          DOWN           172.17.0.1/16
Four lines, and you already know a lot: this box has a loopback, a primary interface on the
10.0.4.0/24 subnet, a second interface on a 192.168.50.0/24 network, and a Docker bridge
that is down. Note the state column: UNKNOWN on loopback is normal (loopback
has no carrier concept), UP means the link is administratively up and has
carrier, and DOWN on docker0 just means no container is attached. If a
physical interface you depend on says DOWN, you can stop reading routing tables
and go look at the cable, the switch port, or the hypervisor.
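The state-column check is easy to automate. The following is a minimal sketch, not part of iproute2: `not_up` is a hypothetical helper name, and it assumes the three-column layout of the brief view shown above (name, state, addresses).

```shell
#!/bin/sh
# not_up: read `ip -br addr` output on stdin and print every interface whose
# state column is neither UP nor UNKNOWN (UNKNOWN is normal for loopback).
not_up() {
  awk '$2 != "UP" && $2 != "UNKNOWN" { print $1, $2 }'
}

# Sample copied from the brief output above, so this runs without touching
# the network stack.
not_up <<'EOF'
lo               UNKNOWN        127.0.0.1/8 ::1/128
eth0             UP             10.0.4.12/24 fe80::858:aff:fe00:40c/64
eth1             UP             192.168.50.7/24
docker0          DOWN           172.17.0.1/16
EOF
```

On a real box the same filter would be fed live data with `ip -br addr | not_up`. An empty result means every interface is up; a physical NIC in the list is the cue to stop reading routing tables.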
alias ipb='ip -br -c addr' gives you the
brief view with colour. On a strange box during an incident, ip -br addr then
ip route is a ten-second ritual that answers half the questions you came with.
Reading the output
ip addr, line by line
The full ip addr output looks dense the first time, but every field is there for
a reason and most of them matter during a real incident. Here is a typical two-interface box,
trimmed to loopback and the primary NIC.
    $ ip addr
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
        link/ether 0a:58:0a:00:04:0c brd ff:ff:ff:ff:ff:ff
        inet 10.0.4.12/24 brd 10.0.4.255 scope global dynamic eth0
           valid_lft 2856sec preferred_lft 2856sec
        inet6 fe80::858:aff:fe00:40c/64 scope link
           valid_lft forever preferred_lft forever
Take eth0 from the top. The angle-bracket flags are the link's properties:
UP means an administrator (or the boot config) enabled the interface, and
LOWER_UP means the physical layer below it has carrier, a live cable or a happy
virtual NIC. You want both. An interface that is UP without
LOWER_UP is configured but disconnected, like a lamp that is switched on with no
bulb. The mtu 1500 is the largest payload this link will carry in one frame,
which becomes the whole story in the third scenario below. state UP repeats the
operational verdict, and qlen 1000 is the transmit queue length, the buffer of
packets waiting for the hardware. You rarely act on qlen, but knowing it is a queue explains
why an overloaded link adds latency before it drops anything.
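The "UP without LOWER_UP" condition can be picked out of the flag list mechanically. This is a sketch under stated assumptions: `no_carrier` is a hypothetical helper, and the eth1 line in the sample is invented for illustration (the eth0 and lo lines follow the output above).

```shell
#!/bin/sh
# no_carrier: read `ip link` or `ip addr` output on stdin and print each
# interface whose flag list contains UP but not LOWER_UP.
no_carrier() {
  awk -F'[<>]' '/^[0-9]+: / {
    flags = "," $2 ","                 # wrap in commas so ",UP," cannot match LOWER_UP
    if (index(flags, ",UP,") && !index(flags, ",LOWER_UP,")) {
      split($1, head, ": ")            # "2: eth0: " -> head[2] = "eth0"
      print head[2]
    }
  }'
}

# eth1 here is a hypothetical unplugged NIC added for the example.
no_carrier <<'EOF'
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP
3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN
EOF
```

Only header lines (those starting with an index and a colon) are inspected, so the same function accepts the longer `ip addr` output unchanged.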
The inet line is where identity lives. 10.0.4.12/24 is the address
and the subnet mask in one token: this box is 10.0.4.12, and it considers everything
in 10.0.4.0 through 10.0.4.255 to be directly on its own network, reachable without a
router. Read the suffix every time. A box configured as /24 when the network is
really a /25 will try to talk directly to neighbours that are actually behind a
router, and the failure looks like random unreachability rather than a typo.
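The arithmetic behind the suffix is small enough to check by machine. The sketch below is an illustration only (IPv4, no input validation); `cidr_range` is a hypothetical helper that prints the first address, the last address, and the size of the block an address/prefix pair belongs to.

```shell
#!/bin/sh
# cidr_range ADDR/LEN: print "first last size" for the IPv4 block that
# contains ADDR under a LEN-bit prefix.
cidr_range() {
  printf '%s\n' "$1" | awk -F'[./]' '
    function dotted(n) {
      return sprintf("%d.%d.%d.%d", int(n / 16777216), int(n / 65536) % 256, int(n / 256) % 256, n % 256)
    }
    {
      ip   = (($1 * 256 + $2) * 256 + $3) * 256 + $4   # address as one integer
      size = 2 ^ (32 - $5)                             # addresses in the block
      net  = int(ip / size) * size                     # clear the host bits
      print dotted(net), dotted(net + size - 1), size
    }'
}

cidr_range 10.0.4.12/24
cidr_range 10.0.4.12/25
```

Under /24 the box treats 10.0.4.0 through 10.0.4.255 as local; under /25 the local block ends at 10.0.4.127, so a neighbour at 10.0.4.200 is behind the router. That difference is the whole /24-versus-/25 failure described above.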
scope global means the address is valid for talking to the wider world.
Loopback's address has scope host: valid only inside this machine. The IPv6
fe80:: address has scope link: valid only on this physical segment,
never routed. The word dynamic tells you the address was assigned with a finite lifetime,
which on an IPv4 address almost always means DHCP, and the valid_lft 2856sec below it is
the lease countdown in seconds. A static address omits the word and shows valid_lft forever.
ip route, and the one rule that explains it
    $ ip route
    default via 10.0.4.1 dev eth0 proto dhcp metric 100
    10.0.4.0/24 dev eth0 proto kernel scope link src 10.0.4.12 metric 100
    10.8.0.0/16 via 10.0.4.200 dev eth0 proto static
    192.168.50.0/24 dev eth1 proto kernel scope link src 192.168.50.7
Each line is a rule: packets for this prefix go out this interface, either
directly or via a next hop. The default line is the catch-all, equivalent to
0.0.0.0/0: anything not matched by a more specific line goes to the router at 10.0.4.1. The
10.0.4.0/24 ... scope link line says the local subnet is reached directly, no
router involved, and src 10.0.4.12 records which source address the kernel will
stamp on packets using this route. The 10.8.0.0/16 line is a static route someone added,
pointing a private range at a different gateway on the same segment. The proto
field is provenance, not behaviour: kernel means the route appeared
automatically when the address was configured, dhcp and static
mean what they say. metric breaks ties when two routes cover the same prefix;
lower wins.
The rule that makes the whole table make sense is longest prefix match: for each packet, the kernel picks the matching route with the most specific prefix, and specificity beats everything else. A /24 beats a /16 beats the default, no matter what order the lines appear in. That single sentence is most of routing.
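Longest prefix match can be sketched in a few lines. This is an illustration of the rule, not of how the kernel implements it: it handles IPv4 only, ignores metrics and policy rules, and uses made-up labels in place of real route attributes. `lpm` is a hypothetical helper name, and the default route is written as 0.0.0.0/0.

```shell
#!/bin/sh
# lpm DEST: read "PREFIX/LEN LABEL" lines on stdin and print the label of the
# matching route with the longest prefix, regardless of line order.
lpm() {
  awk -v dest="$1" '
    function ip2int(ip,   o) {
      split(ip, o, ".")
      return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
    }
    BEGIN { best = -1 }
    {
      split($1, p, "/")
      len  = p[2] + 0
      size = 2 ^ (32 - len)            # addresses covered by this prefix
      # Two addresses share a prefix when they fall in the same block.
      if (int(ip2int(dest) / size) == int(ip2int(p[1]) / size) && len > best) {
        best  = len
        label = $2
      }
    }
    END { print label }
  '
}

# The four routes from the table above, with illustrative labels.
table='0.0.0.0/0 default-via-10.0.4.1
10.0.4.0/24 eth0-direct
10.8.0.0/16 via-10.0.4.200
192.168.50.0/24 eth1-direct'

for d in 8.8.8.8 10.0.4.33 10.8.3.4 192.168.50.40; do
  printf '%s -> %s\n' "$d" "$(printf '%s\n' "$table" | lpm "$d")"
done
```

Every destination matches the /0 line, but any more specific match replaces it, which is why the default only wins when nothing else applies.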
ip route get: ask, don't infer
You can run that match in your head, and on a four-line table you will get it right. On a
real machine, with VPN routes and container bridges and a cloud agent injecting things, you
will eventually get it wrong. So don't do it in your head. ip route get hands a
hypothetical destination to the kernel and prints the decision it would make, using exactly
the lookup the real packet will get.
    $ ip route get 8.8.8.8
    8.8.8.8 via 10.0.4.1 dev eth0 src 10.0.4.12 uid 1000
        cache
    $ ip route get 192.168.50.40
    192.168.50.40 dev eth1 src 192.168.50.7 uid 1000
        cache
Three answers per line, and each one can be the bug. The interface (dev): is
traffic leaving the way you assumed? The next hop (via, absent when the
destination is on a local subnet): is it the router you expected? And the source address
(src): is the kernel stamping the address the far end will accept? That last
one bites multi-homed boxes constantly, because firewalls and allow-lists are written in
terms of source addresses, and the kernel chooses the source from the route, not from your
intentions. ip route get is the single highest-value command on this page.
It turns "I think it should go out eth0" into a fact, in one line, with no packets sent.
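When the answer needs to go into a script or an incident note, the three fields can be pulled out by keyword rather than by position, since via is absent for on-link destinations. A minimal sketch, with `route_fields` as a hypothetical helper name:

```shell
#!/bin/sh
# route_fields: read `ip route get` output on stdin and print the interface,
# next hop, and source address from its first line. A missing field prints "-".
route_fields() {
  awk 'NR == 1 {
    dev = via = src = "-"
    for (i = 1; i < NF; i++) {
      if ($i == "dev") dev = $(i + 1)
      if ($i == "via") via = $(i + 1)
      if ($i == "src") src = $(i + 1)
    }
    print "dev=" dev, "via=" via, "src=" src
  }'
}

# The two answers shown above, replayed from text.
route_fields <<'EOF'
8.8.8.8 via 10.0.4.1 dev eth0 src 10.0.4.12 uid 1000
    cache
EOF
route_fields <<'EOF'
192.168.50.40 dev eth1 src 192.168.50.7 uid 1000
    cache
EOF
```

Against a live kernel the call would be `ip route get 8.8.8.8 | route_fields`. Matching on keywords keeps the parse stable whether or not a gateway is involved.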
Three production scenarios
"The service is unreachable"
A deploy goes out, health checks fail, and the load balancer reports the backend
unreachable. The temptation is to start at the top: application logs, TLS, DNS. Start at the
bottom instead, because the bottom takes thirty seconds to clear. SSH in (if you can SSH in,
layer 3 works at least for your path, which is itself information) and run
ip -br addr. Is the address the load balancer is targeting actually bound on
this box? Cloud reprovisioning, DHCP lease changes, and copy-pasted netplan files all
produce machines whose real address is not the one in the service registry. Then read the
mask: an address of 10.0.4.12/16 on a network that is really carved into
/24s means this box believes the entire 10.0.x.x space is local and will ARP for addresses
it should be routing to. The symptom is maddeningly partial: same-subnet neighbours work,
everything else times out.
Next, ip route. Is there a default route at all? A box that boots with a
misconfigured gateway serves local traffic perfectly and drops everything else, which on a
segmented network can look like "the service is up but flaky" for hours. Then
ip neigh for the gateway's entry: REACHABLE or
STALE are both fine (stale just means not recently confirmed), but
FAILED means the box asked who has the gateway's address and nobody answered,
and now you are debugging layer 2, not your service.
    $ ip neigh
    10.0.4.1 dev eth0 lladdr 0a:58:0a:00:04:01 REACHABLE
    10.0.4.33 dev eth0 lladdr 0a:58:0a:00:04:21 STALE
    10.0.4.99 dev eth0 FAILED
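The neighbour check reduces to two questions: what state is the gateway's entry in, and which entries have failed outright. A sketch with two hypothetical helpers, assuming the state is the last word on each line as in the output above:

```shell
#!/bin/sh
# neigh_state IP: read `ip neigh` output on stdin and print the state of the
# entry for IP, or MISSING if the table has no entry for it.
neigh_state() {
  awk -v ip="$1" '$1 == ip { print $NF; found = 1 }
                  END { if (!found) print "MISSING" }'
}

# failed_neigh: list every address whose entry is in the FAILED state.
failed_neigh() {
  awk '$NF == "FAILED" { print $1 }'
}

sample='10.0.4.1 dev eth0 lladdr 0a:58:0a:00:04:01 REACHABLE
10.0.4.33 dev eth0 lladdr 0a:58:0a:00:04:21 STALE
10.0.4.99 dev eth0 FAILED'

echo "gateway: $(printf '%s\n' "$sample" | neigh_state 10.0.4.1)"
echo "failed:  $(printf '%s\n' "$sample" | failed_neigh)"
```

A MISSING gateway entry is not itself a fault: the kernel only creates an entry once it has tried to reach that address, so send some traffic before reading too much into an empty table.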
Traffic leaving the wrong interface
A database box has two NICs: eth0 on the application network with the default route, eth1 on a dedicated replication network. Replication is slow, and the network team says replication traffic is showing up on the application network where it has no business being. Why would the kernel do that? Because routing is per-destination, not per-intention. If the replica's address is on 192.168.50.0/24 and the route for that prefix exists on eth1, fine. But if someone re-addressed the replica, or the eth1 route was never added on this box, the only matching route is the default, and the default goes out eth0. The kernel is not wrong; the table is.
The diagnosis is one command, run on the actual box, with the actual replica address:
ip route get 192.168.50.40. If the answer says dev eth1, routing
is innocent and you look elsewhere. If it says via 10.0.4.1 dev eth0, you have
your answer, and the src field gives you the second half of the story: traffic
arriving at the replica from 10.0.4.12 instead of 192.168.50.7, which the replica's firewall
may simply drop. This same shape produces asymmetric routing, where the request leaves one
interface and the reply tries to come back another, and stateful firewalls in the middle
drop the half they never saw. Whenever a multi-homed box behaves strangely, run
ip route get on both ends, each for the other's address, before forming a theory.
Small requests work, large ones hang
The strangest one in the family. Health checks pass, small API calls succeed, and then a file upload or a chunky JSON response hangs forever. SSH works but printing a large file freezes the session. The pattern to recognise is size-dependent failure, and the cause is usually MTU: somewhere on the path, a link carries less than the 1500 bytes everyone assumes, often because a VPN or an overlay network (WireGuard, IPsec, VXLAN in a container cluster) spends 50 to 80 bytes of every frame on its own encapsulation header. Packets small enough to fit pass; packets that do not are dropped, and if the router's "too big" error messages are firewalled off somewhere (they often are), the sender never learns why. The connection does not reset. It just stops.
ip link shows each interface's MTU directly, and an overlay interface
advertising 1420 or 1450 next to a physical NIC at 1500 tells you encapsulation is in play.
The counters from -s add the second clue:
    $ ip -s link show eth0
    2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT group default qlen 1000
        link/ether 0a:58:0a:00:04:0c brd ff:ff:ff:ff:ff:ff
        RX:  bytes  packets errors dropped  missed   mcast
        48217553211 41273882      0    1204       0       0
        TX:  bytes  packets errors dropped carrier collsns
         9385527140 22094471      0       0       0       0
Errors mean corrupted frames, a bad cable or NIC. Drops mean the kernel received frames and
threw them away, usually buffer pressure. A small, static drop count is life; a count that
climbs while you watch is a finding. For the MTU question specifically, the confirmation
test is a ping with fragmentation forbidden: ping -M do -s 1472 host sends a
packet that needs the full 1500 (1472 of payload plus 28 of headers). If that fails while
-s 1392 succeeds, something on the path tops out near 1420, and you have found
your overlay. The fix is to lower the MTU on the relevant interface or fix the path; the
diagnosis is the part ip gives you.
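The payload sizes in that test come from one subtraction, sketched here for a few common MTUs. It assumes IPv4 with no header options (20 bytes) plus the 8-byte ICMP header; `ping_payload` is a hypothetical helper and HOST is a placeholder, so the script only prints the commands and sends nothing.

```shell
#!/bin/sh
# ping_payload MTU: largest ICMP echo payload that fits in one unfragmented
# IPv4 packet of MTU bytes (20-byte IPv4 header + 8-byte ICMP header).
ping_payload() {
  echo $(( $1 - 20 - 8 ))
}

# Print the probe to run for each candidate MTU; HOST is a placeholder.
for mtu in 1500 1450 1420; do
  echo "mtu $mtu: ping -M do -s $(ping_payload "$mtu") HOST"
done
```

Working down the list until a probe succeeds brackets the path MTU: the first size that gets a reply, plus 28, is a lower bound on what the path carries.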
What's underneath
Everything ip prints is a view over three kernel structures, and the chain
between them is the chain every outgoing packet walks. First, the interface list: each NIC,
bridge, tunnel, or loopback is a kernel object with a state, an MTU, a hardware address, and
a set of IP addresses bound to it. ip link reads the object, ip addr
reads the object plus its addresses.
Second, the routing table. When a process sends to a destination, the kernel runs the longest-prefix-match lookup described above and gets back a verdict: which interface, whether a gateway is involved, and which source address to use. This happens for every destination, on every box, including ones you do not think of as routers; "routing" is not something that only happens in the network core. How addresses and prefixes work is covered in IP, and how routers stitch these per-hop decisions into a path across the internet is routing.
Third, the neighbour table. A route names the next hop by IP address, but an Ethernet frame
needs a MAC address to leave the building. ARP (and its IPv6 successor, neighbour discovery)
resolves one to the other, and the kernel caches the answers in the table that
ip neigh prints, complete with a freshness state per entry. When a destination
is on the local subnet the kernel ARPs for the destination itself; when it is remote, it
ARPs for the gateway and never for the destination, which is why a box can talk to the whole
internet while holding MAC addresses for exactly one device. The full descent from IP packet
to electrical signal lives in
bytes on the wire.
One boundary worth marking: this entire page is about packets, not connections. The routing lookup neither knows nor cares whether a packet belongs to a TCP stream; connections are a socket-layer idea, built on top of this machinery and visible through different tools. Where sockets pick up the story is sockets, and the tool for inspecting them is ss.
Pitfalls
Trusting ifconfig on a modern box. The old tool does not just look dated;
it can mislead. ifconfig historically shows one IPv4 address per interface, so
a box with secondary addresses (common with virtual IPs and failover setups) shows you the
first and hides the rest. It knows nothing about modern interface types or multiple routing
tables, and on many distributions it is not installed at all, so the muscle memory fails
exactly when you are on an unfamiliar machine under pressure. If a teammate's
ifconfig output disagrees with your ip addr output, believe
ip. They are reading different interfaces to the same kernel, and only one of
them speaks the current one.
Assuming there is only one routing table. There is not. Linux supports many
routing tables plus a rule layer that picks between them based on source address, firewall
marks, and more; this is policy routing, and VPN clients, container runtimes, and cloud
multi-NIC setups all use it. The practical consequence: ip route shows only the
main table, so a VPN can be steering all your traffic while the main table looks untouched.
You do not need to operate policy routing from this page, just to know the trapdoor exists:
ip rule lists the rules, and ip route show table all shows
everything. ip route get, helpfully, evaluates the whole stack of rules, which
is one more reason it beats reading tables by hand.
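A quick way to see whether policy routing is in play is to look for rules beyond the three the kernel installs by default (priority 0 for the local table, 32766 for main, 32767 for default). The sketch below filters on those priorities; `extra_rules` is a hypothetical helper, and the sample rule list is invented to resemble what a VPN client might add, not captured from a real machine.

```shell
#!/bin/sh
# extra_rules: read `ip rule` output on stdin and print every rule whose
# priority is not one of the kernel defaults (0, 32766, 32767).
extra_rules() {
  awk '$1 != "0:" && $1 != "32766:" && $1 != "32767:"'
}

# Hypothetical sample: two rules inserted ahead of the main table.
extra_rules <<'EOF'
0:      from all lookup local
32764:  from all lookup main suppress_prefixlength 0
32765:  not from all fwmark 0xca6c lookup 51820
32766:  from all lookup main
32767:  from all lookup default
EOF
```

Any output from `ip rule | extra_rules` on a real box means the main table is not the whole story, and ip route show table all is the next thing to read. The filter goes by priority alone, so a custom rule that reuses a default priority would slip past it.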
Fighting the network manager. On most modern systems a daemon owns network
configuration: NetworkManager on desktops and many servers, systemd-networkd elsewhere
(often with netplan generating the configuration for either), and cloud-init on first boot.
If you change an address or route by hand with
ip while one of these is active, the daemon may put things back on its next
renewal or reconfiguration pass, minutes or hours later. The result is a fix that silently
un-fixes itself, which is far more confusing than no fix at all. Find out who owns the config
before editing live state, and make the change in that system's terms.
Forgetting that ip changes do not survive a reboot. Related but distinct:
everything ip sets is kernel state, and kernel state dies with the kernel. An
address added with ip addr add or a route added with ip route add
is gone after a reboot unless it is also written into the persistent config (netplan,
NetworkManager profiles, networkd units). The classic failure is the emergency static route
added during an incident that evaporates during a maintenance reboot three months later,
reopening the incident for whoever is on call that night. If you change live state, write it
down and persist it the same day.
A drill you can run right now
Everything below is read-only. It changes nothing, needs no root, and works on any Linux machine, including production. Ten minutes, and the address-route-neighbour chain stops being a diagram and becomes something you have walked on a real box.
Step 1, identity. Run ip -br addr. Count the interfaces and
account for each one: loopback, the primary NIC, and whatever else lives there (a Docker
bridge, a VPN tunnel, a virtual NIC). For the primary interface, read the address and say
the subnet out loud from the mask: a /24 means the last octet varies, a /20 means the box
considers a 4096-address block local. If you cannot derive the subnet from the suffix, that
is the gap to close first, because every routing question depends on it.
Step 2, the table. Run ip route. Find the default line and
name the gateway. Find the scope link line that matches your primary address
and notice the src on it. For every other line, try to say in one sentence why
it exists; on a laptop with Docker and a VPN, the answers ("that is the container bridge",
"that is the corporate range through the tunnel") are a tour of everything networking on
your machine.
Step 3, two predictions. Before running anything, predict the route for two destinations: one on your local subnet and one out on the internet. Then check both:
    $ ip route get 8.8.8.8
    8.8.8.8 via 10.0.4.1 dev eth0 src 10.0.4.12 uid 1000
        cache
    $ ip route get 10.0.4.1
    10.0.4.1 dev eth0 src 10.0.4.12 uid 1000
        cache
Read the difference: the internet destination has a via, the local one does
not, because the kernel reaches the local subnet directly. If either answer surprises you
(a destination you thought was local goes via the gateway, or traffic picks an interface
you did not expect), you have just learned something true about your network that you did
not know, which is the entire point of the exercise. Follow up with ip neigh
and find your gateway's entry with its MAC address: that is the one device your box
actually talks to for everything beyond the local subnet.
Step 4, the link layer. Run ip -s link show for your primary
interface. Read the MTU and check it is what you expect (1500 on plain Ethernet, lower on
tunnels and overlays). Then read the RX and TX counters: bytes, packets, errors, drops.
Run it again a minute later and compare. On a healthy box the byte counters climb and the
error counters do not; if you ever need to argue "the NIC is fine" or "the NIC is not
fine" during an incident, this pair of snapshots is how the argument ends.
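Comparing two snapshots by eye works, but the subtraction is easy to script. This sketch assumes the RX column order shown earlier on this page (bytes, packets, errors, dropped); `rx_dropped` is a hypothetical helper, and the second snapshot is invented for the example.

```shell
#!/bin/sh
# rx_dropped: read `ip -s link show DEV` output on stdin and print the RX
# dropped counter, i.e. the fourth number on the line after the "RX:" header.
rx_dropped() {
  awk '/RX:/ { want = 1; next }
       want  { print $4; want = 0 }'
}

# First snapshot: the RX lines from the eth0 output shown earlier.
snap1='    RX:  bytes  packets errors dropped  missed   mcast
    48217553211 41273882      0    1204       0       0'
# Second snapshot: hypothetical values a minute later.
snap2='    RX:  bytes  packets errors dropped  missed   mcast
    48219901876 41275990      0    1310       0       0'

before=$(printf '%s\n' "$snap1" | rx_dropped)
after=$(printf '%s\n' "$snap2" | rx_dropped)
echo "rx dropped grew by $((after - before))"
```

On a live box the two values would come from `ip -s link show eth0 | rx_dropped`, run a minute apart. A delta of zero is the "NIC is fine" half of the argument.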
ip -br addr for who this box is,
ip route for where its packets go, and ip route get DEST when you
need the kernel's answer for one destination instead of your own guess.
Further reading
- ip-route(8): the route subcommand's manual page, including the full grammar of ip route get and the table selectors.
- ip-address(8): addresses, scopes, and lifetimes, with the exact meaning of every flag in the ip addr output.
- iproute2 cheat sheet, by Daniil Baturin: the best single reference for translating intentions into iproute2 invocations, including the policy routing corners.
- Semicolony, Routing: how the per-hop decision this page describes becomes a path across networks.