dig
Half of "the network is broken" is DNS. The service is fine, the route is fine, and yet
the client is connecting to an address that stopped being correct an hour ago, because
some cache somewhere is still handing it out. Every name lookup is a question put to a
specific resolver, and that resolver is allowed to answer from memory. So the debugging
question is never just "what does this name resolve to"; it is "what does DNS
actually say, and which resolver said it?" That is the question dig
answers. This page covers the five usages worth knowing, decodes every section of the
output, walks three production incidents, and ends with a drill that touches nothing.
The question it answers
Most tools resolve names as a side effect. curl resolves a name on the way
to making a request, ping resolves one on the way to sending a packet, and
when resolution misbehaves they report it badly or not at all. dig inverts
this: the lookup is the entire event. It builds a DNS query, sends it to a resolver,
and prints the full response, every section, every flag, every TTL, plus a footer
telling you exactly which server answered and how long it took. Nothing is summarised
away. When DNS is the suspect, that completeness is the point.
The reason DNS needs a dedicated interrogation tool is that there is no single "the
DNS." A name lookup is a question put to one particular resolver, and different
resolvers can legitimately give different answers at the same moment. Your laptop asks
the resolver in /etc/resolv.conf, which is usually a cache. That cache
asked an upstream recursive resolver, which is also a cache. The recursive resolver
asked the authoritative servers for the zone, which hold the actual records. Change a
record at the authority and the truth ripples outward at the speed of cache expiry,
not at the speed of light. Every "DNS propagation" mystery, every "it works on my
machine but not in the pod" lookup bug, every stale-IP incident comes down to two
questions: which resolver did this client ask, and what is that resolver currently
holding? dig lets you put the same question to any resolver you like and
compare the answers.
It helps to know what dig is not. It is not what your application does. An
app calling getaddrinfo() goes through the C library's name machinery,
which consults /etc/nsswitch.conf, /etc/hosts, and possibly
systemd-resolved before any DNS packet exists. dig skips all
of that and speaks raw DNS to a server. Most of the time the two agree; the times they
do not are a pitfall with its own section below. It is also not a packet capture: it
shows you the response it received, not what crossed the wire to get it. When you need
to see the actual UDP packets, queries that never get answered, or a middlebox
rewriting responses, that is a job for
tcpdump with port 53.
The five usages that matter
dig takes dozens of options and you will use about five shapes of
invocation for nearly everything. Each one varies a different part of the question:
which name, which record type, and most importantly, which server gets asked.
| Invocation | What it asks | When you reach for it |
|---|---|---|
| dig example.com | The A record, from your default resolver | The baseline: what does this machine's resolver say right now |
| dig @8.8.8.8 example.com | The same question, put to a specific server | Comparing resolvers, bypassing a suspect local cache, asking the authority directly |
| dig +trace example.com | The full delegation walk: root, then TLD, then authoritative | Delegation bugs, "is the registrar pointing at the right nameservers" |
| dig example.com MX | A different record type: AAAA, CNAME, MX, TXT, NS, SOA… | Mail routing, alias chains, ownership verification, zone metadata |
| dig +short / dig -x 93.184.215.14 | Just the answer / the reverse lookup for an IP | Scripting and quick checks; naming an address found in logs or ss output |
The bare form deserves one unpacking, because every part of dig example.com
is a default you should be able to name. No record type means type A, the IPv4 address.
No class means IN, internet, which is the only class anyone uses. And no server means
dig reads /etc/resolv.conf, takes the first
nameserver line, and sends the query there. That last default is the one
that bites: the answer you get is whatever that one resolver currently
believes, cache and all. The bare form tells you what this machine sees. It does not
tell you what the rest of the world sees.
That is what @ is for. dig @8.8.8.8 example.com sends the
identical question to Google's public resolver instead, ignoring
resolv.conf entirely. dig @1.1.1.1 asks Cloudflare's.
And, the move that settles most arguments, dig @ns1.your-dns-host.net asks
one of the zone's own authoritative servers, the machines that hold the records rather
than cache them. Authoritative servers do not guess and do not cache other people's
data; what they return is the record as published. The triangle of local resolver,
public resolver, and authority is the whole diagnostic method, and the first production
scenario below is nothing but that triangle.
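The triangle is easy to wrap in a small helper. What follows is a minimal sketch, not a standard tool: ask_each is a hypothetical function name, and the resolver addresses are whatever you pass in.

```shell
# ask_each NAME RESOLVER...
# Put the same A-record question to each resolver and label the answers.
ask_each() (
    name=$1
    shift
    for resolver in "$@"; do
        # +short prints bare answers; paste joins several records onto one line.
        answer=$(dig +short "@$resolver" "$name" A | paste -sd' ' -)
        printf '%-24s %s\n' "$resolver" "${answer:-<no answer>}"
    done
)
```

Called as ask_each shop.example.com ns1.your-dns-host.net 8.8.8.8 192.168.1.1, it prints one line per resolver, so a cache that disagrees with the authority stands out at a glance.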
+trace changes the mode entirely. Instead of asking a recursive resolver
to do the work, dig does the recursion itself in front of you: it asks a
root server, which replies "the .com servers are over there," asks a
.com server, which replies "the example.com servers are over
there," and asks one of those, which finally returns the record. You see every
referral, which makes it the tool for delegation problems: NS records that point at
the wrong host, a registrar update that never took, a child zone nobody delegated.
Note what it deliberately is not: a view of any cache. More on that in the pitfalls.
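A trace is long, and often all you want is the list of servers that answered at each step. The sketch below assumes the ";; Received N bytes from ..." line that recent BIND 9 versions of dig print after every step; trace_hops is a hypothetical helper name.

```shell
# trace_hops: read `dig +trace` output on stdin and print one line per hop,
# naming the server that answered and how long it took.
trace_hops() {
    # Each step of a trace ends with a line like:
    #   ;; Received 1173 bytes from 192.5.5.241#53(f.root-servers.net) in 12 ms
    sed -n 's/^;; Received [0-9]* bytes from \([^ ]*\) in \([0-9]*\) ms.*/\1 \2ms/p'
}
```

Used as dig +trace example.com | trace_hops, it should reduce the walk to root, TLD, and authoritative server, in that order.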
Record types are the third axis. A and AAAA are the IPv4 and
IPv6 addresses. CNAME says "this name is an alias for that name," and
chains of them are how CDNs are usually wired in. MX lists the mail
exchangers for a domain, with preference numbers. TXT holds free-form
text, which in practice means SPF and DKIM policies and the verification strings every
SaaS vendor asks you to publish. NS lists the zone's authoritative
nameservers. SOA is the zone's metadata record: serial number, refresh
timers, and the field that controls negative caching. Asking for
ANY used to be the lazy way to see everything; most servers now refuse it
or return a minimal answer, so ask for what you want by name.
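Since ANY is no longer dependable, asking for each type by name can be scripted. A minimal sketch follows; record_survey is a hypothetical helper, and the list of types is simply the one from the paragraph above.

```shell
# record_survey NAME: ask for each common record type in turn and label
# every answer line with the type that produced it.
record_survey() (
    name=$1
    for rtype in A AAAA CNAME MX TXT NS SOA; do
        # Types with no record print nothing at all.
        dig +short "$name" "$rtype" | awk -v rtype="$rtype" '{ print rtype "\t" $0 }'
    done
)
```

It sends seven separate queries to the default resolver, so what comes back is that resolver's view of the name, cache and all.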
Finally the two conveniences. +short strips the response to bare answers,
one per line, which is what you want in scripts and one-glance checks; everything this
page says about reading the full output is the argument for not using it while
debugging. -x does a reverse lookup, turning an IP back into a name by
querying the PTR record under in-addr.arpa. Reverse zones are maintained
by whoever owns the address block, not by whoever owns the forward name, so a missing
or mismatched PTR is common and usually harmless, except to mail servers, which take
PTR records personally.
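The name that -x builds is nothing mysterious: the four octets reversed, with in-addr.arpa on the end. A small sketch of that transformation for IPv4 only (IPv6 uses a different, nibble-based layout under ip6.arpa that this does not handle); ptr_name is a hypothetical helper name.

```shell
# ptr_name IPV4: print the name that `dig -x IPV4` looks up.
ptr_name() {
    # Reverse the four octets and append the reverse-lookup suffix.
    echo "$1" | awk -F. 'NF == 4 { printf "%s.%s.%s.%s.in-addr.arpa.\n", $4, $3, $2, $1 }'
}
```

So dig "$(ptr_name 93.184.215.14)" PTR asks the same question as dig -x 93.184.215.14, which makes it plain why the reverse zone belongs to whoever owns the address block.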
Reading the output
Here is a complete, unabridged response. dig's output looks noisy the
first dozen times, but it has exactly five parts, and each one answers a different
question. Learn the parts once and the noise becomes a report.
    $ dig shop.example.com
    ; <<>> DiG 9.18.24 <<>> shop.example.com
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23198
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
    ;; OPT PSEUDOSECTION:
    ; EDNS: version: 0, flags:; udp: 1232
    ;; QUESTION SECTION:
    ;shop.example.com.        IN  A
    ;; ANSWER SECTION:
    shop.example.com.  287  IN  A  198.51.100.7
    ;; Query time: 2 msec
    ;; SERVER: 192.168.1.1#53(192.168.1.1) (UDP)
    ;; WHEN: Mon Jun 08 10:14:02 IST 2026
    ;; MSG SIZE  rcvd: 61
The header. The line starting ->>HEADER<<- is
the verdict. status: is the response code, and three values cover almost
everything you will meet. NOERROR means the query was answered without
complaint; note that it does not guarantee an answer record exists. A name can
be real but have no record of the type you asked for, in which case you get
NOERROR with ANSWER: 0, a combination worth recognising on
sight. NXDOMAIN means the name does not exist at all, no records of any
type, says the authority. SERVFAIL means the resolver tried and could not
get a usable answer: the authoritative servers were unreachable, or DNSSEC validation
failed, or recursion broke somewhere. The blame is different in each case.
NXDOMAIN points at the zone's contents; SERVFAIL points at
the resolution path; REFUSED, the fourth one you will occasionally see,
means the server declined to serve you at all, typically because you asked a server
that is not configured to answer for that zone or for your address.
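The status and the answer count are worth reading together, and that reading can be mechanised. The sketch below parses the two header lines as dig prints them in the sample above; dig_verdict is a hypothetical helper, and its wording of each verdict is only a paraphrase of this section.

```shell
# dig_verdict: read full dig output on stdin, print the status, the answer
# count, and a one-line reading of the combination.
dig_verdict() {
    awk '
        /->>HEADER<<-/ {
            for (i = 1; i <= NF; i++)
                if ($i == "status:") { status = $(i + 1); sub(/,$/, "", status) }
        }
        /^;; flags:/ {
            for (i = 1; i <= NF; i++)
                if ($i == "ANSWER:") { answers = $(i + 1); sub(/,$/, "", answers) }
        }
        END {
            if (status == "") { print "no response header found"; exit 1 }
            if (status == "NOERROR" && answers + 0 == 0) note = "name exists, no record of this type"
            else if (status == "NOERROR") note = "answered"
            else if (status == "NXDOMAIN" && answers + 0 > 0) note = "alias chain ends at a name that does not exist"
            else if (status == "NXDOMAIN") note = "name does not exist"
            else if (status == "SERVFAIL") note = "resolution path failed"
            else if (status == "REFUSED") note = "server declined to answer"
            else note = "see the status code"
            printf "%s answers=%d: %s\n", status, answers, note
        }'
}
```

Pipe any full query into it, for example dig shop.example.com AAAA | dig_verdict, and the NOERROR-with-zero-answers case stops looking like a failure.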
The flags. Single letters, each one a fact about the conversation.
qr just marks this as a response. rd means recursion desired:
the client asked the server to chase the answer down for it. ra means
recursion available: this server is willing to do that, which tells you that you are
talking to a recursive resolver rather than a bare authority. The one to watch for is
aa, authoritative answer, which appears only when the responding server
actually owns the zone. Its presence or absence answers "am I hearing the source or an
echo?" An answer with aa is the record as published. An answer without it
came from a cache. When the flags: line shows rd but no
ra and you got a refusal or an empty answer, you asked an authoritative
server to recurse, and it correctly declined.
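The "source or echo" check comes down to one flag, so it scripts cleanly. A minimal sketch, assuming the ";; flags:" line format shown in the sample output; dig_flags and answer_source are hypothetical helper names.

```shell
# dig_flags: print the flags from dig output on stdin, one per line.
dig_flags() {
    # The flags sit between "flags:" and the first semicolon after it.
    sed -n 's/^;; flags: \([a-z ]*\);.*/\1/p' | tr ' ' '\n'
}

# answer_source: say whether the response came from the owner of the zone.
answer_source() {
    if dig_flags | grep -qx 'aa'; then
        echo "authoritative: the record as published"
    else
        echo "not authoritative: relayed or cached by a resolver"
    fi
}
```

Run dig @ns1.your-dns-host.net example.com | answer_source and then the same thing against 8.8.8.8; the two verdicts should differ, and that difference is the whole point of the aa flag.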
The question section plays the question back at you: name, class,
type. It is worth an actual glance, because it shows the name exactly as it was
sent. dig leaves the search list alone by default, but run it with +search and a
search domain may get appended; this is where
shop silently became shop.internal.corp.example.com, and the
mystery is solved before you reach the answer.
The answer section is the payload, one line per record: the name, a number, the class, the type, and the data. The number is the TTL, in seconds, and it is the most operationally important column on the page. The TTL is a countdown: it is how much longer the resolver that answered you may keep serving this record from cache before it must re-fetch from upstream. Ask an authoritative server and you see the zone's configured TTL, full and constant. Ask a cache and you see the remaining time, ticking down with every repeated query until it hits zero, the cache refreshes, and the number snaps back up. This single observation explains "DNS propagation," a phrase that suggests pushing when the mechanism is purely expiry. Nothing is propagated anywhere. Old answers age out of caches, each on its own schedule, and the longest configured TTL is your worst-case wait.
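Pulling the TTL column out on its own makes the countdown easy to watch. The sketch below assumes the section headers dig prints by default; answer_ttls is a hypothetical helper name.

```shell
# answer_ttls: read full dig output on stdin and print "TTL name data" for
# each record in the answer section.
answer_ttls() {
    awk '
        /^;; ANSWER SECTION:/ { in_answer = 1; next }
        # A blank line or the next ";;" line ends the section.
        /^$/ || /^;;/         { in_answer = 0 }
        in_answer             { print $2, $1, $NF }
    '
}
```

Run dig shop.example.com | answer_ttls twice against a cache and the first column should drop by the seconds elapsed; run it against the authority and it should stay put.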
The authority and additional sections appear when the server has
something to add. In a referral, the authority section lists the NS records to go ask
next, and the additional section helpfully includes their addresses; this is what
+trace output is mostly made of. In a negative answer, the authority
section carries the zone's SOA record, which is not decoration: its last field (or the
SOA record's own TTL, whichever is smaller) sets how long resolvers may cache the
non-existence of the name. That is negative caching, and it has its own scenario below.
The footer says who answered (SERVER:), how long it took (Query time:), and over
which transport. Every claim in the rest of the output is a claim made by that
server. 127.0.0.53 means systemd-resolved's local stub answered you,
not the network's resolver. A query time of 0–3 msec usually means a nearby cache; tens
of milliseconds means somebody actually went and asked. An answer is only as good as
the resolver that gave it, and this line names the resolver.
Three production scenarios
"Propagation" is just caches expiring
You moved shop.example.com to a new load balancer and updated the A record
an hour ago. Monitoring says some traffic is still arriving at the old IP. The team
chat says "DNS is still propagating," which is a description, not a diagnosis. The
diagnosis is three queries:
    $ dig +short @ns1.dns-host.net shop.example.com
    203.0.113.50        # the authority: the new IP is published
    $ dig +short @8.8.8.8 shop.example.com
    198.51.100.7        # Google's cache: still the old IP
    $ dig @8.8.8.8 shop.example.com | grep -A1 'ANSWER SECTION'
    ;; ANSWER SECTION:
    shop.example.com.  1142  IN  A  198.51.100.7
Now you know everything. The authority is serving the new address, so the change took; whatever the registrar UI claimed, the zone is correct. Google's resolver is still holding the old record and will hold it for another 1142 seconds, because the old record carried roughly a one-hour TTL when Google cached it. There is no force-refresh to send and no support ticket to file. The stale answers will be gone, everywhere, within one old-TTL of the change. The lesson lands earlier in the process: before a planned migration, lower the record's TTL to 60 seconds, wait out the old TTL so every cache has picked up the short one, make the change, then restore the long TTL. Caches drain in a minute instead of an hour, and "propagation" stops being weather.
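The timing of that lower-wait-change sequence is simple arithmetic, sketched here so the waits are explicit. cutover_plan is a hypothetical helper; all three arguments are in seconds, the third being the epoch time at which the short TTL was published, and the result assumes caches that honour TTLs.

```shell
# cutover_plan OLD_TTL SHORT_TTL LOWERED_AT
cutover_plan() (
    old_ttl=$1
    short_ttl=$2
    lowered_at=$3
    # A cache may have fetched the long-TTL record just before it was
    # lowered, so it can keep serving it for one full old TTL.
    earliest_change=$((lowered_at + old_ttl))
    # After a change made at that moment, a cache holds the old address for
    # at most one short TTL.
    drained_by=$((earliest_change + short_ttl))
    echo "earliest_change=$earliest_change"
    echo "drained_by=$drained_by"
)
```

For a one-hour record lowered to 60 seconds just now, cutover_plan 3600 60 "$(date +%s)" gives the earliest safe moment for the change and the moment the last well-behaved cache lets go of the old address.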
One sharper edge: if the query had returned NXDOMAIN rather than a stale
address — say someone fat-fingered the record name, queries failed for ten minutes,
and then the fix went in — the non-existence itself gets cached, for the duration set
in the SOA record's last field. Users keep failing after the fix, which reads
as madness until you remember negative caching exists. dig shows you the
countdown on that too, in the authority section of the negative answer.
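How long a negative answer may linger can be read straight off that SOA line. Under RFC 2308 the negative-caching time is the smaller of the SOA record's own TTL and its last field, so the sketch below takes the minimum of the two; negative_ttl is a hypothetical helper name.

```shell
# negative_ttl: read an SOA record line (as printed in the authority section
# of a negative answer) on stdin and print the negative-caching time.
negative_ttl() {
    # Fields: name TTL class SOA mname rname serial refresh retry expire minimum
    awk '$4 == "SOA" {
        ttl = $2 + 0
        minimum = $NF + 0
        print (ttl < minimum ? ttl : minimum)
    }'
}
```

Against the authority this prints the configured ceiling; against a cache the SOA's TTL has already been counting down, so the same command prints roughly how many seconds of failure remain.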
A CNAME chain gone wrong
The website is down, but only sort of. www.example.com fails to resolve,
yet the DNS console shows the record sitting right there. The record is a CNAME, and
the console only shows your half of the story:
    $ dig www.example.com
    ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 4480
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
    ;; ANSWER SECTION:
    www.example.com.  300  IN  CNAME  web-prod-7.old-cdn-provider.net.
    ;; AUTHORITY SECTION:
    old-cdn-provider.net.  900  IN  SOA  ns1.old-cdn-provider.net. ...
Look at that header and answer together: NXDOMAIN, and yet
ANSWER: 1. That combination is the signature of a broken alias chain. The
resolver found your CNAME just fine, followed it to
web-prod-7.old-cdn-provider.net, and that name does not exist;
the CDN deprovisioned the endpoint when the contract lapsed, and your record now points
into the void. The overall status reflects the end of the chain, not the start, which
is why the console looked healthy and the site did not. Resolve each hop yourself
(dig web-prod-7.old-cdn-provider.net) when the chain has several links, and
remember that every link is a separate record with a separate TTL, owned by a separate
party, cached on a separate schedule. Long CNAME chains are a way of distributing your
availability across organisations that have never heard of you.
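Resolving each hop by hand gets tedious past two links. The loop below is a sketch of doing it mechanically: walk_cname is a hypothetical helper, the hop limit of 10 is an arbitrary guard against loops, and it checks only for an A record at the end of the chain.

```shell
# walk_cname NAME: follow a CNAME chain one hop at a time and report where
# it ends.
walk_cname() (
    name=$1
    hops=0
    while [ "$hops" -lt 10 ]; do
        # Ask only for the alias at this name; empty output means no CNAME.
        target=$(dig +short "$name" CNAME | head -n 1)
        [ -n "$target" ] || break
        echo "$name CNAME $target"
        name=$target
        hops=$((hops + 1))
    done
    # The last name in the chain has to resolve to an address on its own.
    address=$(dig +short "$name" A | head -n 1)
    if [ -n "$address" ]; then
        echo "$name A $address"
    else
        echo "$name has no A record: the chain is broken here"
        return 1
    fi
)
```

For the incident above, walk_cname www.example.com would print the one CNAME hop and then name web-prod-7.old-cdn-provider.net as the link that fails.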
The intermittent five-second delay
Every so often, a request that normally takes 40 ms takes 5.04 seconds. Not
always; perhaps a third of the time, which is the worst kind of often. A delay of
almost exactly five seconds, intermittent, is a DNS smell so distinctive you should be
able to name it from across the room: one of the resolvers in
resolv.conf is dead, and five seconds is the default per-nameserver
timeout the C library waits before trying the next one.
    $ cat /etc/resolv.conf
    nameserver 10.0.0.2
    nameserver 10.0.0.3
    $ dig @10.0.0.2 api.example.com
    ;; communications error to 10.0.0.2#53: timed out
    ;; communications error to 10.0.0.2#53: timed out
    ;; communications error to 10.0.0.2#53: timed out
    ;; no servers could be reached
    $ dig @10.0.0.3 api.example.com +short
    203.0.113.50        # the second resolver is fine
The mechanism: the stub resolver tries nameservers in listed order. When the first one
is down, every fresh lookup burns the full timeout against the corpse before failing
over to the healthy second entry, which answers instantly. Lookups that hit a warm
local cache skip the dance entirely, which is where the intermittency comes from; only
cache misses pay the toll. The fix is whatever revives or removes the dead resolver,
plus, if you control the image, options timeout:1 attempts:2 or
options rotate in resolv.conf to shrink the blast radius next
time. dig's contribution is the proof: it isolates each nameserver and
tests it alone, something the application's resolver will never do for you. And note
what the application reported during all this: nothing. The lookup eventually
succeeded. Slow DNS hides inside "the service is slow" with no error to its name,
which is why p99 latency mysteries should meet dig early.
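Testing each nameserver alone is worth a few lines of script. A sketch, relying on dig exiting non-zero when no server replies; probe_nameservers is a hypothetical helper, and the two-second timeout is an arbitrary choice.

```shell
# probe_nameservers NAME [RESOLV_CONF]
# Query every nameserver listed in resolv.conf on its own and report which
# ones answer.
probe_nameservers() (
    name=$1
    conf=${2:-/etc/resolv.conf}
    awk '$1 == "nameserver" { print $2 }' "$conf" |
    while read -r server; do
        # +time=2 +tries=1: give each server a single two-second chance.
        if dig +time=2 +tries=1 +short "@$server" "$name" >/dev/null 2>&1 </dev/null; then
            echo "$server ok"
        else
            echo "$server DEAD (no reply within 2s)"
        fi
    done
)
```

Any response counts as alive here, including NXDOMAIN; the only thing being tested is whether the resolver answers at all.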
What dig is actually talking to
The output makes more sense once the machinery behind it is laid out flat. Resolution
involves four kinds of party. The stub resolver is the little client
inside libc (or systemd-resolved) on every machine; it does no chasing of its own, it
just forwards the question to a configured server and waits. The
recursive resolver is that server: your VPC's resolver, the home
router, 8.8.8.8. It does the real work of walking the hierarchy, and it caches
ferociously. The root and TLD servers hold no answers for your name,
only referrals: the root knows who runs .com, and the .com
servers know who runs example.com. The
authoritative servers hold the records themselves. A cold lookup
touches all four; a warm one stops at the first cache that has the answer.
Caches sit at more layers than the diagram has room for. The browser keeps one. The OS
keeps one (systemd-resolved on most modern distros, listening on
127.0.0.53). The recursive resolver keeps the big one. Some applications
and language runtimes keep their own on top, with their own ideas about expiry; the
JVM's resolver cache is famous for outliving the records it holds. Each layer honours
the TTL independently, which means a record change becomes visible at different
moments to clients sitting behind different stacks of caches. When two machines
disagree about a name, they are not disagreeing about DNS; they are sitting behind
different sets of memories.
This is also why your laptop and a Kubernetes pod can resolve the same short name
differently. The pod's /etc/resolv.conf is not yours: it points at the
cluster's DNS service and carries search domains like
namespace.svc.cluster.local, plus an ndots option that
controls when those suffixes get tried. A lookup for db in the pod becomes
db.payments.svc.cluster.local and finds a Service; the same lookup on your
laptop goes to the public DNS and dies. Same name, different question, because the
resolver configuration rewrote it before any server was consulted. The question
section of dig's output (run with +search, since dig skips the search list by
default) and resolv.conf itself are where
that rewriting becomes visible. The deeper protocol story, message format, the
hierarchy, DNSSEC, lives in
the networking codex's DNS page;
how DNS works tells it end to end as a
guide; and if you would rather watch the chain run than read about it, the
DNS resolution simulator animates
every referral and cache hit in this diagram.
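The search-list rewriting described above follows a short rule, sketched here as the glibc stub resolver is generally understood to apply it: a name with fewer dots than ndots tries the search suffixes first and the bare name last; otherwise the order flips. search_candidates is a hypothetical helper, and the suffixes in the usage line are illustrative.

```shell
# search_candidates NAME NDOTS DOMAIN...
# Print, in order, the fully qualified names a stub resolver would try.
search_candidates() (
    name=$1
    ndots=$2
    shift 2
    # A trailing dot marks the name as already absolute: no suffixes apply.
    case $name in
        *.) echo "$name"; exit 0 ;;
    esac
    # Count the dots in the name; the arithmetic strips wc's padding.
    dots=$(( $(printf '%s' "$name" | tr -cd '.' | wc -c) ))
    # Enough dots: the name as given goes first.
    if [ "$dots" -ge "$ndots" ]; then
        echo "$name."
    fi
    for domain in "$@"; do
        echo "$name.$domain."
    done
    # Too few dots: the name as given is the last resort.
    if [ "$dots" -lt "$ndots" ]; then
        echo "$name."
    fi
)
```

With ndots set to 5, the value Kubernetes usually writes into a pod's resolv.conf, search_candidates db 5 payments.svc.cluster.local svc.cluster.local cluster.local lists db.payments.svc.cluster.local. first and the bare db. last, which is why the pod finds a Service and the laptop does not.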
Pitfalls
dig does not see what your application sees. This is the big one.
Applications resolve names through getaddrinfo(), which obeys
/etc/nsswitch.conf: typically "check /etc/hosts first, then
DNS," with mDNS or other plugins sometimes in between. dig ignores every
bit of that and goes straight to a DNS server. So when someone left a stale override in
/etc/hosts, the app faithfully connects to the wrong address while
dig swears the DNS is perfect, because it is. The reverse happens too:
dig resolves a name that the app cannot, because the app's path goes
through a misconfigured systemd-resolved while dig queried
the network resolver directly. The arbiter is getent hosts shop.example.com,
which resolves through the same libc path the application uses. When getent
and dig disagree, the difference between their two paths is the
bug, and the diagram below is the map of where to look.
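The getent-versus-dig comparison is quick to automate. A minimal sketch: libc_vs_dns is a hypothetical helper, it looks only at the first address getent returns, and it treats the paths as agreeing when that address appears among the A or AAAA answers from dig.

```shell
# libc_vs_dns NAME: resolve NAME through the libc path (getent) and through
# raw DNS (dig), then say whether the libc answer is among the DNS answers.
libc_vs_dns() (
    name=$1
    # getent hosts prints "ADDRESS NAME..."; keep the first address.
    libc=$(getent hosts "$name" | awk 'NR == 1 { print $1 }')
    # Collect both address families from DNS, one address per line.
    dns=$( { dig +short "$name" A; dig +short "$name" AAAA; } )
    echo "libc: ${libc:-<none>}"
    echo "dns:  $(echo "$dns" | paste -sd' ' -)"
    if [ -n "$libc" ] && echo "$dns" | grep -qxF "$libc"; then
        echo "paths agree"
    else
        echo "paths DISAGREE: check /etc/hosts, nsswitch.conf, systemd-resolved"
        return 1
    fi
)
```

A disagreement does not say which side is wrong, only that something between libc and the DNS server is rewriting the answer.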
The TTL you see is local truth, not global truth. A cache shows you its own remaining countdown, which says nothing about what other caches hold; the record can be fresh in Frankfurt and stale in Singapore at the same instant. And resolvers are not contractually bound to your TTL. Some clamp very low TTLs up to a floor to protect themselves, some cap very long ones, a few misbehaving ones hold records past expiry, and application-level caches sit above all of this with their own clocks. Treat the TTL as a strong default, not a guarantee, and treat "dig @the-authority" as the only answer that is not somebody's memory.
+trace answers a different question than you usually have. Because it
walks from the root itself, +trace bypasses every cache on purpose. It
shows the delegation as the world's authorities currently publish it, which is exactly
right for "did the NS change at the registrar take" and exactly wrong for "what are my
users seeing right now," since users sit behind recursive caches that
+trace never consults. The pairing to remember: +trace for
delegation truth, @resolver for cache truth. You frequently need both,
and they frequently disagree, and the disagreement is the finding.
UDP, truncation, and the occasional lie of omission. DNS prefers UDP,
and answers too large for the negotiated packet size come back truncated with the
tc flag set, at which point a correct client retries over TCP.
dig does the retry automatically and tells you (;; Truncated, retrying
in TCP mode), but some networks block TCP/53 outright, producing answers that
work for small records and fail for large ones, a failure mode that looks supernatural
until you spot the tc. dig +tcp forces the issue and turns a
suspicion into a one-line test.
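Both halves of that test fit in one helper. The sketch below uses dig's +ignore option to keep a truncated UDP reply rather than retrying, then forces TCP separately; transport_check is a hypothetical helper, and the three-second timeout is arbitrary.

```shell
# transport_check RESOLVER NAME [TYPE]
# Ask once over UDP without the automatic TCP retry, then once over TCP.
transport_check() (
    resolver=$1
    name=$2
    rtype=${3:-A}
    # +ignore keeps a truncated UDP reply instead of retrying over TCP.
    if ! udp=$(dig +ignore +notcp +time=3 +tries=1 "@$resolver" "$name" "$rtype" 2>/dev/null); then
        echo "udp: no reply"
    elif echo "$udp" | grep -q '^;; flags:.* tc[ ;]'; then
        echo "udp: truncated (tc set), full answer needs TCP"
    else
        echo "udp: answer fit in one packet"
    fi
    # +tcp forces TCP; a failure here points at TCP/53 being blocked.
    if dig +tcp +time=3 +tries=1 "@$resolver" "$name" "$rtype" >/dev/null 2>&1; then
        echo "tcp: ok"
    else
        echo "tcp: FAILED (is TCP/53 blocked?)"
    fi
)
```

A truncated UDP reply paired with a failed TCP query is the combination that explains small records working while large ones fail.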
A drill you can run right now
Everything below is read-only: queries against public names, nothing changed anywhere. Ten minutes, and the three ideas this page keeps circling — ask a specific resolver, read the SERVER line, watch the TTL — become things you have personally observed.
Step 1 — the baseline and the full read. Pick any site and run
dig wikipedia.org, with no options at all. Read it top to bottom against
the section decoder above: the status, each flag (you should see qr rd ra
and no aa), the question played back, the answer with its TTL, and the
footer. Say out loud which server answered. If the SERVER line shows
127.0.0.53, you now know your machine runs a systemd-resolved stub and
everything you just read came from localhost.
Step 2 — same question, different resolver. Run
dig @1.1.1.1 wikipedia.org and compare. The SERVER line changes, the
query time probably grows from near-zero to a real network round trip, and the TTL is
different, because Cloudflare's cache fetched the record at a different moment than
yours did. If the addresses themselves differ, you have caught a real propagation gap,
or a CDN giving different answers by geography, in the wild on a Tuesday.
Step 3 — walk the chain yourself. Run
dig +trace wikipedia.org and watch the referrals go by: a root server
naming the .org servers, a .org server naming wikipedia's
nameservers, and finally the answer, this time with the aa flag set
because it came from the authority itself. That is the entire global DNS hierarchy,
traversed in front of you in under a second.
Step 4 — watch a cache count down. Ask the same resolver twice, a pause apart:
    $ dig wikipedia.org | grep -A1 'ANSWER SECTION'
    ;; ANSWER SECTION:
    wikipedia.org.  246  IN  A  185.15.59.224
    $ sleep 30; dig wikipedia.org | grep -A1 'ANSWER SECTION'
    ;; ANSWER SECTION:
    wikipedia.org.  216  IN  A  185.15.59.224
    $ dig @1.1.1.1 wikipedia.org +short
    185.15.59.224
Thirty seconds of wall clock, thirty fewer seconds of TTL. You are watching the
resolver's cache age in real time. Keep querying and the number walks down to zero,
then jumps back to the zone's full TTL as the cache re-fetches from upstream. That
countdown is the entire mechanism behind every "propagation" delay you will ever
debug, observed directly. As a final flourish, run
getent hosts wikipedia.org and confirm the libc path agrees with
dig on this machine; the day those two disagree, you will know exactly
what kind of bug you have and which diagram to pull up.
dig @resolver name type is the
whole tool: vary the resolver to find out who is holding what, read the status and the
aa flag to learn how much to trust it, and read the TTL to learn how long
the situation will last.
Further reading
- dig(1) — the manual page — the full option list; the query options section (everything starting with +) is the part worth a careful pass.
- RFC 1034 — Domain names: concepts and facilities — the original design document, and still the clearest statement of what the hierarchy and caching are for.
- RFC 2308 — Negative caching of DNS queries — why NXDOMAIN answers persist after the fix, straight from the source.
- Julia Evans — "DNS doesn't propagate" — a short, sharp dismantling of the propagation metaphor that pairs well with this page's first scenario.
- Semicolony — How DNS works — the end-to-end story this page's "underneath" section compresses.