journalctl & dmesg
A service keeps restarting and its own log file ends mid-sentence. A box rebooted at 3am and nobody asked it to. A process died and left no farewell at all. Your application logs tell you what your code thinks happened; the journal and the kernel ring buffer tell you what the operating system knows happened, including the parts your code never got to see. This page covers the five invocations that do the daily work, decodes a crash-restart cycle and the famous OOM-killer block line by line, walks three production incidents, and ends with a drill that is safe on any machine.
The question they answer
Every debugging session starts with logs, and most engineers stop at the first kind: the application's own output. That works right up until the moment it cannot. When the kernel kills a process with SIGKILL, the process gets no chance to flush a buffer, write a stack trace, or say goodbye; the signal is not catchable, and the log simply stops. When a machine loses power or panics, the application log ends wherever the last flushed write landed. When a disk starts returning errors, the application sees timeouts and retries and reports them as its problem, because it has no idea the hardware underneath is failing. In all three cases the truth was recorded — just not by your code.
Linux keeps two system-level records, and they answer slightly different questions. The
journal, kept by systemd-journald, is the userspace record: it
captures what every service printed to stdout and stderr, what systemd itself observed about
those services starting, stopping, crashing, and being restarted, plus everything sent
through the old syslog interface. You read it with journalctl. The
kernel ring buffer is the kernel's own record: hardware events, driver
messages, filesystem complaints, network interface state changes, and the verdicts of kernel
subsystems like the OOM killer. You read it with dmesg, or with
journalctl -k, because the journal ingests kernel messages too.
The shift in thinking is small but it changes what you do first during an incident. The application log answers "what was my code doing?" The journal answers "what did the service manager see happen to my process?" The kernel buffer answers "what did the machine itself experience?" A service that "randomly dies" is rarely random once you read the second and third records: there is almost always a line, written by systemd or the kernel, that states exactly who killed it and why. The skill this page teaches is finding that line quickly, and then being able to actually read it, because the most important kernel messages — the OOM block above all — are written in a dialect nobody teaches.
The five invocations that matter
Like most of the systemd surface, journalctl has a long manual and a short
working set. Five invocations cover nearly everything you will do with it and with
dmesg in a normal month of operations.
| Invocation | What it shows | When you reach for it |
|---|---|---|
| journalctl -u nginx --since "1 hour ago" | One unit's log, time-windowed | The everyday move. Any "what is this service doing" question starts here. |
| journalctl -fu nginx | Live follow of one unit | Watching a deploy land, tailing during an incident: tail -f for services |
| journalctl -b -1 -e | The previous boot, jumped to the end | Post-crash forensics: what were the last things written before the machine went down |
| journalctl -p err -b | Everything at priority error or worse, this boot | The triage sweep: one command, every complaint the system considered serious |
| journalctl -k / dmesg -T | Kernel messages only, with readable timestamps | Hardware, drivers, filesystems, OOM kills: anything below userspace |
A few notes on the grammar, because each flag hides a little more than it shows. The
--since and --until filters accept both human phrases
("1 hour ago", yesterday, today) and timestamps
("2026-06-08 03:00"), and combining a unit with a window is the single highest
value habit here: an unfiltered journalctl on a long-lived box will happily
page you through months of history. The -e flag jumps the pager to the end,
which is where the interesting lines usually are; -n 200 limits output to the
last two hundred entries, and adding --no-pager prints them straight to the terminal.
The -b selector takes an offset: -b alone means the current boot,
-b -1 the one before it, -b -2 the one before that, and
journalctl --list-boots prints the full catalogue with start and end times.
This is the flag that turns the journal from a log viewer into a forensic instrument,
because "what happened before the reboot" is otherwise a genuinely hard question.
Priority filtering with -p uses the eight syslog levels, and the filter means
"this level and worse": -p err shows err, crit, alert, and emerg. The
levels, from loudest to quietest: emerg (0), alert (1),
crit (2), err (3), warning (4), notice
(5), info (6), debug (7). In practice -p err is the
triage sweep and -p warning is the slightly paranoid version of it.
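The "this level and worse" rule is easy to get backwards, because the most severe level has the lowest number. A minimal sketch of the rule in Python, using the level names and numbers listed above; the function name levels_shown is made up for illustration and is not part of journalctl:

```python
# The eight syslog priorities in numeric order: index 0 (emerg) is the most severe.
SYSLOG_LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def levels_shown(threshold):
    """Return the levels that a `-p <threshold>` filter lets through."""
    cutoff = SYSLOG_LEVELS.index(threshold)
    # "This level and worse" means every level numbered at or below the threshold.
    return SYSLOG_LEVELS[: cutoff + 1]


print(levels_shown("err"))      # ['emerg', 'alert', 'crit', 'err']
print(levels_shown("warning"))  # the same four, plus 'warning'
```

So -p warning is a superset of -p err: it adds exactly one level to the sweep.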
dmesg prefixes every line with
seconds since boot, like [1834502.114866], which is precise and unreadable.
dmesg -T converts to wall-clock time. The conversion has a known wrinkle —
covered in the pitfalls — but for "when did the disk start complaining," readable beats
exact, and journalctl -k sidesteps the issue entirely by stamping kernel
messages with real receipt times.
Reading the output
Here is the journal around a service crash — the thing you will actually be staring at when
someone says "payments keeps restarting." Run as a user in the systemd-journal
group or with sudo; without either you only see your own user's entries.
    $ sudo journalctl -u payments --since "12:00" --until "12:06"
    Jun 08 12:04:31 web-3 payments[1234]: request completed route=/charge status=200 dur=41ms
    Jun 08 12:04:32 web-3 payments[1234]: request completed route=/charge status=200 dur=39ms
    Jun 08 12:04:40 web-3 systemd[1]: payments.service: A process of this unit has been killed by the OOM killer.
    Jun 08 12:04:40 web-3 systemd[1]: payments.service: Main process exited, code=killed, status=9/KILL
    Jun 08 12:04:40 web-3 systemd[1]: payments.service: Failed with result 'oom-kill'.
    Jun 08 12:04:50 web-3 systemd[1]: payments.service: Scheduled restart job, restart counter is at 7.
    Jun 08 12:04:50 web-3 systemd[1]: Started payments.service - Payments API.
The line shape is fixed: timestamp, hostname, then an identifier with a PID in brackets,
then the message. The identifier is the tell. Lines tagged payments[1234] are
the service's own stdout and stderr, captured by journald — your application talking. Lines
tagged systemd[1] are the service manager talking about your service
from the outside, and those are the ones that survive a crash, because PID 1 does not die
when your process does. Interleaving the two voices in one timeline is the journal's whole
value: the application's last healthy request at 12:04:32, then eight seconds of silence,
then the outside view of its death.
Decode the death line itself. code=killed means the process did not exit; it
was terminated by a signal. status=9/KILL names the signal: 9, SIGKILL, the
uncatchable one. A process that exits on its own after an unhandled error looks different
(code=exited, status=1); a segfault different again
(status=11/SEGV); a graceful stop that overran its timeout shows
status=9/KILL too, but only after a Stopping... line and a
State 'stop-sigterm' timed out complaint before it. The pattern above —
no stop request, straight to SIGKILL, with an explicit oom-kill result on a recent systemd —
points one direction, and the kernel buffer has the rest of the story. On older systems the
oom-kill attribution lines are absent and all you get is the bare
status=9/KILL, which is exactly when you need journalctl -k.
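The signatures above can be turned into a small classifier. This is a sketch rather than a complete parser: the regular expression covers only the "Main process exited" wording shown in the excerpt, the function name classify_exit and its return strings are invented for illustration, and whether a stop request preceded the kill is passed in by the caller because it lives on a different journal line.

```python
import re

# Matches the tail of systemd's "Main process exited" line; the "/NAME" part is optional.
EXIT_RE = re.compile(r"Main process exited, code=(\w+), status=(\d+)(?:/(\w+))?")


def classify_exit(line, saw_stop_request=False):
    """Map one systemd exit line to a plain-language ending (illustrative labels)."""
    match = EXIT_RE.search(line)
    if match is None:
        return "not an exit line"
    code, status, name = match.group(1), int(match.group(2)), match.group(3)
    if code == "exited":
        return f"exited on its own with status {status}"
    if name == "SEGV":
        return "segfault"
    if name == "KILL":
        if saw_stop_request:
            return "stop timed out, systemd escalated to SIGKILL"
        return "killed from outside: check journalctl -k for an OOM block"
    return f"terminated by signal {status} ({name})"


print(classify_exit("payments.service: Main process exited, code=killed, status=9/KILL"))
```

The saw_stop_request flag is the whole difference between "someone asked it to stop and it dragged its feet" and "something killed it unprompted", which is why it is worth scanning for a Stopping... line before trusting a bare status=9/KILL.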
The OOM block, decoded
When the kernel runs out of reclaimable memory, it picks a process and kills it, and it
writes a long, dense block into the ring buffer explaining itself. Almost everyone has
scrolled past this block; very few people can read it. Here is the abbreviated shape, via
journalctl -k around the same timestamp:
    $ sudo journalctl -k --since "12:04" --until "12:05"
    Jun 08 12:04:40 web-3 kernel: java invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
    Jun 08 12:04:40 web-3 kernel: Mem-Info:
    Jun 08 12:04:40 web-3 kernel: active_anon:3801504 inactive_anon:62110 isolated_anon:0 ...
    Jun 08 12:04:40 web-3 kernel: Tasks state (memory values in pages):
    Jun 08 12:04:40 web-3 kernel: [ pid ] uid tgid total_vm rss ... oom_score_adj name
    Jun 08 12:04:40 web-3 kernel: [ 612] 0 612 73410 1922 ... 0 rsyslogd
    Jun 08 12:04:40 web-3 kernel: [ 1234] 998 1234 9437184 3145728 ... 0 java
    Jun 08 12:04:40 web-3 kernel: Out of memory: Killed process 1234 (java) total-vm:37748736kB, anon-rss:12582912kB, file-rss:1024kB, shmem-rss:0kB, UID:998 pgtables:25600kB oom_score_adj:0
Read it top to bottom. The first line names the invoker, not the victim:
java invoked oom-killer means java happened to be the process whose memory
allocation could not be satisfied, which tripped the killer. The invoker and the victim are
often the same process on a single-tenant box, but they do not have to be — an innocent
16 MB allocation by rsyslogd can invoke the killer, which then chooses the 12 GB
java process as the victim. Blaming the invoker is the classic misreading. The
gfp_mask and order describe the allocation that failed
(order=0 is a single 4 KiB page, the most ordinary request there is — the
machine was genuinely out, not fragmented).
Then comes the task table, a census of every candidate at the moment of the
kill, and its trap is in the header: memory values in pages, not bytes or
kilobytes. One page is 4 KiB on most systems, so java's rss of 3145728
pages is 12 GiB, and its total_vm of 9437184 pages is 36 GiB. The
table is where you see the whole field: who else was big, what their
oom_score_adj was, and whether the victim was really the heaviest process or
just the heaviest one without a protective score.
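The page arithmetic is worth doing once by hand so the table stops looking like noise. A minimal sketch, assuming the common 4 KiB page size (getconf PAGESIZE reports the real value on a given machine); pages_to_gib is a hypothetical helper name:

```python
PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages, the usual size on x86-64


def pages_to_gib(pages, page_kib=PAGE_SIZE_KIB):
    """Convert a page count from the OOM task table into GiB."""
    return pages * page_kib / (1024 * 1024)


print(pages_to_gib(3145728))  # java's rss      -> 12.0
print(pages_to_gib(9437184))  # java's total_vm -> 36.0
```

Both results agree with the verdict line's kilobyte figures, which is a useful cross-check that the page size assumption holds.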
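Finally the verdict line, the one everyone has seen and few can parse:
    Out of memory: Killed process 1234 (java) total-vm:37748736kB, anon-rss:12582912kB, file-rss:1024kB, shmem-rss:0kB, UID:998 pgtables:25600kB oom_score_adj:0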
The distinction that matters: total-vm is virtual address space — every mapping
the process ever made, most of it never backed by physical memory. A JVM or a Go runtime
reserves enormous virtual ranges as a matter of course, so a huge total-vm proves nothing.
anon-rss is anonymous resident memory: heap, stacks, the pages the process was
genuinely holding in RAM with no file behind them. That 12 GB is what the process was
actually costing the machine, and it is the number to put in the incident writeup.
file-rss and shmem-rss are file-backed and shared pages, which the
kernel could mostly have reclaimed without killing anything, which is why they are usually
small in these reports. The deeper machinery — what counts as reclaimable, how the oom score
is computed, why the kernel overcommits in the first place — lives in
memory management.
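For an incident writeup it helps to pull the numbers out of the verdict line mechanically instead of squinting at it. The sketch below parses the exact line from the kernel excerpt above; the field layout differs somewhat between kernel versions, so the regular expressions are assumptions fitted to this one sample, and parse_verdict is a hypothetical name.

```python
import re

# The verdict line from the kernel excerpt above.
VERDICT = (
    "Out of memory: Killed process 1234 (java) total-vm:37748736kB, "
    "anon-rss:12582912kB, file-rss:1024kB, shmem-rss:0kB, UID:998 "
    "pgtables:25600kB oom_score_adj:0"
)


def parse_verdict(line):
    """Extract the victim's PID, name, and every kB-valued field."""
    head = re.search(r"Killed process (\d+) \(([^)]*)\)", line)
    if head is None:
        raise ValueError("not an OOM verdict line")
    # Collect "name:12345kB" pairs; fields without a kB suffix (UID, oom_score_adj) are skipped.
    sizes = {key: int(value) for key, value in re.findall(r"([\w-]+):(\d+)kB", line)}
    return {"pid": int(head.group(1)), "name": head.group(2), "kb": sizes}


info = parse_verdict(VERDICT)
anon_gib = info["kb"]["anon-rss"] / (1024 * 1024)
print(info["name"], info["pid"], f"anon-rss = {anon_gib:.1f} GiB")
```

The value worth reporting is anon-rss, as argued above; total-vm is parsed too, but only so it can be ignored on purpose.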
Three production scenarios
"The service randomly restarts"
The report arrives as a mystery: payments has restarted six times today, the application
log shows normal traffic and then a gap, the dashboards show the gaps but not the cause.
"Randomly" is doing a lot of work in that sentence. Start with the unit's journal over the
window, exactly as in the excerpt above, and look for the systemd lines between the gaps.
There are only a few endings a process can have, and each leaves a distinct signature:
code=exited, status=1 means it crashed on its own and the reason should be in
its last stdout lines; status=11/SEGV means a segfault;
code=killed, status=9/KILL with no stop request means something outside the
process killed it, and the two usual suspects are the kernel OOM killer and a cgroup memory
limit.
Confirm with the kernel record: journalctl -k --since "12:00" and look for the
OOM block. One detail worth reading carefully: a global OOM kill says
Out of memory: Killed process, while a kill caused by the service's own
MemoryMax= cgroup limit says Memory cgroup out of memory instead.
The first means the machine was out of RAM and the fix is capacity or a leak hunt; the
second means the machine was fine and the service simply hit its configured ceiling, and the
fix is the ceiling or the workload. Engineers conflate these constantly and the remediation
is different for each. For the leak hunt itself — who was growing, how fast, anonymous or
cache — the working method is in
what's eating my memory? with
the supporting numbers in free & vmstat.
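Because the two kinds of kill lead to different fixes, it can be worth encoding the distinction in whatever script triages these lines. A minimal sketch that relies only on the two phrases quoted above; oom_kind and its return labels are invented for illustration:

```python
def oom_kind(kernel_line):
    """Classify a kernel message as a cgroup-limit kill, a global OOM kill, or neither."""
    text = kernel_line.lower()
    # Test the longer phrase first: the cgroup message also contains "out of memory".
    if "memory cgroup out of memory" in text:
        return "cgroup limit"
    if "out of memory" in text:
        return "global"
    return None


print(oom_kind("Out of memory: Killed process 1234 (java)"))                # global
print(oom_kind("Memory cgroup out of memory: Killed process 1234 (java)"))  # cgroup limit
```

The order of the two checks matters: swapping them would label every cgroup kill as a global one, which is the same conflation the paragraph above warns about.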
The box rebooted at 3am — why?
Uptime says four hours, nobody deployed anything, and the monitoring gap matches. The
current boot's journal cannot help, because the answer is in the boot that died. This is
what -b -1 is for:
    $ journalctl --list-boots | tail -3
    -2 b9c1... Tue 2026-06-02 09:11:02 UTC - Thu 2026-06-04 22:40:18 UTC
    -1 4f7a... Thu 2026-06-04 22:41:30 UTC - Mon 2026-06-08 03:12:44 UTC
     0 d20e... Mon 2026-06-08 03:14:09 UTC - Mon 2026-06-08 07:02:51 UTC
    $ journalctl -b -1 -e
Read the last lines of the dead boot and they tell you which kind of death it was. A clean
shutdown leaves a paper trail: Stopped target Multi-User System, units stopping
one by one, Reached target System Reboot. If you see that, something
asked for the reboot — a human, an unattended-upgrades job, a cloud provider's
maintenance event — and the same journal usually names it a page up. The other signature is
the absence of one: ordinary chatter at 03:12:44 and then nothing, no shutdown sequence, the
record simply ends. That is a hard stop — power loss, a hardware fault, a hang followed by a
watchdog reset, or a kernel panic. A panic usually does not appear in the journal at all,
for the bleak reason that the process that writes the journal to disk died with everything
else; capturing panics takes pstore or kdump, which is its own topic. But even the abrupt
ending is information: check journalctl -b -1 -k -e for hardware complaints in
the final minutes, machine-check errors, or thermal warnings, and you have either a lead or
a clean bill that points at power. What the machine does from power-on to that first journal
line is walked step by step in the Linux boot
simulator.
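The clean-versus-abrupt test can be sketched as a check over the last lines of the dead boot, for example the output of journalctl -b -1 -n 50 --no-pager. The marker phrases are the two quoted above; their exact wording varies between systemd versions, so treat the pattern as an assumption to adjust, and note that the sample lines and the name how_boot_ended are made up for illustration.

```python
import re

# Assumed markers, based on the shutdown messages quoted above; wording varies by systemd version.
SHUTDOWN_RE = re.compile(
    r"(Stopped target|Reached target).*(Multi-User System|System Reboot)"
)


def how_boot_ended(last_lines):
    """Return 'clean shutdown' if a shutdown sequence is visible, else 'hard stop'."""
    if any(SHUTDOWN_RE.search(line) for line in last_lines):
        return "clean shutdown"
    return "hard stop"


# Hypothetical tails of two different dead boots.
clean_tail = [
    "systemd[1]: Stopped target Multi-User System.",
    "systemd[1]: Reached target System Reboot.",
]
abrupt_tail = [
    "payments[1234]: request completed route=/charge status=200 dur=40ms",
    "CRON[4410]: (root) CMD (run-parts /etc/cron.hourly)",
]

print(how_boot_ended(clean_tail))   # clean shutdown
print(how_boot_ended(abrupt_tail))  # hard stop
```

A "hard stop" result is only the absence of evidence, so it narrows the cause to the power, hardware, hang, or panic family without choosing among them.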
The disk that announced its death for weeks
Storage rarely fails without warning; it fails after weeks of warnings nobody read. The kernel logs every failed I/O against a block device, and those lines accumulate in the ring buffer and the journal long before the filesystem gives up:
    $ sudo dmesg -T | grep -iE "i/o error|ata[0-9]|exception"
    [Tue Jun 2 04:12:09 2026] ata3.00: exception Emask 0x0 SAct 0x400 SErr 0x0 action 0x0
    [Tue Jun 2 04:12:09 2026] ata3.00: failed command: READ FPDMA QUEUED
    [Tue Jun 2 04:12:09 2026] blk_update_request: I/O error, dev sdb, sector 488282112
    [Sat Jun 6 11:38:51 2026] blk_update_request: I/O error, dev sdb, sector 488282113
    [Mon Jun 8 02:55:17 2026] EXT4-fs error (device sdb1): ext4_find_entry:1463: inode #2883585: comm java: reading directory lblock 0
The progression reads like a diagnosis. An ATA exception and a failed read command is the
drive struggling with a sector; blk_update_request: I/O error is the block
layer giving up on it after retries; the same or neighbouring sector numbers recurring
across days means a growing defect, not a one-off; and the EXT4-fs error at the
end is the filesystem finally tripping over the bad region — often the first moment an
application notices anything. The useful habit is checking for the first three signatures
while they are still cheap: journalctl -k -p err --since "-7 days" as a weekly
glance, or better, shipping kernel-priority errors into the alerting pipeline so a human
never has to remember. How system logs feed that pipeline is the subject of
logs, metrics & traces.
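If the weekly glance is going to be automated, the useful signal is errors that keep landing on the same device in the same neighbourhood. A rough sketch that groups I/O errors by device from kernel log lines; the regular expression matches only the "I/O error, dev X, sector N" wording in the excerpt above, and the eight-sector clustering window is an arbitrary assumption for illustration, not a kernel or SMART threshold.

```python
import re
from collections import defaultdict

IO_ERROR_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def io_errors_by_device(lines):
    """Map each device name to the list of sectors that reported an I/O error."""
    sectors = defaultdict(list)
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match:
            sectors[match.group(1)].append(int(match.group(2)))
    return dict(sectors)


def looks_like_growing_defect(sector_list, window=8):
    """Hypothetical heuristic: two or more errors within `window` sectors of each other."""
    return len(sector_list) >= 2 and max(sector_list) - min(sector_list) <= window


# The two block-layer lines from the excerpt above, plus one unrelated line.
sample = [
    "[Tue Jun 2 04:12:09 2026] blk_update_request: I/O error, dev sdb, sector 488282112",
    "[Sat Jun 6 11:38:51 2026] blk_update_request: I/O error, dev sdb, sector 488282113",
    "[Sat Jun 6 11:40:02 2026] usb 1-3: new high-speed USB device number 9 using xhci_hcd",
]

errors = io_errors_by_device(sample)
print(errors)                                    # {'sdb': [488282112, 488282113]}
print(looks_like_growing_defect(errors["sdb"]))  # True
```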
What's underneath
The two tools make more sense once you see the plumbing they sit on. There are two stores and several inputs, and almost everything on a systemd machine flows through one daemon.
Start at the bottom of the stack. The kernel ring buffer is a fixed-size
circular buffer inside kernel memory, typically somewhere between a few hundred kilobytes
and a few megabytes (the build-time default can be raised with the log_buf_len boot
parameter), where every printk() from every driver and subsystem
lands. Circular means it wraps: on a chatty system, messages from early boot get overwritten
by lunchtime, which is why dmesg on a long-running machine sometimes cannot
show you the boot sequence at all. It survives nothing — a reboot clears it — and it has no
notion of units, users, or priorities beyond the kernel log levels. dmesg is a
thin window onto exactly this buffer and nothing else.
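The wrapping behaviour is easy to picture with a toy model. The sketch below uses a bounded deque as a stand-in for the ring buffer; the real buffer is sized in bytes rather than in messages, so the capacity of four entries is purely illustrative.

```python
from collections import deque

# Toy ring buffer: holds at most four messages, silently dropping the oldest.
ring = deque(maxlen=4)

for n in range(1, 7):
    ring.append(f"message {n}")

print(list(ring))  # ['message 3', 'message 4', 'message 5', 'message 6']
```

Messages 1 and 2 are gone without any error or marker, which is the same way early boot output disappears from dmesg on a busy machine.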
One layer up, journald is the collector. It reads the kernel buffer through
/dev/kmsg, so kernel messages end up in the journal too. It owns the stdout and
stderr of every systemd unit — when systemd starts a service, it wires the process's file
descriptors 1 and 2 to a socket journald listens on, which is why services on modern systems
just print to stdout and let the system handle the rest. And it listens on
/dev/log, the socket behind the ancient syslog() API, so software
written decades before systemd flows in as well. For every entry, from any source, journald
records not just the message but a set of trusted fields the sender cannot fake:
_PID, _UID, _SYSTEMD_UNIT, _BOOT_ID, and the timestamp of receipt, alongside
sender-supplied fields such as PRIORITY. The store is a binary, indexed format,
which is what makes journalctl -u nginx --since "1 hour ago" an indexed lookup
rather than a scan through flat text files, and what makes -b -1 possible at
all, since every entry carries the boot it belongs to.
Where the journal lives decides whether it survives a reboot, and this is configuration, not
fate. With Storage=volatile the journal sits in /run/log/journal,
a tmpfs, and vanishes with the boot. With Storage=persistent it lives in
/var/log/journal and accumulates across boots. The common default,
Storage=auto, persists only if the /var/log/journal directory
already exists — a rule with consequences covered in the pitfalls. On fleets, journald is
usually the first hop rather than the destination: it forwards to rsyslog or a shipper, and
the entries become part of the centralised pipeline described in
logs, metrics & traces.
The journal is the ground truth on the box; the pipeline is how anyone finds it without
SSHing to the box.
Pitfalls
Assuming the journal survived the reboot. The cruellest discovery in a
post-crash investigation: you run journalctl -b -1 and get
Specifying boot ID or boot offset has no effect, no persistent journal was found.
On distros where /var/log/journal does not exist out of the box,
Storage=auto quietly means volatile, and every reboot shreds the evidence. Check
now, on a calm day: journalctl --list-boots showing only the current boot is the
tell. The fix is one line in /etc/systemd/journald.conf
(Storage=persistent) or simply creating the directory, then restarting journald.
Do it before the incident, because afterwards is too late by definition.
Forgetting the journal eats itself. Persistence is not forever. journald
caps its disk usage (by default 10% of the filesystem, up to 4 GiB, tunable with
SystemMaxUse=) and vacuums the oldest entries when it hits the cap. On a chatty
box "persistent" can mean ten days, and the boot you wanted from last month is gone. Check
what you actually have with journalctl --disk-usage and the timestamps in
--list-boots; trim deliberately with --vacuum-time=30d or
--vacuum-size=2G rather than letting the default decide which evidence to keep.
Trusting dmesg timestamps too much. Raw dmesg stamps lines in
seconds since boot. dmesg -T converts to wall-clock by adding the boot time —
but the kernel clock those stamps come from does not tick while the machine is suspended, so
on laptops and suspend-happy VMs the converted times drift by exactly the total suspended
time, sometimes hours. On a server that never sleeps, -T is fine. When the exact
time matters, prefer journalctl -k: journald stamps each kernel message with the
real time it received it.
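The drift is simple arithmetic once the two clocks are separated. A simplified model of the conversion described above, with a hypothetical boot time and a hypothetical two hours of suspend; dmesg_wall_clock is an illustrative name, not a function from any library:

```python
from datetime import datetime, timedelta


def dmesg_wall_clock(boot_time, seconds_since_boot):
    """Simplified model of `dmesg -T`: boot time plus the kernel's own stamp."""
    return boot_time + timedelta(seconds=seconds_since_boot)


boot = datetime(2026, 6, 1, 8, 0, 0)  # hypothetical boot time
stamp = 3600.0                        # kernel stamp: one hour of running time
suspended = timedelta(hours=2)        # hypothetical time spent suspended before the event

shown = dmesg_wall_clock(boot, stamp)  # what dmesg -T would print
actual = shown + suspended             # when the event really happened

print("dmesg -T shows:", shown)   # 2026-06-01 09:00:00
print("really happened:", actual)  # 2026-06-01 11:00:00
```

The kernel stamp only counted the hour the machine was awake, so the converted time lands two hours early, which is exactly the total time spent suspended.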
Grepping the journal instead of querying it. Piping
journalctl | grep nginx works, but it forces a render of the entire journal just
to throw most of it away, and it matches the text nginx anywhere — including some
other service complaining about nginx. The journal is a database with indexed fields; ask it
like one. -u nginx matches the unit field, _PID=1234 follows one
process, -p err filters by priority, --grep searches message text
while keeping the other filters cheap, and -o json emits the full field set when
a script is the consumer. The field-based query is faster, and more importantly it is
precise: -u cannot be fooled by a coincidental substring.
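When a script is the consumer, -o json emits one JSON object per entry with the field names intact. A minimal sketch of reading one such entry; the sample line is hand-written for illustration and real entries carry many more fields. journald encodes values as strings in this output, including numbers, so they are converted explicitly. The helper name summarize is hypothetical.

```python
import json
from datetime import datetime, timezone

# Hand-written sample in the shape of `journalctl -u payments -o json` output.
SAMPLE = (
    '{"_PID": "1234", "_SYSTEMD_UNIT": "payments.service", "PRIORITY": "6", '
    '"MESSAGE": "request completed route=/charge status=200 dur=41ms", '
    '"__REALTIME_TIMESTAMP": "1780920271000000"}'
)


def summarize(json_line):
    """Pick out the fields a triage script usually wants from one journal entry."""
    entry = json.loads(json_line)
    micros = int(entry["__REALTIME_TIMESTAMP"])  # microseconds since the Unix epoch
    return {
        "unit": entry.get("_SYSTEMD_UNIT"),
        "pid": int(entry["_PID"]),
        "priority": int(entry.get("PRIORITY", 6)),  # assumption: treat a missing priority as info
        "time": datetime.fromtimestamp(micros / 1_000_000, tz=timezone.utc),
        "message": entry["MESSAGE"],
    }


summary = summarize(SAMPLE)
print(summary["unit"], summary["pid"], summary["priority"], summary["message"])
```

Filtering on summary["unit"] or summary["priority"] in code is the scripted equivalent of -u and -p, though it is cheaper to let journalctl do that filtering before the JSON is ever produced.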
A drill you can run right now
Everything below reads state and writes nothing. Ten minutes on any Linux machine — a server, a VM, a Raspberry Pi — and the three records this page is about stop being abstract: you will have read a unit's journal, swept a boot for complaints, and looked at the kernel's raw record with both timestamp formats.
Step 1 — one unit, one window. Run
journalctl -u ssh --since today (the unit is sshd on Fedora-family
systems; systemctl list-units --type=service shows what your box calls things).
Read the line shape against the anatomy above: timestamp, host, identifier with PID, message.
Find a line written by the service itself and, if the service has restarted recently, the
systemd[1] lines around it — the two voices in one timeline. If the output is
empty, that is a finding too: either nothing connected today, or you lack permission to
read system entries and need sudo.
Step 2 — the triage sweep. Run journalctl -b -p warning —
every entry of the current boot that the system filed at warning or worse. On a healthy
machine this is short and oddly interesting: a service that took two tries to start, a
firmware grumble, a misconfigured timer. Pick one entry and pull its context with
journalctl -u that-unit -e. Then run journalctl --list-boots and
journalctl --disk-usage and note what you actually have: how many boots of
history, how much disk. If you see exactly one boot, you have found the persistence pitfall
on your own machine, on a calm day, which is the cheapest possible way to find it.
Step 3 — the kernel's record, both clocks. Finish with the ring buffer:
    $ sudo dmesg | tail -5
    [1834502.114866] usb 1-3: new high-speed USB device number 9 using xhci_hcd
    [1834502.263531] usb 1-3: New USB device found, idVendor=0951, idProduct=1666
    $ sudo dmesg -T | tail -5
    [Mon Jun 8 14:02:11 2026] usb 1-3: new high-speed USB device number 9 using xhci_hcd
    [Mon Jun 8 14:02:11 2026] usb 1-3: New USB device found, idVendor=0951, idProduct=1666
    $ journalctl -k -e -n 5
The same events three ways: raw seconds-since-boot, -T's wall-clock conversion,
and the journal's own receipt timestamps. Compare the last two — on a machine that never
suspends they agree; on a laptop they may not, and now you know why. If plain
dmesg refuses without root, you have met kernel.dmesg_restrict,
a hardening default on many distros, and sudo is the answer. Read the five
lines you got, whatever they are: a USB device, a network interface flapping, a filesystem
mount. Each one is the kernel narrating its own life, in the same voice it will use on the
day it writes an OOM block or an I/O error with your pager on the other end.
The short version: journalctl -u SERVICE --since "1 hour ago" for what a service and its manager
saw, journalctl -b -1 -e for what the machine said before it went down, and
dmesg -T | tail for what the kernel is saying right now.
Further reading
- journalctl(1) — the manual page — the filtering section repays a careful read; most of the flags you will ever need are in the first third.
- systemd.journal-fields(7) — every field journald records per entry, including the trusted underscore-prefixed ones senders cannot fake.
- dmesg(1) — short, and the notes on timestamp conversion document the suspend drift in the kernel's own words.
- Semicolony — Logs, metrics & traces — where the journal fits once you have more than one machine and the question becomes fleet-wide.