Unsafe, raw pointers, UB
unsafe is not "turn the checks off". It unlocks exactly five operations, leaves every other rule running, and transfers one obligation from the compiler to you: uphold the invariants it can no longer verify. Safe Rust exists because someone wrote this layer correctly — Vec, Mutex, every FFI binding. This page covers what the keyword really permits, the catalogue of undefined behaviour, the aliasing models (Stacked and Tree Borrows) that define what pointers may do, and Miri, the interpreter that catches you when you're wrong.
Long read · the five superpowers through aliasing models, FFI, and Miri · references at the end
1 · The five superpowers — and nothing else
Inside an unsafe block you may do exactly five things that safe Rust forbids:
- Dereference a raw pointer (*const T / *mut T).
- Call an unsafe fn, including every extern FFI function unless its declaration marks it safe (unsafe extern blocks, 1.82+).
- Access or modify a mutable static.
- Implement an unsafe trait (Send, Sync, GlobalAlloc, …).
- Read a field of a union.
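As a rough illustration, the five operations can each be exercised in a few lines. This is a minimal sketch, not taken from any real crate: COUNTER, IntOrFloat, RawHandle, read_u32, and five_operations are hypothetical names, and the Send impl is only sound under the assumption stated in its comment.

```rust
// All names here are hypothetical; the point is one small item per operation.
static mut COUNTER: u32 = 0;

union IntOrFloat {
    i: u32,
    f: f32,
}

struct RawHandle(*mut u8);

// 4. Implement an unsafe trait.
// SAFETY (assumed for this sketch): the pointer is never dereferenced,
// so sending the handle to another thread cannot cause a data race.
unsafe impl Send for RawHandle {}

fn assert_send<T: Send>(_: &T) {}

/// # Safety
/// `p` must be non-null, aligned, and point at an initialised `u32`.
unsafe fn read_u32(p: *const u32) -> u32 {
    // SAFETY: guaranteed by this function's own contract.
    unsafe { *p } // 1. dereference a raw pointer
}

fn five_operations() -> (u32, u32, u32) {
    let x = 7u32;
    // SAFETY: `&x` is non-null, aligned, and initialised for the whole call.
    let a = unsafe { read_u32(&x) }; // 2. call an unsafe fn

    // SAFETY: this sketch is single-threaded, so nothing else touches COUNTER.
    let b = unsafe {
        COUNTER = 41; // 3. write a mutable static...
        COUNTER += 1;
        COUNTER // ...and read it back by value (no reference is formed)
    };

    let u = IntOrFloat { f: 1.0 };
    // SAFETY: every bit pattern is a valid u32.
    let c = unsafe { u.i }; // 5. read a union field

    let handle = RawHandle(std::ptr::null_mut());
    assert_send(&handle); // compiles only because of the unsafe impl above
    assert!(handle.0.is_null());

    (a, b, c)
}

fn main() {
    let (a, b, c) = five_operations();
    println!("{a} {b} {c:#x}");
}
```

Note that the static is only ever read and written by value. Taking a reference to a mutable static is a separate hazard that recent editions lint against.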
Everything else still applies. The borrow checker still runs on every reference; moves
are still tracked; types still check. The keyword's real meaning is a contract marker:
unsafe fn says "I have preconditions the type system can't express — caller
must read the docs"; an unsafe {} block says "I checked them".
Since the 2024 edition the two are properly separated: the body of an
unsafe fn no longer acts as one big implicit block, and the
unsafe_op_in_unsafe_fn lint warns by default on any unsafe operation
inside it that lacks its own unsafe {}.
/// Returns the element without bounds checking.
///
/// # Safety
/// `i` must be < `self.len()`.
pub unsafe fn get_unchecked(&self, i: usize) -> &T {
    // SAFETY: caller promises i < len, so the pointer stays in-bounds
    // of one allocation and points at an initialised T.
    unsafe { &*self.ptr.add(i) }
}
// Convention: every unsafe fn documents "# Safety"; every unsafe block
// carries a "// SAFETY:" comment discharging those obligations.
// clippy::undocumented_unsafe_blocks enforces the latter.
2 · Raw pointers — what they are and aren't
*const T and *mut T are addresses with a type attached, and
that's all: no lifetime, no aliasing claim, no non-null or alignment guarantee, allowed
to dangle. Creating one is safe; only the dereference is gated. The vocabulary
around them:
fn main() {
    let mut x = 42u64;
    let q: *mut u64 = &raw mut x;   // &raw (1.82+): no intermediate
                                    // reference is ever created,
                                    // important for packed fields
    let p: *const u64 = q;          // *mut T coerces to *const T and keeps
                                    // q's provenance. (&x coerces too, but a
                                    // pointer derived from a shared reference
                                    // is invalidated by the write through q
                                    // below, so reading it would be UB.)
    unsafe {
        *q += 1;                    // deref: unsafe
        println!("{}", *p);         // 43
        println!("{}", p.read());   // ptr::read: bitwise copy out
    }
    // Arithmetic is in units of T, and must stay within one allocation:
    let arr = [1i32, 2, 3];
    let base = arr.as_ptr();
    unsafe { println!("{}", *base.add(2)); }   // 3
    // NonNull<T>: a *mut T that's never null, covariant, and gives
    // Option<NonNull<T>> the niche; it is what Vec/Box/Rc use internally.
}
Two non-obvious rules. Pointer arithmetic (add/offset) is
itself UB if it leaves the allocation it started in (one-past-the-end is the only
exception) — the operation, not the later deref. And round-tripping pointers through
integers erases provenance information the optimiser tracks; the strict-provenance APIs
(addr(), with_addr(), stabilised in 1.84) exist so code can be
explicit about it.
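Both rules can be sketched in a few lines. The example below is illustrative only: sum_by_pointer and tag_roundtrip are hypothetical helpers, the tagging scheme assumes the pointee is aligned to more than one byte, and addr()/with_addr() need Rust 1.84 or later.

```rust
/// Sums a slice by walking raw pointers from the start to one-past-the-end.
fn sum_by_pointer(data: &[i32]) -> i32 {
    let mut p = data.as_ptr();
    // SAFETY: `data.len()` elements lie in one allocation, so the result is
    // at most one-past-the-end, which `add` permits. It is never dereferenced.
    let end = unsafe { p.add(data.len()) };
    let mut total = 0;
    while p != end {
        // SAFETY: p is in [start, end), so it points at a live i32, and
        // stepping by one element lands at most on `end`.
        unsafe {
            total += *p;
            p = p.add(1);
        }
    }
    total
}

/// Stores a one-bit tag in the low bit of a pointer and removes it again,
/// without ever casting the pointer to an integer and back.
fn tag_roundtrip(value: &u64) -> (bool, u64) {
    let p: *const u64 = value;
    // Assumption: u64 is aligned to more than one byte, so bit 0 is free.
    let tagged = p.with_addr(p.addr() | 1); // new address, same provenance
    let is_tagged = tagged.addr() & 1 == 1;
    let clean = tagged.with_addr(tagged.addr() & !1);
    // SAFETY: `clean` has the original address and provenance of `value`.
    (is_tagged, unsafe { *clean })
}

fn main() {
    println!("{}", sum_by_pointer(&[1, 2, 3]));
    println!("{:?}", tag_roundtrip(&99));
}
```

The tagged pointer is misaligned while the tag is set, which is fine as long as it is only stored and compared, never dereferenced.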
3 · The UB catalogue
Undefined behaviour is not "crashes" — it's "the optimiser may assume this never
happens", which means miscompilation that can surface anywhere, later, or only at
-O. The Reference's list of things unsafe code must never do, abridged to
what bites in practice:
- Data races — unsynchronised conflicting access from two threads.
- Dereferencing dangling or misaligned pointers, or out-of-bounds arithmetic.
- Breaking the aliasing rules, e.g. a &mut that aliases anything else live, or writing through a path derived from &T (outside UnsafeCell).
- Producing an invalid value, even transiently: a bool that's 3, a char above 0x10FFFF, an enum with no matching discriminant, a null/dangling reference or Box, an uninitialised integer. "Producing" includes a mere mem::transmute or an over-eager assume_init.
- Unwinding past a frame whose ABI doesn't allow it, e.g. calling a "C-unwind" function through a plain "C" declaration. A Rust panic escaping an extern "C" fn you define aborts the process since 1.81 rather than unwinding; use extern "C-unwind" (stable 1.71) when unwinding must cross.
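For the invalid-value item in particular, the usual way to stay out of trouble is to go through a checked conversion instead of transmute. A minimal sketch follows; Level and bool_from_byte are hypothetical names invented for the illustration, while char::from_u32 and TryFrom are the standard library's.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq)]
#[repr(u8)]
enum Level {
    Low = 0,
    Mid = 1,
    High = 2,
}

impl TryFrom<u8> for Level {
    type Error = u8;

    // Every accepted byte maps to a real discriminant; everything else is
    // rejected instead of being transmuted into an invalid enum value.
    fn try_from(byte: u8) -> Result<Self, Self::Error> {
        match byte {
            0 => Ok(Level::Low),
            1 => Ok(Level::Mid),
            2 => Ok(Level::High),
            other => Err(other),
        }
    }
}

/// Only 0 and 1 are valid bit patterns for bool.
fn bool_from_byte(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

fn main() {
    println!("{:?}", Level::try_from(1u8));
    println!("{:?}", Level::try_from(7u8));
    println!("{:?}", bool_from_byte(3));
    // char::from_u32 rejects surrogates and anything above 0x10FFFF.
    println!("{:?}", char::from_u32(0x11_0000));
}
```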
use std::mem::MaybeUninit;
// Instant UB — an uninitialised bool "exists" the moment this returns:
let b: bool = unsafe { MaybeUninit::uninit().assume_init() };
// The old std::mem::uninitialized() did exactly this and is deprecated.
// Correct staged initialisation:
let mut slot: MaybeUninit<[u64; 64]> = MaybeUninit::uninit();
let p = slot.as_mut_ptr() as *mut u64;
for i in 0..64 { unsafe { p.add(i).write(i as u64) } }
let arr: [u64; 64] = unsafe { slot.assume_init() }; // now actually init
4 · The aliasing models: Stacked Borrows → Tree Borrows
Safe Rust's references promise non-aliasing, and rustc tells LLVM so
(noalias on &mut), unlocking optimisations C needs
restrict for. But what exactly may raw pointers mixed with
references do before the promise breaks? The language spec doesn't fully say yet;
the working answers are Ralf Jung's research models, implemented in Miri.
Stacked Borrows (2018) gives each allocation a stack of permissions.
Creating &mut from a raw pointer pushes a tag; using an older tag pops
everything above it — so "use parent pointer, then use child reference again" is the
signature violation. Tree Borrows (2023, Villani & Jung) replaces
the stack with a tree and state machine per node; it's more permissive where Stacked
Borrows rejected reasonable patterns (two-phase-borrow-like code, some
as_mut_ptr idioms) while keeping the optimisations sound.
fn main() {
    let mut x = 0u32;
    let raw = &mut x as *mut u32;   // parent: raw pointer
    let r = unsafe { &mut *raw };   // child: reference derived from it
    unsafe { *raw = 1; }            // write through the PARENT...
    *r = 2;                         // ...then use the child again: UB
}
// $ cargo +nightly miri run
// error: Undefined Behavior: attempting a write access using <tag> ...
// but that tag does not exist in the borrow stack for this location
// (Under Tree Borrows: the child's permission was invalidated by the
// foreign write. Same verdict here, friendlier model elsewhere.)
The practical rules that keep you inside both models: derive all pointers you'll use
concurrently from the same source; don't keep using a reference after writing
through an ancestor pointer; never materialise a &/&mut
you don't strictly need (&raw exists for exactly this); and keep raw
pointers raw through the hot section, converting to references only at the edges.
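A sketch of what following these rules can look like in practice. Both functions are hypothetical illustrations: swap_ends derives every pointer from a single as_mut_ptr call, and write_twice reorders the parent/child pattern so that the child reference is dead before the parent pointer is used again. The &raw syntax needs Rust 1.82 or later.

```rust
/// Swaps the first and last element using raw pointers only.
fn swap_ends(v: &mut [i32]) {
    let len = v.len(); // read the length before taking the pointer
    if len < 2 {
        return;
    }
    let base = v.as_mut_ptr(); // the single source for every pointer below
    // SAFETY: 0 and len - 1 are in bounds and name distinct elements; both
    // pointers come from `base`, and no reference is created in between.
    unsafe {
        let first = base;
        let last = base.add(len - 1);
        std::ptr::swap(first, last);
    }
}

/// Parent pointer and child reference, used in an order both models accept.
fn write_twice() -> u32 {
    let mut x = 0u32;
    let raw = &raw mut x; // parent: raw pointer, no intermediate reference
    // SAFETY: `raw` points at the live local `x` for the whole block.
    unsafe {
        *raw = 1; // write through the parent first
        let r = &mut *raw; // child created only after that write
        *r = 2; // last use of the child
        *raw += 1; // the parent is used again once the child is dead
    }
    x
}

fn main() {
    let mut v = [1, 2, 3, 4];
    swap_ends(&mut v);
    println!("{:?} {}", v, write_twice());
}
```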
5 · Where std itself relies on unsafe
The standard library is the proof that the model works: a safe interface over a core
that's unsafe for performance or expressiveness reasons, with the invariants guarded by
module privacy. The classic specimen is split_at_mut — safe, sound, and
impossible to write in safe Rust because the borrow checker can't see that two halves of
one slice don't overlap:
pub fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
    assert!(mid <= self.len());     // the safety condition, checked
    let len = self.len();
    let ptr = self.as_mut_ptr();
    // SAFETY: [0, mid) and [mid, len) are disjoint, so the two &mut
    // never alias: a fact about arithmetic, not about types.
    unsafe {
        (
            from_raw_parts_mut(ptr, mid),
            from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
Same pattern throughout: Vec manages raw capacity and uses
set_len only after elements are written; String skips UTF-8
validation internally where it's already proven; channels and Arc do their
atomics dance. The crucial point about the soundness boundary: it is the module,
not the block. Vec's unsafe code is correct only because len
and cap are private — any safe code inside the module that could
set len wrong silently makes the unsafe blocks unsound. When you audit a
crate, you audit everything that can reach the invariants, not just the lines marked
unsafe.
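The module-as-boundary point can be seen in miniature with a fixed-capacity stack. This is a hypothetical sketch (small_stack and SmallStack are invented for the illustration, not std types): the single unsafe block in pop is only correct because len is private and every safe method in the module maintains the invariant.

```rust
mod small_stack {
    use std::mem::MaybeUninit;

    /// Invariant: `buf[..len]` is initialised and `len <= 4`.
    pub struct SmallStack {
        buf: [MaybeUninit<u32>; 4],
        len: usize, // private: code outside this module cannot set it
    }

    impl SmallStack {
        pub fn new() -> Self {
            SmallStack { buf: [MaybeUninit::uninit(); 4], len: 0 }
        }

        /// Entirely safe code, yet part of the soundness argument:
        /// a bug here would make `pop` read uninitialised memory.
        pub fn push(&mut self, value: u32) -> bool {
            if self.len == 4 {
                return false;
            }
            self.buf[self.len].write(value);
            self.len += 1;
            true
        }

        pub fn pop(&mut self) -> Option<u32> {
            if self.len == 0 {
                return None;
            }
            self.len -= 1;
            // SAFETY: by the invariant, this slot was written by `push`.
            Some(unsafe { self.buf[self.len].assume_init_read() })
        }
    }
}

fn main() {
    let mut s = small_stack::SmallStack::new();
    s.push(10);
    s.push(20);
    println!("{:?}", s.pop());
}
```

If len were pub, any caller could set it to 4 on an empty stack and pop would return uninitialised data, without a single new unsafe keyword appearing anywhere.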
6 · FFI — the unavoidable unsafe
Every C call is unsafe by definition: the compiler can't see across the boundary. The load-bearing pieces:
use std::ffi::{c_char, c_int, CStr, CString};

#[repr(C)]                       // layout contract; see the layout page
pub struct Config { pub verbosity: c_int, pub name: *const c_char }

// `unsafe extern` (1.82+) is mandatory in the 2024 edition.
unsafe extern "C" {              // names + signatures, trusted blindly
    fn lib_init(cfg: *const Config) -> c_int;
    fn lib_last_error() -> *const c_char;
}

pub fn init(name: &str, verbosity: i32) -> Result<(), String> {
    let cname = CString::new(name).map_err(|e| e.to_string())?;
    let cfg = Config { verbosity, name: cname.as_ptr() };
    // SAFETY: cfg outlives the call; cname outlives cfg.name's use;
    // lib_init documents no other preconditions.
    let rc = unsafe { lib_init(&cfg) };
    if rc == 0 { return Ok(()); }
    // SAFETY: lib_last_error returns a valid NUL-terminated string
    // owned by the library (per its docs).
    Err(unsafe { CStr::from_ptr(lib_last_error()) }.to_string_lossy().into_owned())
}
- A wrong extern signature is silent UB: nothing cross-checks it against the C header. bindgen generates them from the header for this reason.
- Keep ownership unambiguous per pointer: who allocates, who frees, with which allocator. Freeing a Rust Box with C's free() is UB.
- Don't let panics cross a plain "C" boundary: catch with std::panic::catch_unwind in callbacks you hand to C, or use extern "C-unwind" deliberately.
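A self-contained sketch of two of these points, using the platform C library instead of a made-up one. It assumes strlen is linked in (true on the usual hosted targets) and that its signature matches the declaration written here. c_len and on_event are hypothetical names, and on_event stands in for a callback a C library would invoke.

```rust
use std::ffi::{c_char, c_int, CString};
use std::panic::catch_unwind;

unsafe extern "C" {
    // Assumed to match `size_t strlen(const char *)` from <string.h>.
    fn strlen(s: *const c_char) -> usize;
}

/// Length of `s` as C sees it, or None if `s` contains an interior NUL.
pub fn c_len(s: &str) -> Option<usize> {
    let c = CString::new(s).ok()?;
    // SAFETY: `c` is a valid NUL-terminated string that outlives the call,
    // and strlen only reads up to the terminator.
    Some(unsafe { strlen(c.as_ptr()) })
}

/// Callback shape for handing to C: a panic is caught and turned into an
/// error code instead of reaching the "C" boundary.
pub extern "C" fn on_event(code: c_int) -> c_int {
    let result = catch_unwind(|| {
        if code < 0 {
            panic!("negative event code");
        }
        code * 2
    });
    result.unwrap_or(-1)
}

fn main() {
    println!("{:?}", c_len("hello"));
    println!("{}", on_event(21));
}
```

Mapping the panic to -1 is an arbitrary choice for the sketch; a real binding would follow whatever error convention the C API documents.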
7 · Miri — the UB test harness
Miri is an interpreter for MIR that executes your tests while checking every
memory access against the rules above: bounds, initialisation, alignment, validity
invariants, leaks, data races, and the aliasing model (Stacked Borrows by default,
-Zmiri-tree-borrows for the newer one). It's the closest thing Rust has to a
UB oracle, and it has found real bugs in std itself.
$ rustup +nightly component add miri
$ cargo +nightly miri test # run the test suite interpreted
$ cargo +nightly miri run # or a binary
# Typical catch:
error: Undefined Behavior: out-of-bounds pointer arithmetic:
expected a pointer to 4 bytes of memory, but got alloc1234
which is only 2 bytes from the end of the allocation
--> src/lib.rs:31:18
= help: this indicates a bug in the program
# Limits to know: ~50-100x slower than native; no real FFI (extern
# calls to C are unsupported unless shimmed); only explores the
# schedules/paths your tests actually take. Complement, not proof.
The working discipline for any crate with unsafe in it: #![forbid(unsafe_code)]
in crates that don't need it at all; SAFETY comments and # Safety docs where
it is needed; Miri in CI; and cargo-geiger or cargo vet when
you want to know how much of your dependency tree you're trusting.
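As a rough idea of what "Miri in CI" exercises, here is a small unsafe helper with a test written so that it stays affordable under interpretation. first_and_last, ROUNDS, and exercise are hypothetical names; cfg(miri) is the configuration flag Miri sets while it runs.

```rust
/// Returns the first and last byte of a non-empty slice without bounds checks.
fn first_and_last(v: &[u8]) -> Option<(u8, u8)> {
    if v.is_empty() {
        return None;
    }
    // SAFETY: the slice is non-empty, so 0 and len - 1 are both in bounds.
    Some(unsafe { (*v.get_unchecked(0), *v.get_unchecked(v.len() - 1)) })
}

/// Fewer rounds under Miri, which is far slower than native execution.
const ROUNDS: usize = if cfg!(miri) { 16 } else { 512 };

/// Checks the helper against slices of every length from 1 to ROUNDS.
fn exercise() -> usize {
    let mut checked = 0;
    for len in 1..=ROUNDS {
        let data: Vec<u8> = (0..len).map(|i| i as u8).collect();
        let (first, last) = first_and_last(&data).expect("slice is non-empty");
        assert_eq!(first, 0);
        assert_eq!(last, (len - 1) as u8);
        checked += 1;
    }
    checked
}

// Runs natively under `cargo test` and interpreted under `cargo miri test`.
#[test]
fn first_and_last_stays_in_bounds() {
    assert_eq!(exercise(), ROUNDS);
}

fn main() {
    println!("{}", exercise());
}
```

If the emptiness check were dropped, the len - 1 index would wrap around on an empty slice, and Miri would report the out-of-bounds access in the test that reaches it.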
References
- The Rustonomicon — the book on unsafe Rust; required reading before writing any.
- The Reference: behavior considered undefined — the normative UB list.
- Ralf Jung — Stacked Borrows and Tree Borrows — the aliasing models.
- rust-lang/miri — the interpreter, flags, and what it can/can't detect.
- Unsafe Code Guidelines — the WG hashing out what's officially promised.
- std::ptr — provenance, validity, and the raw-pointer API contracts.