
Unsafe, raw pointers, UB

unsafe is not "turn the checks off". It unlocks exactly five operations, leaves every other rule running, and transfers one obligation from the compiler to you: uphold the invariants it can no longer verify. Safe Rust exists because someone wrote this layer correctly — Vec, Mutex, every FFI binding. This page covers what the keyword really permits, the catalogue of undefined behaviour, the aliasing models (Stacked and Tree Borrows) that define what pointers may do, and Miri, the interpreter that catches you when you're wrong.


1 · The five superpowers — and nothing else

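The unsafe keyword unlocks exactly five things that safe Rust forbids. Four of them happen inside an unsafe block; implementing an unsafe trait is written as unsafe impl instead: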

  • Dereference a raw pointer (*const T / *mut T).
  • Call an unsafe fn — including every extern FFI function.
  • Access or modify a mutable static.
  • Implement an unsafe trait (Send, Sync, GlobalAlloc…).
  • Read a field of a union.

Everything else still applies. The borrow checker still runs on every reference; moves are still tracked; types still check. The keyword's real meaning is a contract marker: unsafe fn says "I have preconditions the type system can't express, so the caller must read the docs"; an unsafe {} block says "I checked them". Since the 2024 edition the two are separated by default: the unsafe_op_in_unsafe_fn lint warns there, so the body of an unsafe fn no longer acts as one big implicit block and each unsafe operation inside is expected to sit in its own unsafe {}.

rust src/lib.rs · the discipline in miniature
/// Returns the element without bounds checking.
///
/// # Safety
/// `i` must be < `self.len()`.
pub unsafe fn get_unchecked(&self, i: usize) -> &T {
    // SAFETY: caller promises i < len, so the pointer stays in-bounds
    // of one allocation and points at an initialised T.
    unsafe { &*self.ptr.add(i) }
}
// Convention: every unsafe fn documents "# Safety"; every unsafe block
// carries a "// SAFETY:" comment discharging those obligations.
// clippy::undocumented_unsafe_blocks enforces the latter.
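
The listing above exercises two of the five operations (calling an unsafe fn and dereferencing a raw pointer). As a rough sketch of two more, the following reads a union field and touches a mutable static. The names Bits, CALLS, float_bits and bump are invented for illustration, and the single-threaded precondition on bump is an assumption of this example rather than anything the language enforces.

```rust
// Hypothetical names throughout; only the language features are real.
union Bits {
    float: f32,
    int: u32,
}

static mut CALLS: u32 = 0;

/// Reinterprets an f32 as its raw bit pattern via a union read.
fn float_bits(x: f32) -> u32 {
    let b = Bits { float: x }; // initialising a union field is safe
    // SAFETY: every 32-bit pattern is a valid u32, so reading `int`
    // after writing `float` cannot produce an invalid value.
    unsafe { b.int }
}

/// Increments a global counter and returns the new value.
///
/// # Safety
/// Must not be called while any other thread can access `CALLS`.
unsafe fn bump() -> u32 {
    // SAFETY: the caller guarantees exclusive access to CALLS.
    unsafe {
        let p = &raw mut CALLS; // raw pointer, no reference to the static
        *p += 1;
        *p
    }
}

fn main() {
    println!("{:#x}", float_bits(1.0));
    // SAFETY: main is the only thread in this program.
    println!("{}", unsafe { bump() });
}
```

Note that the union write and the creation of the raw pointer need no unsafe at all; only the read of the union field and the dereference do.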

2 · Raw pointers — what they are and aren't

*const T and *mut T are addresses with a type attached, and that's all: no lifetime, no aliasing claim, no non-null or alignment guarantee, allowed to dangle. Creating one is safe; only the dereference is gated. The vocabulary around them:

rust src/main.rs · the raw pointer toolkit
fn main() {
    let mut x = 42u64;

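    let q: *mut u64 = &raw mut x;      // &raw (1.82+): no intermediate
                                       // reference is ever created,
                                       // important for packed fields
    let p: *const u64 = q;             // derived from q, so both share
                                       // provenance; a pointer coerced
                                       // from `&x` would be invalidated
                                       // by the write through q below
    unsafe {
        *q += 1;                        // deref: unsafe
        println!("{}", *p);             // 43
        println!("{}", p.read());       // ptr::read: bitwise copy out
    }
    // References also coerce to raw pointers: `let r: *const u64 = &x;`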

    // Arithmetic is in units of T, and must stay within one allocation:
    let arr = [1i32, 2, 3];
    let base = arr.as_ptr();
    unsafe { println!("{}", *base.add(2)); }  // 3

    // NonNull<T>: a *mut T that's never null, covariant, and gives
    // Option<NonNull<T>> the niche — what Vec/Box/Rc use internally.
}

Two non-obvious rules. Pointer arithmetic (add/offset) is itself UB if it leaves the allocation it started in (one-past-the-end is the only exception) — the operation, not the later deref. And round-tripping pointers through integers erases provenance information the optimiser tracks; the strict-provenance APIs (addr(), with_addr(), stabilised in 1.84) exist so code can be explicit about it.
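
To make the provenance point concrete, here is a minimal sketch of pointer tagging with the strict-provenance methods. It assumes a toolchain of 1.84 or later and relies on u64 having an alignment of at least 2, so bit 0 of a valid address is free; tag, untag and roundtrip are hypothetical helper names.

```rust
/// Packs a one-bit flag into the low bit of a pointer to u64.
/// Assumes the pointer is at least 2-byte aligned (true for u64).
fn tag(p: *mut u64, flag: bool) -> *mut u64 {
    // map_addr changes the address but keeps p's provenance.
    p.map_addr(|a| a | usize::from(flag))
}

/// Splits a tagged pointer back into the clean pointer and the flag.
fn untag(p: *mut u64) -> (*mut u64, bool) {
    (p.map_addr(|a| a & !1), p.addr() & 1 == 1)
}

/// Tags a pointer to a local, untags it, and reads the value back.
fn roundtrip(value: u64, flag: bool) -> (u64, bool) {
    let mut x = value;
    let tagged = tag(&raw mut x, flag);
    let (clean, got) = untag(tagged);
    // SAFETY: `clean` has the same address and provenance as
    // `&raw mut x`, and x is live, aligned, and initialised.
    (unsafe { clean.read() }, got)
}

fn main() {
    println!("{:?}", roundtrip(7, true));
}
```

The same bit-twiddling written with `as usize` and `as *mut u64` casts would lose the link between the final pointer and the allocation it came from, which is exactly what these methods avoid.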

3 · The UB catalogue

Undefined behaviour is not "crashes" — it's "the optimiser may assume this never happens", which means miscompilation that can surface anywhere, later, or only at -O. The Reference's list of things unsafe code must never do, abridged to what bites in practice:

  • Data races — unsynchronised conflicting access from two threads.
  • Dereferencing dangling or misaligned pointers, or out-of-bounds arithmetic.
  • Breaking the aliasing rules — e.g. a &mut that aliases anything else live, or writing through a path derived from &T (outside UnsafeCell).
  • Producing an invalid value, even transiently: a bool that's 3, a char above 0x10FFFF, an enum with no matching discriminant, a null/dangling reference or Box, an uninitialised integer. "Producing" includes a mere mem::transmute or an over-eager assume_init.
  • Unwinding across an ABI that doesn't support it, for example a foreign exception propagating through an extern "C" frame. A Rust panic escaping an extern "C" fn aborts the process since 1.81 instead of unwinding; use extern "C-unwind" (stable since 1.71) when unwinding must cross.

There is no "it worked when I ran it". UB is a property of the program, not of an execution. The canonical career-saving habit: treat "passes tests" as weak evidence for unsafe code, and "passes the same tests under Miri" as the minimum bar.

rust src/main.rs · UB without a single pointer
use std::mem::MaybeUninit;

// Instant UB — an uninitialised bool "exists" the moment this returns:
let b: bool = unsafe { MaybeUninit::uninit().assume_init() };

// The old std::mem::uninitialized() did exactly this and is deprecated.
// Correct staged initialisation:
let mut slot: MaybeUninit<[u64; 64]> = MaybeUninit::uninit();
let p = slot.as_mut_ptr() as *mut u64;
for i in 0..64 { unsafe { p.add(i).write(i as u64) } }
let arr: [u64; 64] = unsafe { slot.assume_init() };  // now actually init
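
The invalid-value rule is usually avoided by validating before constructing rather than transmuting and hoping. A small sketch, with bool_from_byte and char_from_scalar as illustrative helper names (char::from_u32 is the real standard-library check):

```rust
/// Converts a byte to a bool only if it holds one of the two valid
/// bit patterns; a transmute of 3 would be instant UB instead.
fn bool_from_byte(b: u8) -> Option<bool> {
    match b {
        0 => Some(false),
        1 => Some(true),
        _ => None,
    }
}

/// Converts a u32 to a char only if it is a Unicode scalar value
/// (rejects surrogates and anything above 0x10FFFF).
fn char_from_scalar(n: u32) -> Option<char> {
    char::from_u32(n)
}

fn main() {
    println!("{:?}", bool_from_byte(3));           // None
    println!("{:?}", char_from_scalar(0x41));      // Some('A')
    println!("{:?}", char_from_scalar(0x11_0000)); // None
}
```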

4 · The aliasing models: Stacked Borrows → Tree Borrows

Safe Rust's references promise non-aliasing, and rustc tells LLVM so (noalias on &mut), unlocking optimisations C needs restrict for. But what exactly may raw pointers mixed with references do before the promise breaks? The language spec doesn't fully say yet; the working answers are Ralf Jung's research models, implemented in Miri.

Stacked Borrows (2018) gives each allocation a stack of permissions. Creating &mut from a raw pointer pushes a tag; using an older tag pops everything above it — so "use parent pointer, then use child reference again" is the signature violation. Tree Borrows (2023, Villani & Jung) replaces the stack with a tree and state machine per node; it's more permissive where Stacked Borrows rejected reasonable patterns (two-phase-borrow-like code, some as_mut_ptr idioms) while keeping the optimisations sound.

rust src/main.rs · the violation Miri flags
fn main() {
    let mut x = 0u32;
    let raw = &mut x as *mut u32;       // parent: raw pointer
    let r = unsafe { &mut *raw };       // child: reference derived from it
    unsafe { *raw = 1; }                // write through the PARENT...
    *r = 2;                             // ...then use the child again: UB
}
// $ cargo +nightly miri run
// error: Undefined Behavior: attempting a write access using <tag> ...
//        but that tag does not exist in the borrow stack for this location
// (Under Tree Borrows: the child's permission was invalidated by the
//  foreign write. Same verdict here, friendlier model elsewhere.)

The practical rules that keep you inside both models: derive all pointers whose uses you will interleave from the same source; don't keep using a reference after writing through an ancestor pointer; never materialise a &/&mut you don't strictly need (&raw exists for exactly this); and keep raw pointers raw through the hot section, converting to references only at the edges.
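
A minimal sketch of the first and last rules, assuming a hypothetical helper bump_twice: the incoming &mut is turned into a raw pointer once, every later access goes through copies of that one pointer, and no new reference is created in between.

```rust
/// Increments `*x` through two raw aliases and returns the result.
fn bump_twice(x: &mut u32) -> u32 {
    let raw: *mut u32 = x; // the single source for all later accesses
    let alias = raw;       // a copy carries the same provenance
    // SAFETY: both pointers derive from the same live &mut u32, and
    // the reference `x` itself is not used again while they are in use.
    unsafe {
        *raw += 1;
        *alias += 1;
        *raw
    }
}

fn main() {
    let mut v = 40;
    println!("{}", bump_twice(&mut v)); // 42
}
```

Interleaving raw and alias is fine here because neither is a child of the other; the violation in the listing above comes from mixing a parent pointer with a reference derived from it.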

5 · Where std itself relies on unsafe

The standard library is the proof that the model works: a safe interface over a core that's unsafe for performance or expressiveness reasons, with the invariants guarded by module privacy. The classic specimen is split_at_mut — safe, sound, and impossible to write in safe Rust because the borrow checker can't see that two halves of one slice don't overlap:

rust core/src/slice · two &mut into one slice, soundly
pub fn split_at_mut(&mut self, mid: usize) -> (&mut [T], &mut [T]) {
    assert!(mid <= self.len());          // the safety condition, checked
    let len = self.len();
    let ptr = self.as_mut_ptr();
    // SAFETY: [0, mid) and [mid, len) are disjoint, so the two &mut
    // never alias: a fact about arithmetic, not about types.
    unsafe {
        (
            from_raw_parts_mut(ptr, mid),
            from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

Same pattern throughout: Vec manages raw capacity and uses set_len only after elements are written; String skips UTF-8 validation internally where it's already proven; channels and Arc do their atomics dance. The crucial point about the soundness boundary: it is the module, not the block. Vec's unsafe code is correct only because len and cap are private — any safe code inside the module that could set len wrong silently makes the unsafe blocks unsound. When you audit a crate, you audit everything that can reach the invariants, not just the lines marked unsafe.
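
The module-as-boundary point can be shown in miniature. The following is a sketch, not std's code: FixedStack is a hypothetical four-slot stack whose single unsafe block is sound only because len is private and every method in the module preserves the stated invariant.

```rust
mod fixed {
    use std::mem::MaybeUninit;

    /// Invariant: `len <= 4` and `buf[..len]` is initialised.
    pub struct FixedStack {
        buf: [MaybeUninit<u32>; 4],
        len: usize, // private: code outside this module cannot break the invariant
    }

    impl FixedStack {
        pub fn new() -> Self {
            Self { buf: [MaybeUninit::uninit(); 4], len: 0 }
        }

        /// Pushes a value, or hands it back if the stack is full.
        pub fn push(&mut self, v: u32) -> Result<(), u32> {
            if self.len == 4 {
                return Err(v);
            }
            self.buf[self.len].write(v); // safe: writing never reads old contents
            self.len += 1;
            Ok(())
        }

        /// Pops the most recently pushed value, if any.
        pub fn pop(&mut self) -> Option<u32> {
            if self.len == 0 {
                return None;
            }
            self.len -= 1;
            // SAFETY: by the invariant, buf[len] was written by push.
            Some(unsafe { self.buf[self.len].assume_init() })
        }
    }
}

fn main() {
    let mut s = fixed::FixedStack::new();
    s.push(1).unwrap();
    s.push(2).unwrap();
    println!("{:?} {:?} {:?}", s.pop(), s.pop(), s.pop());
}
```

A safe method inside the module that did `self.len += 1` without writing a value would contain no unsafe at all, yet it would make pop unsound; that is the sense in which the audit unit is the module.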

6 · FFI — the unavoidable unsafe

Every C call is unsafe by definition: the compiler can't see across the boundary. The load-bearing pieces:

rust src/ffi.rs · a binding, done by the rules
use std::ffi::{c_char, c_int, CStr, CString};

#[repr(C)]                       // layout contract — see the layout page
pub struct Config { pub verbosity: c_int, pub name: *const c_char }

// `unsafe extern` is required in the 2024 edition (accepted since 1.82).
unsafe extern "C" {              // names + signatures, trusted blindly
    fn lib_init(cfg: *const Config) -> c_int;
    fn lib_last_error() -> *const c_char;
}

pub fn init(name: &str, verbosity: i32) -> Result<(), String> {
    let cname = CString::new(name).map_err(|e| e.to_string())?;
    let cfg = Config { verbosity, name: cname.as_ptr() };
    // SAFETY: cfg outlives the call; cname outlives cfg.name's use;
    // lib_init documents no other preconditions.
    let rc = unsafe { lib_init(&cfg) };
    if rc == 0 { return Ok(()); }
    // SAFETY: lib_last_error returns a valid NUL-terminated string
    // owned by the library (per its docs).
    Err(unsafe { CStr::from_ptr(lib_last_error()) }.to_string_lossy().into_owned())
}

  • A wrong extern signature is silent UB: nothing cross-checks it against the C header. bindgen generates them from the header for this reason.
  • Keep ownership unambiguous per pointer: who allocates, who frees, with which allocator. Freeing a Rust Box with C's free() is UB.
  • Don't let panics reach a plain "C" boundary. Since 1.81 a panic escaping an extern "C" fn aborts the process, so catch it with std::panic::catch_unwind in callbacks you hand to C, or use extern "C-unwind" deliberately.
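
As an illustration of the last bullet, here is a sketch of a callback that turns a panic into an error code before it can reach the C caller. The names risky and on_event and the -1 error convention are assumptions of this example, not part of any real library's interface.

```rust
use std::ffi::c_int;
use std::panic::catch_unwind;

/// Hypothetical Rust logic that may panic.
fn risky(x: c_int) -> c_int {
    if x < 0 {
        panic!("negative input");
    }
    x * 2
}

/// Callback with the C ABI: never lets a panic escape.
/// Returns -1 on panic (an error convention assumed for this sketch).
pub extern "C" fn on_event(x: c_int) -> c_int {
    match catch_unwind(move || risky(x)) {
        Ok(v) => v,
        Err(_) => -1, // panic payload dropped; report failure as a code
    }
}

fn main() {
    println!("{}", on_event(21));
    println!("{}", on_event(-5));
}
```

catch_unwind only helps when the crate is built with the default unwinding panic strategy; under panic = "abort" the process terminates at the panic site regardless.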

7 · Miri — the UB test harness

Miri is an interpreter for MIR that executes your tests while checking every memory access against the rules above: bounds, initialisation, alignment, validity invariants, leaks, data races, and the aliasing model (Stacked Borrows by default; set MIRIFLAGS=-Zmiri-tree-borrows for the newer one). It's the closest thing Rust has to a UB oracle, and it has found real bugs in std itself.

shell terminal · the workflow
$ rustup +nightly component add miri
$ cargo +nightly miri test            # run the test suite interpreted
$ cargo +nightly miri run             # or a binary

# Typical catch:
error: Undefined Behavior: out-of-bounds pointer arithmetic:
       expected a pointer to 4 bytes of memory, but got alloc1234
       which is only 2 bytes from the end of the allocation
   --> src/lib.rs:31:18
   = help: this indicates a bug in the program

# Limits to know: ~50-100x slower than native; no real FFI (extern
# calls to C are unsupported unless shimmed); only explores the
# schedules/paths your tests actually take. Complement, not proof.

The working discipline for any crate with unsafe in it: #![forbid(unsafe_code)] in crates that don't need it at all; SAFETY comments and # Safety docs where it is needed; Miri in CI; and cargo-geiger or cargo vet when you want to know how much of your dependency tree you're trusting.
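
Because Miri runs far slower than native code, test suites often shrink their workloads when it is active. A minimal sketch using the cfg(miri) flag that Miri sets; iterations and sum_via_raw are hypothetical names, and the sizes are arbitrary.

```rust
/// Picks a smaller workload when running under Miri.
fn iterations() -> usize {
    if cfg!(miri) { 16 } else { 4096 }
}

/// Sums a slice through raw-pointer reads: the kind of code whose
/// bounds and initialisation Miri checks on every access.
fn sum_via_raw(v: &[u32]) -> u64 {
    let p = v.as_ptr();
    let mut total = 0u64;
    for i in 0..v.len() {
        // SAFETY: i < v.len(), so p.add(i) stays inside the slice.
        total += u64::from(unsafe { *p.add(i) });
    }
    total
}

fn main() {
    let data: Vec<u32> = (0..iterations() as u32).collect();
    println!("{}", sum_via_raw(&data));
}
```

Changing the loop bound to `0..=v.len()` would typically still pass a native test run while being reported by Miri, which is the gap the workflow above is meant to close.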
