The allocator API
Every Box::new, every Vec::push past capacity, bottoms out in one trait with two required methods. This page covers GlobalAlloc and Layout, how #[global_allocator] swaps in jemalloc or mimalloc with five lines, what happens on OOM (and the fallible alternative), the still-unstable Allocator trait behind Vec::new_in, and why embedded and high-performance code treats the allocator as a first-class design decision.
Long read · GlobalAlloc and Layout through allocator_api, arenas, and no_std · references at the end
1 · The trait at the bottom
pub unsafe trait GlobalAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8;       // required
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout);  // required
    // default: alloc, then zero the block
    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 { ... }
    // default: alloc a new block, copy, dealloc the old one
    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 { ... }
}
Note the asymmetry with C: dealloc receives the Layout back.
C's free() receives only the pointer, so it typically recovers the size from a header stored next to the allocation;
Rust callers (Vec, Box) statically know what they allocated and pass it in, so an
allocator may run headerless and route by size class with no lookup. The trait
is unsafe on both sides of the contract: implementors must return
well-aligned, exclusive memory or null; callers must pass the same layout to
dealloc that they got the pointer with.
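The caller's half of that contract can be sketched with the free functions in std::alloc, which forward to whichever global allocator is registered. The helper name raw_round_trip is hypothetical and exists only for illustration; real code would normally let Box or Vec do this.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Allocates room for one u64 by hand, writes and reads it, then frees it.
/// Hypothetical helper, for illustration only.
fn raw_round_trip(value: u64) -> u64 {
    let layout = Layout::new::<u64>();
    unsafe {
        // Caller obligation: the layout must have a non-zero size.
        let ptr = alloc(layout);
        if ptr.is_null() {
            // Null signals failure; this is the call std containers make.
            handle_alloc_error(layout);
        }
        let typed = ptr.cast::<u64>();
        typed.write(value);
        let read_back = typed.read();
        // Caller obligation: same pointer, same layout as the alloc call.
        dealloc(ptr, layout);
        read_back
    }
}

fn main() {
    println!("{}", raw_round_trip(0xC0FFEE));
}
```

Passing a different Layout to dealloc than the one used for alloc would be undefined behaviour, which is why both functions are unsafe.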
Layout is the two numbers every allocation needs — size and a power-of-two
alignment — computed for any type with Layout::new::<T>(), or built by
hand for dynamic structures (array::<T>(n),
from_size_align, and extend for header-plus-payload layouts,
which handles the padding arithmetic the
layout page describes).
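As a rough illustration of those constructors, the sketch below builds a header-plus-payload layout by hand. The function header_plus_payload and the choice of a u32 header with u64 payload slots are assumptions made for the example, not something this page prescribes.

```rust
use std::alloc::{Layout, LayoutError};

/// Layout for a hypothetical length-prefixed buffer: a u32 header followed
/// by `n` u64 payload slots. Returns the combined layout and the byte
/// offset at which the payload starts.
fn header_plus_payload(n: usize) -> Result<(Layout, usize), LayoutError> {
    let header = Layout::new::<u32>();
    let payload = Layout::array::<u64>(n)?; // Err if the size would overflow
    // extend() inserts whatever padding the payload's alignment requires
    // and reports where the payload begins.
    let (combined, offset) = header.extend(payload)?;
    // extend() adds no trailing padding; pad_to_align() rounds the size up
    // to a multiple of the alignment, as a C struct would.
    Ok((combined.pad_to_align(), offset))
}

fn main() -> Result<(), LayoutError> {
    let (layout, offset) = header_plus_payload(3)?;
    // The exact numbers depend on the target's alignment of u64.
    println!(
        "size={} align={} payload_offset={}",
        layout.size(),
        layout.align(),
        offset
    );
    // A non-power-of-two alignment is rejected rather than rounded.
    assert!(Layout::from_size_align(24, 3).is_err());
    Ok(())
}
```

Everything here is checked arithmetic: an impossible request comes back as a LayoutError before any allocator is involved.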
2 · The default, and a short history
Unless told otherwise, a Rust binary uses std::alloc::System: malloc/free on
Unix, HeapAlloc on Windows (with aligned variants where the layout demands
it). It wasn't always so — early Rust shipped jemalloc as the default for executables,
which made binaries bigger and surprised C interop; rustc 1.32 (January 2019) switched
the default to the system allocator and left jemalloc as an opt-in. rustc itself still
links jemalloc for its own use, which says something about both sides of the trade.
Why opt out of the system allocator? Long-running multi-threaded servers are the usual case: allocators differ in per-thread caching, fragmentation behaviour over days of uptime, and contention under parallel load. jemalloc spreads threads across multiple arenas (sized from the CPU count by default) and offers best-in-class introspection/profiling; mimalloc is small and consistently fast; glibc malloc is fine until it isn't (its arena behaviour under many threads is a known RSS amplifier). Throughput gains of 5–20% from swapping the allocator on allocation-heavy services are commonly reported, though the figure depends on the workload and has to be measured. That is a lot of win for five lines:
// Cargo.toml: tikv-jemallocator = "0.6" (or mimalloc = "0.1")
use tikv_jemallocator::Jemalloc;
#[global_allocator]
static GLOBAL: Jemalloc = Jemalloc;
fn main() {
    // every Box, Vec, String, HashMap in the whole program, including
    // all dependencies, now allocates through jemalloc.
}
One binary, one global allocator: the attribute may appear once in the crate graph, the
pick happens at link time, and there is no per-call dispatch cost — calls compile to
direct calls into the chosen implementation via the __rust_alloc symbols.
3 · Writing one — the counting wrapper
Implementing GlobalAlloc is rarely about writing malloc from scratch;
the production-grade pattern is the wrapper — instrument the real allocator:
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};
pub struct Meter;
pub static LIVE: AtomicUsize = AtomicUsize::new(0);
pub static PEAK: AtomicUsize = AtomicUsize::new(0);
unsafe impl GlobalAlloc for Meter {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let p = unsafe { System.alloc(layout) };
        if !p.is_null() {
            let now = LIVE.fetch_add(layout.size(), Relaxed) + layout.size();
            PEAK.fetch_max(now, Relaxed);
        }
        p
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        LIVE.fetch_sub(layout.size(), Relaxed);
        unsafe { System.dealloc(ptr, layout) }
    }
}
#[global_allocator]
static A: Meter = Meter;
// Same shape powers: per-request allocation budgets, alloc-count
// assertions in benchmarks ("this hot path allocates zero times"),
// and leak hunting without external tooling.
One rule inside the allocator itself: nothing that allocates (no println!, no format!, no panicking paths that build
messages), because that recurses straight back into alloc and overflows the stack.
Atomics and raw syscalls only.
4 · OOM: abort by default, fallible by request
When alloc returns null, the std containers call
handle_alloc_error(layout), which aborts the process — not
a panic, no unwinding, no destructors. The reasoning: by the time allocation fails,
running recovery code (which itself allocates) rarely goes well, and on
overcommitting Linux you often get the OOM killer before you ever see a null. For the
cases that genuinely must survive allocation failure — databases honouring memory
budgets, kernels, anything on a small device — std grew fallible entry points:
use std::collections::TryReserveError;
fn load(n: usize) -> Result<Vec<u64>, TryReserveError> {
    let mut v: Vec<u64> = Vec::new();
    v.try_reserve_exact(n)?; // Err(..) instead of abort
    v.extend(0..n as u64);   // fits in the capacity reserved above
    Ok(v)
}
fn main() {
    match load(usize::MAX / 16) {
        Ok(v) => println!("loaded {}", v.len()),
        Err(e) => println!("backpressure instead of death: {e}"),
    }
}
This is the honest state of fallible allocation on stable: try_reserve /
try_reserve_exact on Vec, String,
HashMap and friends, plus Box::try_new behind the unstable
allocator_api feature. Linux-kernel Rust, which forbids infallible allocation outright,
builds on its own variants of the alloc crate for exactly this reason.
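The same pattern carries over to the other collections. A minimal sketch, assuming two hypothetical helpers (try_append and try_index) that reserve fallibly before doing any work that could otherwise abort:

```rust
use std::collections::{HashMap, TryReserveError};

/// Appends `extra` to `buf`, reporting allocation failure instead of aborting.
fn try_append(buf: &mut String, extra: &str) -> Result<(), TryReserveError> {
    buf.try_reserve(extra.len())?; // the only step that may allocate
    buf.push_str(extra);           // fits in the capacity reserved above
    Ok(())
}

/// Pre-sizes an index from an untrusted element count (hypothetical input).
fn try_index(untrusted_count: usize) -> Result<HashMap<u32, u32>, TryReserveError> {
    let mut map = HashMap::new();
    map.try_reserve(untrusted_count)?;
    Ok(map)
}

fn main() {
    let mut s = String::from("alloc");
    try_append(&mut s, "ator").expect("a small append should succeed");
    println!("{s}");

    // An absurd request comes back as Err rather than killing the process.
    match try_index(usize::MAX) {
        Ok(_) => println!("reserved"),
        Err(e) => println!("refused: {e}"),
    }
}
```

The design choice is to put the fallible call at the boundary where the size is decided, so the code after it can use the ordinary infallible methods without growing the buffer.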
5 · allocator_api — per-container allocators, still unstable
GlobalAlloc is one allocator per process. The richer design — pass an
allocator per container — has lived on nightly for years as the
allocator_api feature (tracking issue #32838):
pub unsafe trait Allocator {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError>;
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);
    // + grow / shrink / by-ref combinators, all fallible by design
}
// The collections gained a defaulted allocator parameter:
// pub struct Vec<T, A: Allocator = Global> { ... }
#![feature(allocator_api)]
use std::alloc::Global;
let v: Vec<u8, Global> = Vec::new_in(Global);
let b = Box::new_in(42u64, Global);
// ...and with an arena allocator A, Vec::new_in(arena) puts the
// elements in the arena, freed all at once when the arena drops.
Differences from GlobalAlloc worth noticing: allocate returns
Result<NonNull<[u8]>, AllocError> — fallibility and the actual
(possibly larger) size are in the signature, not bolted on; and allocators are passed by
value/reference as ordinary generic parameters, so a Vec<T, &Bump>
borrows its arena. Why it's still unstable after a decade: the type parameter infects
every API that touches collections, and questions like "what does
Box<T, A>::into_raw mean across allocators" and how this interacts
with dyn and async traits keep reopening. On stable, the
allocator-api2 crate mirrors the trait (used by hashbrown), and
arena crates ship their own handles:
// bumpalo: bump-pointer arena. Allocation = pointer increment + bounds
// check. No per-object free; everything dies with the Bump.
use bumpalo::Bump;
use bumpalo::collections::Vec as BumpVec;
let arena = Bump::new();
let mut spans = BumpVec::new_in(&arena);
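for i in 0..1000u32 {
    // alloc_str copies the text into the arena. Bump never runs Drop, so
    // storing an owned String there would leak its heap buffer.
    spans.push(&*arena.alloc_str(&format!("span-{i}")));
}
drop(spans); // the Vec borrows the arena, so it has to go first
drop(arena); // every span is released together with the arena
// Compilers, parsers, request handlers with per-request arenas:
// this is the pattern. rustc's own type interner works this way.
6 · no_std and embedded: bringing your own heap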
On bare metal there is no system allocator, but the machinery above is exactly how you
get one. core never allocates; the alloc crate (Box, Vec,
String, BTreeMap…) works anywhere you provide a #[global_allocator]:
#![no_std]
#![no_main]
extern crate alloc;
use embedded_alloc::LlffHeap as Heap; // linked-list first-fit
// A real binary also needs a #[panic_handler], e.g. from a crate such as panic-halt.
#[global_allocator]
static HEAP: Heap = Heap::empty();
#[cortex_m_rt::entry]
fn main() -> ! {
    // Carve the heap out of RAM once, at boot:
    use core::mem::MaybeUninit;
    const SIZE: usize = 16 * 1024;
    static mut MEM: [MaybeUninit<u8>; SIZE] = [MaybeUninit::uninit(); SIZE];
    unsafe { HEAP.init(&raw mut MEM as usize, SIZE) }
    let mut log: alloc::vec::Vec<u32> = alloc::vec::Vec::new();
    log.push(0xC0FFEE);
    loop {}
}
The embedded discipline that follows: allocate at startup, then stop; steady-state
allocation on a 16 KiB heap means fragmentation roulette. Many firmware codebases go
further and stay heapless (heapless::Vec<T, N>, fixed pools), using
the allocator only during init. The same instincts — bound it, front-load it, measure
it — are what the jemalloc-on-a-server crowd applies at 10,000x the scale.
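To make "bringing your own heap" concrete, here is a sketch of about the simplest allocator that satisfies GlobalAlloc: a bump allocator over a fixed buffer. BumpHeap is a hypothetical type written for this page, not the embedded_alloc implementation. It uses std paths so it runs on a host, and it is called directly here instead of being registered with #[global_allocator]; on a real target the core:: equivalents would be used.

```rust
use std::alloc::{GlobalAlloc, Layout};
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical fixed-size bump allocator, for illustration only.
pub struct BumpHeap<const N: usize> {
    mem: UnsafeCell<[u8; N]>,
    next: AtomicUsize, // offset of the first free byte in `mem`
}

// Safety: the buffer is only handed out in disjoint ranges, each claimed
// through a compare-exchange on `next`.
unsafe impl<const N: usize> Sync for BumpHeap<N> {}

impl<const N: usize> BumpHeap<N> {
    pub const fn new() -> Self {
        Self { mem: UnsafeCell::new([0; N]), next: AtomicUsize::new(0) }
    }

    pub fn used(&self) -> usize {
        self.next.load(Ordering::Relaxed)
    }
}

unsafe impl<const N: usize> GlobalAlloc for BumpHeap<N> {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let base = self.mem.get() as *mut u8;
        let base_addr = base as usize;
        let mask = layout.align() - 1; // align is always a power of two
        let mut cur = self.next.load(Ordering::Relaxed);
        loop {
            // Round the absolute address up to the requested alignment.
            let start_addr = match (base_addr + cur).checked_add(mask) {
                Some(a) => a & !mask,
                None => return std::ptr::null_mut(),
            };
            let start = start_addr - base_addr;
            let end = match start.checked_add(layout.size()) {
                Some(e) if e <= N => e,
                _ => return std::ptr::null_mut(), // out of memory: null
            };
            match self.next.compare_exchange_weak(
                cur, end, Ordering::Relaxed, Ordering::Relaxed,
            ) {
                Ok(_) => return unsafe { base.add(start) },
                Err(seen) => cur = seen, // another thread won; retry
            }
        }
    }

    unsafe fn dealloc(&self, _ptr: *mut u8, _layout: Layout) {
        // A bump allocator never frees individual blocks.
    }
}

static HEAP: BumpHeap<1024> = BumpHeap::new();

fn main() {
    let p = unsafe { HEAP.alloc(Layout::new::<u64>()) };
    assert!(!p.is_null());
    println!("used {} of 1024 bytes", HEAP.used());
}
```

Because dealloc is a no-op, this only suits the allocate-at-startup discipline described above; anything that frees and reallocates in steady state would exhaust the buffer.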
7 · A working checklist
- Default stance: the system allocator is fine. Swap only with a benchmark and a memory-profile in hand; jemalloc for long-running contended services and when you want its profiling, mimalloc for raw speed in a small package.
- Allocation-heavy hot path? Before reaching for a faster malloc, allocate less: reuse buffers (clear() keeps capacity), with_capacity up front, arenas per request/phase.
- Need to not die on OOM? try_reserve at the points where size is attacker- or input-controlled; treat everything else as infallible.
- Auditing memory? A wrapper allocator is 20 lines and works in production; jemalloc's heap profiling (jemalloc_pprof) answers "what is holding 6 GB" properly.
- Tracking the future: allocator_api (#32838) for per-container allocators; the Rust-for-Linux fork of alloc shows what an all-fallible std would look like.
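The "allocate less" item is the cheapest to try. A minimal sketch of buffer reuse, assuming a hypothetical render_all helper; it counts capacity changes as a stand-in for reallocations, since clear() resets the length but keeps the capacity:

```rust
use std::fmt::Write;

/// Formats each record into one reused buffer. Returns the total number of
/// bytes produced and how often the buffer's capacity changed (a proxy for
/// reallocation). Hypothetical helper, for illustration only.
fn render_all(records: &[u32]) -> (usize, usize) {
    let mut buf = String::with_capacity(64); // sized once, up front
    let mut total = 0;
    let mut regrowths = 0;
    for r in records {
        let cap_before = buf.capacity();
        buf.clear(); // length goes to 0, capacity is untouched
        write!(buf, "record-{r}").expect("writing to a String cannot fail");
        total += buf.len();
        if buf.capacity() != cap_before {
            regrowths += 1;
        }
    }
    (total, regrowths)
}

fn main() {
    let (bytes, regrowths) = render_all(&[1, 22, 333]);
    println!("{bytes} bytes rendered, {regrowths} regrowths");
}
```

Creating a fresh String inside the loop instead would allocate once per record, which is the cost this pattern removes.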
References
- std::alloc — GlobalAlloc, Layout, System, handle_alloc_error.
- RFC 1974: global allocators — the design of #[global_allocator].
- allocator_api tracking issue (#32838) — the per-container Allocator trait, a decade in the making.
- Vec::try_reserve — fallible allocation on stable.
- tikv-jemallocator and mimalloc — the two usual swap-ins.
- bumpalo — the canonical arena allocator, with the design discussion in its docs.