Internals · 06 / 11

Memory layout & #[repr]

By default, Rust promises nothing about where your struct's fields live — and uses that freedom to reorder them, shrink padding, and stash enum tags inside pointers. This page covers the alignment rules, what #[repr(C)], #[repr(transparent)] and friends actually guarantee, how enums are laid out, and the niche optimisation that makes Option<&T> exactly one word.

Long read · alignment and padding through enum niches and packed structs · references at the end


1 · Size, alignment, padding

Every type has a size and an alignment. Alignment is a power of two; a value's address must be a multiple of it (u32 is 4-aligned, u64 is 8-aligned on x86-64). A composite type's alignment is the maximum of its fields' alignments, and its size is always rounded up to a multiple of its own alignment — so that elements of an array [T; N] stay aligned with no gaps between them. Whatever can't be filled by fields is padding.

rust src/main.rs · measuring
use std::mem::{size_of, align_of};

struct Demo {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    println!("size={} align={}", size_of::<Demo>(), align_of::<Demo>());
}

In C, declaration order is law, so this struct would be 12 bytes: a (1) + 3 padding + b (4) + c (2) + 2 trailing padding. Rust fits it into 8; the next section explains how.
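
The 12-byte figure can be reproduced by hand. What follows is a minimal sketch, not rustc's implementation: align_up and c_layout are hypothetical helpers that apply the rule from section 1 (put each field at the next multiple of its alignment, then round the total up to the largest alignment), and DemoC is an assumed #[repr(C)] twin of Demo used only to check the arithmetic.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: round `offset` up to the next multiple of `align`.
/// `align` must be a power of two.
fn align_up(offset: usize, align: usize) -> usize {
    (offset + align - 1) & !(align - 1)
}

/// Hypothetical helper: lay out `(size, align)` fields in declaration order,
/// the way C does. Returns every field offset plus the total size.
fn c_layout(fields: &[(usize, usize)]) -> (Vec<usize>, usize) {
    let mut offset: usize = 0;
    let mut max_align: usize = 1;
    let mut offsets = Vec::with_capacity(fields.len());
    for &(size, align) in fields {
        offset = align_up(offset, align); // padding before the field
        offsets.push(offset);
        offset += size;
        max_align = max_align.max(align);
    }
    (offsets, align_up(offset, max_align)) // trailing padding
}

/// Assumed repr(C) twin of `Demo`, used only to check the arithmetic.
#[allow(dead_code)]
#[repr(C)]
struct DemoC {
    a: u8,
    b: u32,
    c: u16,
}

fn main() {
    let fields = [
        (size_of::<u8>(), align_of::<u8>()),
        (size_of::<u32>(), align_of::<u32>()),
        (size_of::<u16>(), align_of::<u16>()),
    ];
    let (offsets, size) = c_layout(&fields);
    println!("offsets={:?} size={}", offsets, size);
    // The hand-computed size agrees with what the compiler reports.
    assert_eq!(size, size_of::<DemoC>());
}
```

On x86-64 the computed offsets are 0, 4 and 8, and the size is 12, matching the breakdown above.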

2 · #[repr(Rust)] — the default, and deliberately unspecified

With no attribute (equivalently #[repr(Rust)]), the compiler may place fields in any order. Current rustc sorts roughly by decreasing alignment, so b: u32 goes first, then c: u16, then a: u8, plus one byte of tail padding — 8 bytes total. Two consequences:

  • You may not assume the order — not even that two instantiations of the same generic struct, or the same struct in two compilations, agree. Layout is allowed to depend on optimisation decisions.
  • You don't pay for declaration order. Order fields for the reader; the compiler packs them. (In C, the Demo above is a real 50% size bug.)

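This freedom is load-bearing: it is what allows the field reordering above and the niche optimisations in section 5. It does not extend to reusing a nested type's padding: a (u32, u8) stored inside another struct still occupies its full 8 bytes, because a write through &mut to the inner value is allowed to overwrite its padding. The cost: repr(Rust) types must never be passed by value across an FFI boundary, be transmuted between "obviously identical" definitions, or be serialised by dumping bytes.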

rust src/main.rs · field offsets, observed
use std::mem::offset_of;   // stable since 1.77

struct Demo { a: u8, b: u32, c: u16 }

fn main() {
    println!("a@{} b@{} c@{}",
        offset_of!(Demo, a), offset_of!(Demo, b), offset_of!(Demo, c));
}

3 · #[repr(C)] — declaration order, C rules

#[repr(C)] switches to the C ABI's algorithm: fields in declaration order, each at the next offset that satisfies its alignment, struct size rounded up to struct alignment. This is the attribute to reach for on FFI structs, on types you transmute, and on anything whose byte layout you persist or hash.

rust src/ffi.rs
#[repr(C)]
struct Header {
    magic: u32,     // offset 0
    version: u8,    // offset 4
    _pad: [u8; 3],  // be explicit about padding in wire formats
    length: u64,    // offset 8
}                   // size 16, align 8

extern "C" {
    fn parse_header(h: *const Header) -> i32;
}

Padding bytes are uninitialised. Even in repr(C), reading a struct's padding (e.g. hashing the raw bytes, or write()-ing the struct to a socket) reads uninitialised memory. Wire formats should either make padding explicit as fields, or serialise field by field.
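
Serialising field by field might look like the sketch below. The wire format is an assumption made for illustration (little-endian, fields back to back, 13 bytes, no padding), and encode, decode and WIRE_LEN are hypothetical names. The point is that the in-memory layout of Header never reaches the wire, so the struct needs neither repr(C) nor an explicit padding field here.

```rust
/// Same fields as the `Header` above, minus the explicit padding.
struct Header {
    magic: u32,
    version: u8,
    length: u64,
}

/// Assumed wire format: little-endian, fields back to back.
const WIRE_LEN: usize = 4 + 1 + 8;

/// Write each field at a fixed offset; no padding byte is ever read.
fn encode(h: &Header) -> [u8; WIRE_LEN] {
    let mut out = [0u8; WIRE_LEN];
    out[0..4].copy_from_slice(&h.magic.to_le_bytes());
    out[4] = h.version;
    out[5..13].copy_from_slice(&h.length.to_le_bytes());
    out
}

/// Inverse of `encode`.
fn decode(bytes: &[u8; WIRE_LEN]) -> Header {
    let mut magic = [0u8; 4];
    magic.copy_from_slice(&bytes[0..4]);
    let mut length = [0u8; 8];
    length.copy_from_slice(&bytes[5..13]);
    Header {
        magic: u32::from_le_bytes(magic),
        version: bytes[4],
        length: u64::from_le_bytes(length),
    }
}

fn main() {
    let h = Header { magic: 0xCAFE_F00D, version: 2, length: 513 };
    let bytes = encode(&h);
    let back = decode(&bytes);
    assert_eq!(back.magic, h.magic);
    assert_eq!(back.version, h.version);
    assert_eq!(back.length, h.length);
    println!("{} bytes on the wire", bytes.len());
}
```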

4 · #[repr(transparent)], packed, align(N)

#[repr(transparent)] applies to a struct with exactly one non-zero-sized field (plus optionally some 1-aligned ZSTs like PhantomData), and guarantees the wrapper has the same layout and ABI as that field — including how it's passed in registers. That last part matters: a newtype around i32 without transparent is not guaranteed to be passed to C like an i32. NonNull<T>, ManuallyDrop<T>, and UnsafeCell<T> in std are all transparent.

rust src/lib.rs · the newtype that's safe to hand to C
#[repr(transparent)]
pub struct Fd(i32);   // identical to i32 at the ABI level

extern "C" { fn close(fd: Fd) -> i32; }   // sound
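
The same guarantee is what makes it sound to view a slice of wrappers as a slice of the wrapped type. A minimal sketch, where Meters and as_raw are hypothetical names introduced for illustration:

```rust
use std::mem::{align_of, size_of};

/// Hypothetical newtype, for illustration only.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Meters(pub f64);

/// View a slice of wrappers as a slice of the wrapped type, without copying.
pub fn as_raw(values: &[Meters]) -> &[f64] {
    // SAFETY: `Meters` is repr(transparent) over `f64`, so both types have
    // the same size, alignment and set of valid values. The pointer and the
    // length come from a valid slice and are passed through unchanged.
    unsafe { std::slice::from_raw_parts(values.as_ptr() as *const f64, values.len()) }
}

fn main() {
    assert_eq!(size_of::<Meters>(), size_of::<f64>());
    assert_eq!(align_of::<Meters>(), align_of::<f64>());

    let track = [Meters(1.5), Meters(2.5)];
    let total: f64 = as_raw(&track).iter().sum();
    println!("total = {}", total);
}
```

Without the attribute the cast would lean on layout details that repr(Rust) leaves unspecified.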

#[repr(packed)] drops alignment to 1 (packed(N) caps it at N): plain packed leaves no padding at all, but fields can land at unaligned addresses. Taking a plain reference to such a field is a hard error (E0793), since references promise alignment; use &raw const (or ptr::addr_of! on compilers older than 1.82) together with ptr::read_unaligned. Use packed for parsing wire formats, never as a "free" space optimisation.
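
A sketch of the wire-parsing use, under stated assumptions: Record is a hypothetical 5-byte format (a tag byte followed by a little-endian u32), and parse is an illustrative helper. It uses ptr::addr_of! so that it also builds on compilers that predate the &raw const syntax.

```rust
use std::ptr;

/// Hypothetical 5-byte record: a tag byte, then a little-endian length.
#[repr(C, packed)]
#[derive(Clone, Copy)]
struct Record {
    tag: u8,
    len: u32, // offset 1, so not 4-aligned
}

/// Parse the first record in `bytes`, or `None` if the buffer is too short.
fn parse(bytes: &[u8]) -> Option<(u8, u32)> {
    if bytes.len() < std::mem::size_of::<Record>() {
        return None;
    }
    // SAFETY: the length check guarantees enough readable bytes, every bit
    // pattern is a valid `Record`, and `read_unaligned` makes no alignment
    // assumption about the source pointer.
    let rec: Record = unsafe { ptr::read_unaligned(bytes.as_ptr() as *const Record) };

    // `&rec.len` would be rejected with E0793; go through a raw pointer.
    // SAFETY: the pointer is derived from a live local and read unaligned.
    let len = unsafe { ptr::addr_of!(rec.len).read_unaligned() };

    // The bytes were little-endian on the wire; convert to host order.
    Some((rec.tag, u32::from_le(len)))
}

fn main() {
    assert_eq!(std::mem::size_of::<Record>(), 5);
    println!("{:?}", parse(&[7, 0x10, 0x00, 0x00, 0x00]));
}
```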

#[repr(align(N))] raises alignment — the common production use is padding a struct to 64 or 128 bytes so two atomics don't share a cache line (false sharing). crossbeam's CachePadded is exactly this.
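
A minimal sketch of the idea, assuming a 64-byte cache line (some targets use 128). Padded and Counters are hypothetical names, and this is not crossbeam's implementation:

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical cache-line wrapper; 64 bytes is an assumed line size.
#[repr(align(64))]
struct Padded<T>(T);

/// Two counters that are updated by different threads.
struct Counters {
    hits: Padded<AtomicU64>,
    misses: Padded<AtomicU64>,
}

fn main() {
    // Raising the alignment also rounds the size up to 64 ...
    assert_eq!(align_of::<Padded<AtomicU64>>(), 64);
    assert_eq!(size_of::<Padded<AtomicU64>>(), 64);
    // ... so the two counters can never share a 64-byte line.
    assert_eq!(size_of::<Counters>(), 128);

    let c = Counters {
        hits: Padded(AtomicU64::new(0)),
        misses: Padded(AtomicU64::new(0)),
    };
    c.hits.0.fetch_add(1, Ordering::Relaxed);
    c.misses.0.fetch_add(2, Ordering::Relaxed);
    println!(
        "hits={} misses={}",
        c.hits.0.load(Ordering::Relaxed),
        c.misses.0.load(Ordering::Relaxed)
    );
}
```

The price is memory: each 8-byte counter now occupies a full line.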

5 · Enum layout and the niche optimisation

An enum with data is conceptually a tagged union: a discriminant plus storage big enough for the largest variant. Under repr(Rust) the compiler picks the smallest workable tag and may do much better than tag-plus-payload — because of niches.

A niche is a bit pattern that a type can never hold. &T is never null. bool is only ever 0 or 1. char never exceeds 0x10FFFF. NonZeroU32 is never 0. If an enum has one variant carrying such a type and the other variants carry no data, the compiler stores the discriminant inside the forbidden patterns and the tag vanishes:

rust src/main.rs · niches, measured
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    println!("&u8:                {}", size_of::<&u8>());                // 8
    println!("Option<&u8>:        {}", size_of::<Option<&u8>>());        // 8  <- None = null
    println!("Option<Box<u8>>:    {}", size_of::<Option<Box<u8>>>());    // 8
    println!("Option<NonZeroU32>: {}", size_of::<Option<NonZeroU32>>()); // 4  <- None = 0
    println!("Option<u32>:        {}", size_of::<Option<u32>>());        // 8  <- no niche, real tag
    println!("Option<bool>:       {}", size_of::<Option<bool>>());       // 1  <- None = 2
    println!("Option<Option<bool>>: {}", size_of::<Option<Option<bool>>>()); // 1 <- 252 of bool's 254 niches still unused
}

This is why Option<&T> is the zero-cost null pointer: None is the all-zero pattern, Some(r) is the pointer itself. The standard library documents this as a guarantee for Option wrapping &T, &mut T, Box<T>, NonNull<T>, function pointers and the NonZero* family — so Option<extern "C" fn()> is the correct Rust type for a nullable C function pointer. For other types, niches are an optimisation you'll observe but must not rely on.
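
The nullable function pointer case might be written as follows. Callback, double and invoke are hypothetical names for illustration; only the Option around an extern "C" fn type comes from the guarantee described above.

```rust
use std::mem::size_of;

/// Hypothetical callback type: `None` is what C sees as NULL.
type Callback = Option<extern "C" fn(i32) -> i32>;

/// A function with the C calling convention to use as a callback.
extern "C" fn double(x: i32) -> i32 {
    x * 2
}

/// Call the callback if one was supplied.
fn invoke(cb: Callback, arg: i32) -> Option<i32> {
    cb.map(|f| f(arg))
}

fn main() {
    // The Option adds no space on top of the bare function pointer.
    assert_eq!(size_of::<Callback>(), size_of::<extern "C" fn(i32) -> i32>());

    let some: Callback = Some(double as extern "C" fn(i32) -> i32);
    println!("{:?} {:?}", invoke(some, 21), invoke(None, 21));
}
```

Because the null case is an ordinary None, the compiler forces every caller to handle it before the pointer can be called.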

It composes further than Option. Result<(), Box<E>> is one word for a sized E (with Box<dyn Error> it is two words, the same as Box<dyn Error> on its own, which carries a vtable pointer). Vec's internal pointer is NonNull, so Option<Vec<T>> is the same 24 bytes as Vec<T> on today's rustc. The compiler can also use niches in nested fields of a variant's payload.

6 · Pinning an enum's layout: repr(u8), repr(C), repr(C, u8)

Three attributes turn enum layout from "compiler's business" into a contract:

  • #[repr(u8)] (or u16/i32/…) on a fieldless enum fixes the discriminant's type — what you want for protocol opcodes you cast with as.
  • #[repr(C)] on an enum gives the tag the size of the C compiler's default enum type. That size is implementation-defined in C, so for portable FFI keep repr(C) alone to fieldless enums.
  • #[repr(C, u8)] on a data-carrying enum (RFC 2195) lays it out as the C idiom: a repr(C) struct of a u8 tag followed by a union of the variants. This, rather than repr(C) alone, is the way to share tagged unions with C, and it stops the compiler from folding the tag into a payload niche.

rust src/ffi.rs · a tagged union C can read
#[repr(C, u8)]
enum Event {
    Quit,                       // tag 0
    Key { code: u32 },          // tag 1, then union payload
    Mouse { x: i32, y: i32 },   // tag 2
}
// Equivalent C:
//   struct Event { uint8_t tag; union { struct {uint32_t code;} key;
//                                       struct {int32_t x,y;} mouse; } u; };
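
For the fieldless repr(u8) case, casting with as only goes one way: an enum converts to its byte, but an arbitrary byte cannot be cast back to the enum. A minimal sketch of the checked reverse direction, where Opcode, its values and from_byte are hypothetical:

```rust
use std::mem::size_of;

/// Hypothetical protocol opcodes with a pinned one-byte discriminant.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Opcode {
    Ping = 0x01,
    Pong = 0x02,
    Close = 0x08,
}

impl Opcode {
    /// Checked inverse of `op as u8`: unknown bytes are rejected rather
    /// than turned into an invalid enum value.
    fn from_byte(byte: u8) -> Option<Opcode> {
        match byte {
            0x01 => Some(Opcode::Ping),
            0x02 => Some(Opcode::Pong),
            0x08 => Some(Opcode::Close),
            _ => None,
        }
    }
}

fn main() {
    assert_eq!(size_of::<Opcode>(), 1);
    let wire = Opcode::Close as u8; // enum to byte: a plain cast
    println!("{:#04x} -> {:?}", wire, Opcode::from_byte(wire));
    println!("0x03 -> {:?}", Opcode::from_byte(0x03));
}
```

Transmuting an unchecked byte into Opcode would be undefined behaviour for any value other than the three listed, which is why the match is spelled out.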

7 · ZSTs and fat pointers — the layout oddballs

Zero-sized types. (), empty structs, PhantomData, and arrays of length 0 occupy no bytes. They are real types with real trait impls, but a Vec<()> never allocates and HashSet<K> being HashMap<K, ()> costs nothing per entry for the value. Generic code doesn't need to special-case them; the layout algorithm erases them.
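
A few measurements make the zero-size claims concrete. Id and User are hypothetical types for illustration:

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical typed id: the marker ties an id to `T` without storing a `T`.
#[allow(dead_code)]
struct Id<T> {
    raw: u32,
    _marker: PhantomData<T>,
}

/// Hypothetical entity type, itself a ZST.
struct User;

fn main() {
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<User>(), 0);
    assert_eq!(size_of::<[u64; 0]>(), 0);
    assert_eq!(size_of::<PhantomData<String>>(), 0);
    // The marker field adds nothing to the struct.
    assert_eq!(size_of::<Id<User>>(), size_of::<u32>());

    // A Vec of ZSTs tracks a length but never allocates; its capacity is
    // documented to be usize::MAX.
    let mut units: Vec<()> = Vec::new();
    for _ in 0..1_000 {
        units.push(());
    }
    assert_eq!(units.len(), 1_000);
    assert_eq!(units.capacity(), usize::MAX);
    println!("all zero-size checks passed");
}
```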

Fat pointers. References to unsized types are two words: &[T] and &str carry pointer + length; &dyn Trait carries pointer + vtable pointer (covered in the trait objects deep dive). That's why size_of::<&u8>() is 8 but size_of::<&[u8]>() is 16.
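
The sizes can be checked relative to the machine word, so the sketch below holds on 32-bit targets as well:

```rust
use std::fmt::Debug;
use std::mem::size_of;

fn main() {
    let word = size_of::<usize>();

    // Thin pointer: just an address.
    assert_eq!(size_of::<&u8>(), word);
    // Slices and str: address + length.
    assert_eq!(size_of::<&[u8]>(), 2 * word);
    assert_eq!(size_of::<&str>(), 2 * word);
    // Trait objects: address + vtable pointer.
    assert_eq!(size_of::<&dyn Debug>(), 2 * word);
    assert_eq!(size_of::<Box<dyn Debug>>(), 2 * word);

    // The length travels with the pointer, not with the data:
    // re-slicing changes the fat pointer and leaves the array alone.
    let data = [1u8, 2, 3, 4];
    let tail: &[u8] = &data[1..];
    assert_eq!(tail.len(), 3);
    println!("thin = {} bytes, fat = {} bytes", word, 2 * word);
}
```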

8 · Inspecting and exploiting layout

shell terminal · ask the compiler
# Print the computed layout of every type (nightly):
$ cargo +nightly rustc -- -Zprint-type-sizes

print-type-size type: `Demo`: 8 bytes, alignment: 4 bytes
print-type-size     field `.b`: 4 bytes
print-type-size     field `.c`: 2 bytes
print-type-size     field `.a`: 1 bytes
print-type-size     end padding: 1 bytes

# This output is also the practical tool for "why is my future 16 KB":
# async fn state machines show up here with per-variant sizes.

  • Watch the largest enum variant. An enum is at least as big as its largest variant, whatever order the variants are declared in. A 200-byte error variant makes every Result at least 200 bytes, so Box the big variant.
  • Use NonZero* in public types. You get the niche back: an id type wrapping NonZeroU64 makes Option<Id> free.
  • Don't fight repr(Rust) by hand-ordering fields — the compiler already does it. Hand-order (and lock with repr(C)) only when bytes are a contract.
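
The effect of boxing a large variant and of a NonZero-based id can be measured directly. FatError, SlimError and Id are hypothetical types; Id is marked repr(transparent) on the assumption that you want the documented Option guarantee rather than an observed optimisation.

```rust
use std::mem::size_of;
use std::num::NonZeroU64;

/// Hypothetical error type with one large inline variant.
#[allow(dead_code)]
enum FatError {
    Timeout,
    Parse([u8; 200]),
}

/// Same variants, with the large payload moved behind a Box.
#[allow(dead_code)]
enum SlimError {
    Timeout,
    Parse(Box<[u8; 200]>),
}

/// Hypothetical id type. repr(transparent) over NonZeroU64 puts it inside
/// the documented `Option` size guarantee.
#[repr(transparent)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Id(NonZeroU64);

fn main() {
    // Every Result carrying FatError pays for the 200-byte variant.
    assert!(size_of::<Result<(), FatError>>() >= 200);
    // After boxing, the error is pointer-sized plus at most a tag word.
    assert!(size_of::<Result<(), SlimError>>() <= 2 * size_of::<usize>());
    // The forbidden value 0 encodes None, so the Option is free.
    assert_eq!(size_of::<Option<Id>>(), size_of::<u64>());

    println!(
        "fat={} slim={} option_id={}",
        size_of::<Result<(), FatError>>(),
        size_of::<Result<(), SlimError>>(),
        size_of::<Option<Id>>()
    );
}
```

The trade is one heap allocation on the error path in exchange for a small Result on every path.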

References
