Safe Rust Garbage Collection

Leader posted Originally published at krun.pro 4 min read

Garbage Collection in Rust Without a Single unsafe Block

Most garbage collectors in the Rust ecosystem harbor a "dirty secret" in their source code: a hidden unsafe block that bypasses the borrow checker's guarantees. The conflict is clear—mapping complex object graphs with cycles and back-references directly onto Rust’s ownership model is a nightmare. However, achieving safe Rust garbage collection is not only possible but elegant. It simply requires a fundamental shift in architecture: swapping raw pointers for arena indices and letting the type system handle the heavy lifting that pointer arithmetic used to dominate.


Why Most Rust GCs Resort to unsafe

The Rust borrow checker is a masterpiece for managing stack-scoped, DAG-shaped data. But the second you introduce a cyclic structure—like a DOM tree, a scripting language heap, or a complex graph—the ownership model pushes back. You cannot have two nodes mutually referencing each other without messy interior mutability or manual pointer management. Many developers take the path of least resistance: using *mut T to bypass the checker entirely.

The core issue is the safety invariant. A GC must know if any live reference to an object exists during collection. With raw pointers, the type system is blind; you track references manually, and a single mistake leads to a use-after-free bug. My approach asks: can we build a collector where the type system itself distinguishes between a "rooted" handle and an "unprotected" one?

The Failure of Traditional Smart Pointers

Wrapping everything in Rc<RefCell<T>> is a common first attempt, but it fails at cycles. Because Rc relies on reference counting, two nodes pointing at each other will never hit zero, causing a permanent memory leak. A true mark-and-sweep algorithm requires a global view of the heap, something reference counting fundamentally cannot provide.

The Zero-Unsafe Solution: Arenas and Indices

The foundation of my zero-unsafe GC architecture is the arena. Instead of scattered heap allocations, we use a Vec<Slot<T>>. Each slot contains either a live object or a "tombstone." Every allocation returns a u32 index rather than a pointer. The Heap struct owns the vectors, and all access is funneled through heap.get(idx), which performs a safe bounds check. No pointer arithmetic, no lifetime trickery.

pub struct Heap {
objects: Vec<Slot<ObjectData>>,
free_list: Vec<u32>,
generation: Vec<u32>,
}

pub struct Gc {
index: u32,
generation: u32,
_marker: PhantomData,
}

impl Copy for Gc {}
impl Clone for Gc { fn clone(&self) -> Self { *self } }

In this model, Gc<T> is a lightweight, Copy-able handle. It doesn't carry ownership. If the collector reclaims the slot, the handle becomes "stale." We solve this using generation counters: each slot tracks how many times it has been reused. If the handle's generation doesn't match the slot's, the access safely returns None.

Rooting: The RAII Guard

To prevent active objects from being swept, we use a Root<T>. Unlike the "dumb" Gc<T> index, a Root implements Drop. When a Root is created, the index is added to the heap's root set; when it goes out of scope, it’s removed. The 'heap lifetime ensures you can't outlive the Heap itself, providing a compile-time guarantee of safety.

The Mark-and-Sweep Process

The mark phase starts at the root set and traverses the graph. Any type stored in the heap must implement a Trace trait, which exposes its internal Gc handles to the collector. We use a classic tri-color marking system:

  • White: Unvisited candidates for collection.
  • Grey: Visited but children not yet explored.
  • Black: Fully processed and live.
This is implemented entirely in safe Rust using a Vec<u32> as a worklist and a HashSet for marks.

Performance and the ABA Problem

Admittedly, index-based access is slower than raw pointers. A pointer dereference is nearly free on modern CPUs, while an index requires a base pointer load, an offset, and a bounds check. My benchmarks show that while pointer-based traversal takes ~2.1 ns/op, arena-indexed traversal takes ~3.8 ns/op. For many applications—like scripting engines or game ECS—this 80% overhead is a small price for total memory safety.

The Comparison

Metric Arc<RwLock<T>> Arena GC (Safe Rust)
Cycle Handling Leaks memory Full reclamation
Safety Standard RAII Generation checks + Bounds checks
Unsafe Usage None (in user code) Zero (in GC & user code)

By shifting from "where the object is in memory" to "where the object is in the arena," we align with Rust’s strengths. We trade a sliver of performance for a system that is impossible to crash, easy to audit, and 100% compliant with the rules of safe Rust.

More Posts

Mastering Java Memory Management: A Comprehensive Guide to JVM Internals, Garbage Collection, and Optimization

Aditya Pratap Bhuyan - May 6, 2025

Mastering Borrow Checking in Rust: The Key to Safe and Efficient Memory Management

Mike Dabydeen - Dec 6, 2024

The post compares G1, ZGC, and Shenandoah Java garbage collectors.

webMethodMan - Oct 16, 2025

I Built a Semantic File System in Rust to Eradicate the Directory Tree

Panav Payappagoudar - May 2

Rust Weekly Log — From Code to Production

Vincent - Apr 24
chevron_left

Related Jobs

View all jobs →

Commenters (This Week)

11 comments
1 comment
1 comment

Contribute meaningful comments to climb the leaderboard and earn badges!