
Originally published at krun.pro · 8 min read

Rust Generator yield: What the Compiler Actually Builds Under async/await

Every async fn you've ever written in Rust compiles down to something you probably never asked to see. The Rust generator yield mechanism isn't an exotic nightly toy — it's the actual substrate that async/await desugars into, and the compiler generates state machines from it that can quietly balloon your binary by hundreds of kilobytes. Most devs ship this without knowing it exists.


TL;DR: Quick Takeaways

  • Every Rust async fn desugars into a generator-backed state machine — a compiler-generated enum with one variant per suspension point.
  • The state machine size grows with nesting depth: a real-world async chain can produce enum variants exceeding 400 KB (Tweede golf, 2024 measurement).
  • Generators behind #![feature(coroutines)] are still nightly-only in 2025; stable workarounds exist but carry trade-offs.
  • Python's yield and Rust's yield look identical syntactically — the execution contracts are completely different at the machine level.

What the Rust Compiler Does with yield: A State Machine You Never See


When rustc compiles an async function, it runs a desugaring pass that converts suspension points into enum variants. Each .await call becomes a potential yield point; the compiler assigns each one a unique state index and generates a match arm in the polling loop. What you get is not a thread, not a closure, not a callback — it's a resumable enum that the executor pokes with poll() until it returns Poll::Ready.

The struct that wraps this enum implements Future. Inside sits the compiler-generated state machine with fields for every local variable that's alive across a suspension point. That's the part nobody warns you about: variables don't get dropped at the .await boundary — they get saved into the enum variant. If you have a fat buffer alive when you hit .await, that buffer now lives in the enum. Permanently. Until the future resolves.


// What you write:
async fn fetch_data(url: &str) -> Vec<u8> {
    let response = client.get(url).await; // suspension point 1
    let body = response.bytes().await;    // suspension point 2
    body.to_vec()
}

// What the compiler roughly generates (simplified MIR view):
enum FetchDataStateMachine<'a> {
    State0 { url: &'a str },                     // before first await
    State1 { url: &'a str, response: Response }, // between awaits
    State2 { body: Bytes },                      // after second await
    Complete,
}


The mini-analysis here is straightforward but the implications aren't. State1 holds both url and response simultaneously because the borrow checker needs to prove both are valid at that point. The enum is sized to its largest variant — so if Response is 1 KB, the entire future allocates for that worst case even if most code paths never hit State1. This is where Rust async function memory layout becomes a production concern, not a trivia question.
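You can observe this on stable Rust without polling anything, since futures are lazy. The sketch below uses two hypothetical async fns and a made-up 16 KiB buffer size: one keeps the buffer alive across an await, the other drops it first.

```rust
use std::mem::size_of_val;

// Hypothetical: the buffer is still live at the suspension point,
// so it must be saved into the state machine's enum variant.
async fn buffer_across_await() -> u8 {
    let buf = [0u8; 16384];
    std::future::ready(()).await; // buf survives this await
    buf[0]
}

// Hypothetical: the buffer is dropped before the suspension point,
// so the state machine never has to store it.
async fn buffer_dropped_first() -> u8 {
    let first = {
        let buf = [0u8; 16384];
        buf[0]
    }; // buf dropped here
    std::future::ready(()).await;
    first
}

fn main() {
    // Futures are lazy; we can measure them without an executor.
    let big = buffer_across_await();
    let small = buffer_dropped_first();
    println!("across await:  {} bytes", size_of_val(&big));
    println!("dropped first: {} bytes", size_of_val(&small));
    assert!(size_of_val(&big) >= 16384);
    assert!(size_of_val(&small) < 16384);
}
```

Scoping temporaries so they die before the await is the cheapest size optimization available — no boxing, no API change.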

The Jump Table and MIR Output


The state machine dispatches via a jump table internally — the executor calls poll(), the match expression on the state index branches to the right arm, execution resumes from the exact suspension point. This is cooperative scheduling at zero OS overhead: no context switch, no kernel involvement, no stack swap. The Waker mechanism handles re-scheduling by registering a callback that the executor fires when the awaited I/O is ready.

If you want to actually see what rustc generates, run cargo rustc -- --emit=mir on a small async function. The MIR output is verbose but legible — you'll see the generator drop glue, the state enum definition, and the GeneratorResumeArgument plumbing. It's ugly in the good way: no magic, just a lot of autogenerated match arms.
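The dispatch the compiler emits can also be written by hand on stable Rust, which makes the "resumable enum" idea concrete. Everything below is invented for illustration — `TwoStep`, its states, and the no-op waker are not compiler output, just a manual analogue of it.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Hand-rolled version of what the compiler emits: one enum variant per
// suspension point, dispatched by a match inside poll().
enum TwoStep {
    State0,      // before the first suspension
    State1(u32), // between the two suspensions, carrying a live local
    Complete,
}

impl Future for TwoStep {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        // The "jump table": branch on the state index, resume mid-function.
        match *self {
            TwoStep::State0 => {
                *self = TwoStep::State1(21); // step 1 done, suspend
                Poll::Pending
            }
            TwoStep::State1(n) => {
                *self = TwoStep::Complete;
                Poll::Ready(n * 2)
            }
            TwoStep::Complete => panic!("polled after completion"),
        }
    }
}

// Minimal no-op waker so we can drive poll() by hand (stable API).
fn noop_waker() -> Waker {
    fn clone(_: *const ()) -> RawWaker { RawWaker::new(std::ptr::null(), &VTABLE) }
    fn nop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, nop, nop, nop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = TwoStep::State0;
    assert!(matches!(Pin::new(&mut fut).poll(&mut cx), Poll::Pending));
    assert!(matches!(Pin::new(&mut fut).poll(&mut cx), Poll::Ready(42)));
    println!("resolved to 42");
}
```

A real executor replaces the no-op waker with one that re-schedules the task; the match-on-state skeleton is the same.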

Rust Generator vs Python Generator: Same Word, Different Contract


Python developers picking up Rust often assume yield works the same way. It doesn't — and the difference matters enough to trip up experienced engineers. In Python, yield suspends a function and returns a value to the caller; the caller can push a value back in via gen.send(value). The generator maintains its own implicit stack frame that Python manages for you. You don't think about lifetimes. You don't think about pinning. You don't think about drop order.

# Python mental model:
def counter(start):
    n = start
    while True:
        received = yield n  # yields n, receives next value
        n = received or n + 1

gen = counter(0)
next(gen)     # → 0
gen.send(10)  # → 10

// Rust nightly equivalent (coroutines feature):
#![feature(coroutines, coroutine_trait)]
use std::ops::{Coroutine, CoroutineState};
use std::pin::Pin;

let mut gen = #[coroutine] |initial: i32| {
    let mut n = initial;
    loop {
        let received = yield n; // type of received = i32
        n = received;
    }
};

match Pin::new(&mut gen).resume(0) {
    CoroutineState::Yielded(v) => println!("{v}"), // 0
    CoroutineState::Complete(_) => {}
}


The structural difference: Python generators are heap-allocated, garbage-collected, and implicitly pinned. Rust coroutines are stack-allocated by default, borrow-checker-validated, and require explicit Pin<&mut Self> to resume. The StopIteration exception in Python maps to CoroutineState::Complete in Rust — same conceptual role, completely different mechanism. Python devs learning Rust generators shouldn't expect a soft landing here.

The send() Asymmetry


Python's gen.send(value) is two operations fused: it resumes the generator AND injects a value that becomes the result of the yield expression. Rust's resume(arg) works the same way, but the type of the resume argument is part of the coroutine's type signature — the compiler enforces it. You can't accidentally send a string to a coroutine that expects an integer. Python discovers that at runtime; Rust refuses to compile it. Whether that's "better" depends on how much you enjoy type errors at 2 AM.

Why async/await in Rust Is a Generator in Disguise


This isn't metaphorical. Before Rust stabilized async/await syntax, the internal implementation literally used std::future::from_generator() — a function that wrapped a coroutine-like closure into something that implements Future. The wrapper was called GenFuture. You can find it in old Rust source and in pre-1.36 nightly code. The async keyword was syntactic sugar over this machinery from day one.

// Pre-stabilization desugaring (historical, ~Rust 1.36):
// async fn example() -> i32 { ... }
// compiled roughly to:
fn example() -> impl Future<Output = i32> {
    from_generator(#[coroutine] || {
        // body with yields instead of awaits
        let x = yield some_future;
        x + 1
    })
}

// The modern compiler does this internally but hides the plumbing.
// The GenFuture wrapper still exists in the stdlib — it's just not public API.


The from_generator wrapper implements Future::poll by calling Coroutine::resume with a Poll-compatible argument. When the coroutine yields, it maps to Poll::Pending. When it completes, it maps to Poll::Ready. The Waker gets threaded through context via thread-local storage in the poll_with_tls_context helper. This is the async state machine — not an abstraction over it, not analogous to it. It is it.

The Stackless Architecture Consequence


Because Rust async uses stackless coroutines — meaning each suspended future stores only the variables it actually needs, not an entire call stack — you can have thousands of concurrent futures in a single thread without proportional memory cost. A Tokio application handling 10,000 simultaneous connections doesn't maintain 10,000 stacks. Each future sits as a struct on the heap, sized exactly to its state machine. This is the zero-cost abstraction claim made concrete: you pay for what you suspend, nothing else.
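You can check the "no proportional memory cost" claim on stable Rust without any executor, again because futures are lazy. The connection handler below is a hypothetical stand-in; the point is that each suspended task is just a small struct.

```rust
use std::mem::size_of_val;

// Hypothetical stand-in for a connection handler: suspends once, returns its id.
async fn handle_connection(id: usize) -> usize {
    std::future::ready(id).await
}

fn main() {
    let one = handle_connection(0);
    // The entire suspended "task" is this struct -- no multi-KiB OS stack behind it.
    println!("one connection future: {} bytes", size_of_val(&one));

    // Holding 10,000 of them costs roughly 10,000 x that, not 10,000 stacks.
    let pending: Vec<_> = (0..10_000).map(handle_connection).collect();
    println!("holding {} futures", pending.len());
    assert!(size_of_val(&one) <= 64);
    assert_eq!(pending.len(), 10_000);
}
```

A thread-per-connection design would pay for a full stack (typically megabytes of reserved address space) per connection; here the per-task cost is exactly the state machine's size.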

Rust Generator on Stable: Patterns That Don't Need Nightly


Here's the uncomfortable reality: the #![feature(coroutines)] gate has been open since RFC 2033 landed in 2017, and as of 2025 it's still not stable. The RFC got renamed, partially redesigned, merged with gen block proposals under RFC 3513, and the stabilization timeline remains "when it's ready." If you're shipping production Rust and need generator-like behavior today, you work around the gate.

// Stable pattern 1: std::iter::from_fn
// Generates an infinite Fibonacci sequence without nightly
let mut state = (0u64, 1u64);
let fibs = std::iter::from_fn(move || {
    let next = state.0 + state.1;
    state = (state.1, next);
    Some(state.0)
});

// Stable pattern 2: genawaiter crate (0.99.x)
// Provides a gen! macro that emulates generator syntax on stable
use genawaiter::{sync::gen, yield_};

let mut generator = gen!({
    yield_!(1u32);
    yield_!(2u32);
    yield_!(3u32);
});

// Stable pattern 3: Rust 2024 gen blocks (RFC 3513)
// Available in nightly as of late 2024, stabilization target: 2025 edition
// gen { yield 1; yield 2; } → implements Iterator directly

std::iter::from_fn is the zero-dependency stable answer for simple lazy sequences — you manually maintain state in a captured variable. It works, it's readable, and it compiles on stable going back years. The genawaiter crate goes further: it provides yield_!() macro syntax that feels like real generators and works on stable Rust by implementing the state machine explicitly. The gen {} block from RFC 3513 is the official future — it targets the Iterator trait directly and sidesteps the coroutine/generator naming debate entirely.
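Completing pattern 1 into a runnable snippet shows the whole point: once the closure captures its state, the result is an ordinary Iterator that composes with take, map, and the rest of the adapter chain.

```rust
fn main() {
    // Same from_fn Fibonacci generator as above, driven through Iterator adapters.
    let mut state = (0u64, 1u64);
    let fibs = std::iter::from_fn(move || {
        let next = state.0 + state.1;
        state = (state.1, next);
        Some(state.0)
    });

    let first_six: Vec<u64> = fibs.take(6).collect();
    println!("{:?}", first_six); // [1, 1, 2, 3, 5, 8]
    assert_eq!(first_six, vec![1, 1, 2, 3, 5, 8]);
}
```

The iterator is infinite, so something like take() is mandatory before collect() — exactly the contract a Python generator consumer would expect, just enforced by composition instead of convention.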

The Hidden Cost: How Generator State Size Grows with Nesting


The state machine size problem is real and it bites in production. Tweede golf (a Dutch embedded systems consultancy) published measurements in 2024 showing that a moderately complex async function chain in an embedded Rust application generated a future struct exceeding 400 KB. That's a single future. On a microcontroller with 512 KB of RAM total, that's not a performance problem — it's a won't-compile problem.

The mechanism is straightforward once you understand enum layout. A Rust enum is sized to its largest variant. If you have an async function that calls three other async functions sequentially and holds local state between each call, the compiler generates an enum where the largest variant contains all the nested futures plus all the local variables. Those nested futures are themselves enums sized to their largest variants. The size compounds multiplicatively, not additively. Deeply nested async call chains can produce state machines that dwarf the actual code doing work.

The fix is Box::pin(): boxing a future puts it on the heap and stores only a pointer in the state machine. Pin<Box<dyn Future>> costs one heap allocation and pointer indirection per await point, but the state machine size becomes constant regardless of what's inside. For embedded targets or size-sensitive code, this trade-off is often worth it. For throughput-sensitive server code, it's sometimes a regression. Profile before deciding.
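The trade-off is easy to measure on stable Rust. The sketch below uses a hypothetical inner future with about 4 KiB of state live across its await, then compares a caller that embeds it inline against one that boxes it.

```rust
use std::mem::size_of_val;

// Hypothetical inner future: ~4 KiB of state survives its suspension point.
async fn big_inner() -> u8 {
    let buf = [0u8; 4096];
    std::future::ready(()).await;
    buf[0]
}

async fn caller_inline() -> u8 {
    big_inner().await // big_inner's whole state machine is embedded here
}

async fn caller_boxed() -> u8 {
    Box::pin(big_inner()).await // only a heap pointer is embedded
}

fn main() {
    // Futures are lazy, so we can size them without running them.
    println!("inline: {} bytes", size_of_val(&caller_inline()));
    println!("boxed:  {} bytes", size_of_val(&caller_boxed()));
    assert!(size_of_val(&caller_inline()) >= 4096);
    assert!(size_of_val(&caller_boxed()) < 256);
}
```

With deeper nesting the inline version compounds at every level, while the boxed version stays pointer-sized at each await — which is why boxing one strategic layer often collapses a 400 KB future back to something sane.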

The full deep-dive on Rust Generator yield - https://krun.pro/rust-generator-yield/
