The Control Group: Benchmarking the "Dirty" Version of My FSM API

In software architecture, optimization without measurement is just guessing.

I am in the middle of a major refactor of my C# Finite State Machine API (FSM_API). We discovered concurrency friction in the underlying dictionary lookups and realized that my heavy reliance on string keys was creating unnecessary per-lookup overhead: every access has to hash and compare the full string, and the strings themselves churn the heap.

The plan is to switch to ConcurrentDictionary and implement Integer Hashing to eliminate string allocations entirely. But before I tear the engine apart, I needed a baseline. I needed to know exactly how much "tax" I was paying for those strings.
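As a sketch of what the refactor is aiming for (the type and member names here are hypothetical, not FSM_API's actual surface), the idea is to pay the string cost once at registration, then do every hot-path lookup through an int handle in a ConcurrentDictionary:

```csharp
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical registry illustrating the planned change. "FsmRegistry"
// and its members are illustrative names, not real FSM_API types.
public sealed class FsmRegistry
{
    // Before: Dictionary<string, ...> keyed by the FSM's name.
    // After: int keys, so each lookup hashes a 4-byte value instead of
    // a string, and ConcurrentDictionary removes external locking.
    private readonly ConcurrentDictionary<int, string> _byHandle = new();
    private int _nextHandle;

    // Register once, pay the string cost once, hand back an int handle.
    public int Register(string fsmName)
    {
        int handle = Interlocked.Increment(ref _nextHandle);
        _byHandle[handle] = fsmName;
        return handle;
    }

    public bool TryGet(int handle, out string fsmName) =>
        _byHandle.TryGetValue(handle, out fsmName);
}
```

Callers hold the int handle instead of the string, so steady-state updates allocate nothing and never touch string hashing.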

So, I built a Stress Profiler.

The Torture Test

I designed a benchmark to intentionally break the system. The goal wasn't to see if it would fail, but how it would fail.

The test setup is simple but brutal:

  • The Crowd: We spawn between 50,000 and 240,000 Agents.

  • The Payload: Each agent performs basic math, some branching logic, and, crucially, a ToString() allocation every update to simulate memory pressure (heap thrashing).

  • The Variable: We test three different "Time-Slicing" intervals:

    1. Interval 1: Every agent updates every frame (100% Load).

    2. Interval 2: Agents skip 1 frame (50% Load).

    3. Interval 3: Agents skip 2 frames (33% Load).
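The per-agent payload can be pictured like this (a minimal stand-in for the benchmark body, not the profiler's actual code): a little arithmetic, one branch, and a deliberate ToString() so every update drops a short-lived string on the heap:

```csharp
using System;

public static class StressPayload
{
    // Illustrative per-agent work matching the description above:
    // basic math, a branch, and one ToString() allocation per update.
    public static int Update(int agentId, int frame)
    {
        double x = agentId * 0.5 + frame;          // basic math
        if ((agentId & 1) == 0) x = Math.Sqrt(x);  // branching logic
        string s = x.ToString();                   // deliberate heap garbage
        return s.Length;                           // keeps the allocation live
    }
}
```

Multiply that one string by 150,000 agents at 60 FPS and you get roughly nine million allocations per second for the GC to chew through.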

The Hypothesis

My theory was that the Dictionary<string, ...> lookups would become the primary bottleneck as the agent count rose. Even when agents are "sleeping" (skipping frames), the API still has to iterate through the list to check if it's their turn.

If the architecture is sound, "Interval 2" should theoretically double our capacity. If it doesn't, we know the overhead of the list management itself is too high.
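One common way to implement that kind of time-slicing (an assumption about the internals, not confirmed FSM_API code) is a modulo check per agent, staggered by agent id. The check itself still runs for every agent every frame, which is exactly the per-agent overhead the hypothesis is about:

```csharp
public static class TimeSlicer
{
    // With interval N, each agent runs on 1 of every N frames, offset by
    // its id so the load spreads evenly across frames. Note that this
    // check is evaluated for every agent every frame, even "sleeping"
    // ones - that is the fixed iteration cost.
    public static bool ShouldUpdate(int agentId, int frame, int interval) =>
        interval <= 1 || frame % interval == agentId % interval;
}
```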

Here is the raw data from the "Dirty" version of the API. I'll publish the optimized numbers as soon as the refactor is done.

Stress Test Results (Baseline API)

Interval | Agents | FPS | Logic Ops/Frame | Mem Delta (MB)
1 50000 58.78 150,000 8
1 55000 106.30 11,000 27
1 60000 72.01 180,000 8
1 65000 100.64 13,000 30
1 70000 63.44 210,000 13
1 75000 96.77 15,000 38
1 80000 57.21 240,000 36
1 85000 87.40 17,000 43
1 90000 46.43 270,000 16
1 95000 75.70 19,000 48
1 100000 41.40 300,000 16
1 105000 62.54 21,000 54
1 110000 38.60 330,000 27
1 115000 58.02 23,000 59
1 120000 34.88 360,000 26
1 125000 57.30 25,000 61
1 130000 33.57 390,000 32
1 135000 53.90 27,000 68
1 140000 31.47 420,000 26
1 145000 50.50 29,000 70
1 150000 28.80 450,000 29
2 80000 70.34 8,000 55
2 85000 71.45 127,500 12
2 90000 83.53 13,500 45
2 95000 62.50 142,500 27
2 100000 69.49 15,000 -4
2 105000 59.55 157,500 23
2 110000 67.81 16,500 53
2 115000 53.22 172,500 40
2 120000 57.58 18,000 -1
2 125000 50.64 187,500 28
2 130000 56.93 19,500 64
2 135000 42.57 202,500 46
2 140000 48.11 21,000 2
2 145000 44.63 217,500 33
2 150000 48.52 22,500 74
2 155000 40.62 232,500 51
2 160000 45.80 24,000 87
2 165000 35.97 247,500 42
2 170000 38.35 25,500 4
2 175000 35.65 262,500 41
2 180000 40.20 27,000 86
2 185000 32.44 277,500 57
2 190000 37.83 28,500 93
2 195000 31.51 292,500 60
2 200000 33.02 30,000 2
2 205000 30.64 307,500 50
2 210000 34.37 31,500 100
2 215000 28.08 322,500 64
3 120000 46.49 6,000 79
3 130000 56.15 130,000 30
3 140000 49.74 14,000 -9
3 150000 47.27 150,000 21
3 160000 44.20 16,000 75
3 170000 39.76 170,000 35
3 180000 37.77 18,000 1
3 190000 37.70 190,000 32
3 200000 35.25 20,000 97
3 210000 32.49 210,000 57
3 220000 30.48 22,000 2
3 230000 31.41 230,000 37
3 240000 29.61 24,000 116

The Analysis

The results gave us the exact "Failure Floor" we were looking for.

The Hard Limit: Somewhere between 140,000 and 150,000 agents running every frame, we crossed the 30 FPS floor (31.47 FPS at 140k, 28.80 FPS at 150k).

The Overhead: Notice that switching to Interval 2 didn't perfectly double our capacity (we capped around 215k, not 280k). This confirms that simply iterating through 200,000 string-based handles costs CPU time, even if the agents do nothing.

The GC Spike: The memory delta column shows the "Sawtooth" pattern where the Garbage Collector had to step in aggressively to clean up those string allocations.
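For reference, a memory-delta column like the one above can be captured by sampling the managed heap around each batch (a sketch, not the profiler's exact instrumentation). The negative deltas in the table are what this looks like when a collection lands inside a sample and reclaims more than the batch allocated:

```csharp
using System;

public static class MemSampler
{
    // Runs one batch of agent updates and reports managed-heap growth
    // in MB. A GC during the batch can make the result negative - that
    // is the "sawtooth" visible in the Mem Delta column.
    public static double MeasureDeltaMb(Action batch)
    {
        long before = GC.GetTotalMemory(forceFullCollection: false);
        batch();
        long after = GC.GetTotalMemory(false);
        return (after - before) / (1024.0 * 1024.0);
    }
}
```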

Now we have our target. The next version of the API needs to beat these numbers to prove the value of the refactor. Let's get to work.

Resources & Code:
Simple Demos Repository: https://github.com/TrentBest/SimpleDemos (clone the repo and run the StressProfiler project to reproduce these numbers)

The FSM Package (Unity Asset Store):
https://assetstore.unity.com/packages/slug/332450

NuGet Package (Non-Unity Core):
https://www.nuget.org/packages/TheSingularityWorkshop.FSM_API

GitHub Repository:
https://github.com/TrentBest/FSM_API

Support Our Work:

Patreon Page:
https://www.patreon.com/c/TheSingularityWorkshop

Support Us (PayPal Donation):
https://www.paypal.com/donate/?hosted_button_id=3Z7263LCQMV9J

We'd love to hear your thoughts! Please Like this post, Love the code, and Share your feedback or what you've built with the API in the comments.

