Prompt → RAG → MCP → Agent → Harness, and What?

This is the short version.

The full essay goes further into the parts this version leaves out:
the broader reading of the Claude Code leak, the independence problem,
the speed problem, and the harder question of what real governance
infrastructure would actually have to do over time.

If you want the longer argument, read the full version here.


Quick glossary

  • Harness — the software layer that decides what the model can see, what tools it can use, and what it is allowed to do
  • MCP — the connection layer between models, tools, and outside systems
  • Fail-closed — when a system is uncertain, it stops instead of continuing silently
  • Drift — when a system that used to be safe slowly becomes unsafe as the surrounding context changes

Where We Left Off

The previous report on the Claude Code leak made one central claim:

The model is not the whole product.

The harness is.

Context handling. Tool routing. Permissions. Continuity. Recovery. Cost discipline.

That is where production performance actually lives.

But the leak raised a harder question.

If the harness is where the real product lives, then who governs the harness?

That is where this piece begins.


What the Leak Actually Showed

On March 31, 2026, a 59.8 MB JavaScript sourcemap shipped inside the Claude Code npm package. Within hours, 512,000 lines of TypeScript were mirrored across GitHub.

A flood of analysis followed.

Much of it focused on hidden prompts, secret techniques, or line-by-line code archaeology. That was understandable. People wanted to know how the system worked.

What interested us was different.

Two findings mattered most because they did not reveal magic. They revealed silent operational failure.

1. Alex Kim: silent failure at scale

Alex Kim's analysis surfaced an internal BigQuery data point showing repeated compaction failures at meaningful scale. The technical fix was small.

The more important fact was simpler: the system had been failing silently long enough that someone had to pull internal data to notice it. [1]

The system did not halt.

It continued.

2. Adversa AI: guardrails failing in sequence

Adversa AI found that once a pipeline exceeded 50 subcommands, deny rules stopped running and security validators were skipped without clearly surfacing that failure to the developer. [2]

Again, the important point was not only the bug itself.

It was that the primary boundary between the agent and the system could fail silently.
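To make that failure shape concrete, here is a hypothetical reconstruction, not the leaked code. The cap, names, and types are invented for illustration; the point is a validator loop that fails open past a hard limit, with nothing signaling that enforcement stopped.

```ts
// Hypothetical sketch of the failure shape only, not the leaked code.
const MAX_VALIDATED = 50; // enforcement silently stops past this point

function pipelineAllowed(subcommands: string[], denyRules: RegExp[]): boolean {
  for (let i = 0; i < subcommands.length; i++) {
    // Fail-open bug: commands beyond the cap skip every deny rule,
    // and nothing tells the developer that validation stopped running.
    if (i >= MAX_VALIDATED) break;
    for (const rule of denyRules) {
      if (rule.test(subcommands[i])) return false;
    }
  }
  return true;
}
```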

Taken one by one, both incidents can be read as implementation mistakes.

Taken together, they show something bigger.

A mature harness, built by a serious team, can operate for some time while its own exceptions remain invisible inside the system itself.

That is the bridge to governance.

Once the question is no longer whether the harness exists, but who is watching it, the discussion moves beyond bugs and into control, oversight, and accountability.


Why This Becomes a Governance Problem

The industry response was predictable.

Governance became the next keyword. New frameworks appeared, and vendors quickly began talking about policy layers and governance products.

But two deeper problems remained.

1. The independence problem

The same companies building the agent are often the ones defining, running, and auditing the controls around it.

That is useful for first-line safety.

It is not the same as independent governance.

A control built inside the runtime inherits the runtime's incentives and failure modes. It is designed first to help the system run. It is not designed first to challenge the system.

That does not make vendor controls worthless.

It simply means they are not enough on their own.

2. The speed problem

Even when the industry recognizes the issue, most governance still operates at human speed.

Committees move slowly. Frameworks take months. Formal standards take longer.

The attack surface does not wait.

OpenAI publicly acknowledged that prompt injection may never be fully solved for browser-based agents, and described automated red-teaming that found attack classes human researchers had not previously identified. [3]

A joint paper from researchers at OpenAI, Anthropic, and Google DeepMind showed that adaptive attacks bypassed all 12 of the published defenses they tested, often with success rates above 90%. [4]

That is why this is not only a policy problem.

It is a speed problem.

Human review still matters.

But human review alone cannot stop machine-speed failure.


What Real Governance Needs

If governance has to be both independent and fast, the requirements become clearer.

It needs at least three things.

1. Policy-as-Code outside the runtime

The rules cannot live only inside the same context window they are meant to constrain.

A policy file the agent can read is useful.

It is not enough.

If the system can reinterpret or override the rule from inside its own working surface, that rule is guidance, not governance.

As Simon Willison argued, once an AI system has access to private data, untrusted content, and external action, you have the lethal trifecta. [5]

Any governance layer that lives entirely inside that trifecta inherits the same danger.
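What the alternative looks like in practice: a minimal sketch, assuming an Open Policy Agent server running at localhost:8181 with a hypothetical policy package named agents exposing an allow rule. The agent only receives a decision. It never holds or reinterprets the rule text inside its own context window.

```ts
// Minimal sketch: policy evaluated outside the runtime via OPA's REST API.
// The package name "agents" and the ToolCall shape are assumptions.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

async function isAllowed(call: ToolCall): Promise<boolean> {
  const res = await fetch("http://localhost:8181/v1/data/agents/allow", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ input: call }),
  });
  // Fail closed: any transport or policy error is a denial, not a pass.
  if (!res.ok) return false;
  const body = (await res.json()) as { result?: boolean };
  return body.result === true; // a missing rule is also a denial
}
```

The design choice worth noting: every error path returns a denial. An unreachable policy engine is not a free pass.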

2. A tamper-evident record

Enforcement means very little if the system can rewrite its own history.

Logs are helpful.

But ordinary logs are still inside the system.

What matters is a record written somewhere the agent cannot silently modify, with evidence that the record has not been altered after the fact.

That is what turns "we think it behaved correctly" into something that can actually be checked.
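A minimal sketch of what "cannot silently modify" can mean, using a hash chain where each entry commits to the entry before it. This is illustrative only; a production system would anchor the chain head somewhere outside the agent's reach, such as a separate service or a signed timestamp.

```ts
import { createHash } from "node:crypto";

interface LogEntry {
  action: string;
  timestamp: string;
  prevHash: string;
  hash: string;
}

// Append an entry whose hash commits to the previous entry's hash.
function appendEntry(log: LogEntry[], action: string): LogEntry {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : "genesis";
  const timestamp = new Date().toISOString();
  const hash = createHash("sha256")
    .update(`${prevHash}|${timestamp}|${action}`)
    .digest("hex");
  const entry = { action, timestamp, prevHash, hash };
  log.push(entry);
  return entry;
}

// Recompute the chain; any rewritten entry breaks every hash after it.
function verifyChain(log: LogEntry[]): boolean {
  let prevHash = "genesis";
  for (const e of log) {
    const expected = createHash("sha256")
      .update(`${prevHash}|${e.timestamp}|${e.action}`)
      .digest("hex");
    if (e.prevHash !== prevHash || e.hash !== expected) return false;
    prevHash = e.hash;
  }
  return true;
}
```

Rewriting any earlier entry changes its hash, which breaks every entry after it, so verification catches silent edits after the fact.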

3. Fail-closed by default

When uncertainty appears, the system should not quietly continue.

It should stop.

That sounds simple, but it is not the cultural default in most agent systems today.

The Claude Code incidents pointed the other way: uncertainty appeared, and the system kept going.

Real governance has to reverse that instinct.
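A minimal sketch of that reversal, with illustrative names rather than any real framework's API. The design choice lives in two places: only an explicit allow proceeds, and any exception halts the step instead of being swallowed.

```ts
type Decision = "allow" | "deny" | "unknown";

function mayProceed(decision: Decision): boolean {
  // Fail closed: only an explicit "allow" proceeds.
  // "unknown" is a denial, not a pass-through.
  return decision === "allow";
}

async function governedStep<T>(label: string, run: () => Promise<T>): Promise<T> {
  try {
    return await run();
  } catch (err) {
    // Surface the failure and halt, the opposite of the
    // catch-and-continue pattern in the incidents above.
    throw new Error(`halted at "${label}": ${String(err)}`);
  }
}
```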


What Is Already Possible — and What Still Isn't

This is where the conversation often gets confused.

Some people hear "governance infrastructure" and assume this is all future work.

It is not.

Parts of it are possible right now.

Policy-as-Code already exists in mature form outside the AI world. Open Policy Agent is the clearest example. [6]

Tamper-evident logging is also no longer a fantasy. Nitro and newer cryptographic evidence work show that the technical foundations are already here. [7][8]

Fail-closed is not mainly a research problem. It is mostly a design choice.

The hardest remaining problem is drift.

A rule may be correct when it is written.

Six months later, the system around it may have changed.

A safe path becomes a dangerous one. An internal channel becomes external. A model update changes how ambiguity is resolved.

Nothing in the rule text changes.

But what the rule means in practice does.

That is drift.

And that is the part the industry still does not solve cleanly.

So the practical answer today is not complicated.

  • implement Policy-as-Code now
  • create a record the system cannot quietly rewrite
  • default to fail-closed where uncertainty matters
  • and accept that drift still needs periodic human revalidation (a minimal staleness check is sketched after this list)
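For that last bullet, one way to make revalidation enforceable rather than aspirational is to give every rule a review window and treat a stale rule as unknown, which fails closed. A minimal sketch, with invented field names, not a standard:

```ts
interface PolicyMeta {
  id: string;
  lastReviewedAt: string; // ISO date of the last human review
  maxAgeDays: number;     // how long that review stays valid
}

function isStale(meta: PolicyMeta, now: Date = new Date()): boolean {
  const reviewed = new Date(meta.lastReviewedAt);
  const ageDays = (now.getTime() - reviewed.getTime()) / 86_400_000;
  return ageDays > meta.maxAgeDays;
}

function evaluateWithDriftGuard(meta: PolicyMeta, allow: boolean): boolean {
  // A stale rule may no longer mean what it meant when it was written,
  // so it cannot grant anything until a human revalidates it.
  return !isStale(meta) && allow;
}
```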

Not because that is perfect.

Because it is better than pretending the problem has already been solved.


The Actual Answer to the Title

Prompt.

RAG.

MCP.

Agent.

Harness.

And what?

Not governance as branding.

Not a framework PDF.

Not a vendor-bundled policy page.

The next serious layer is governance infrastructure.

Infrastructure that is independent enough to challenge the system, fast enough to matter during execution, and durable enough to leave a record the system cannot quietly rewrite.

That is where the next durable moat may live.

Not because governance sounds important.

Because a system cannot fully govern itself.


"Show us the record."


That demand is coming from more than one direction.

From insurers who want evidence before covering AI-assisted workflows.

From enterprise buyers who no longer trust smooth demos on their own.

From regulators whose timelines are moving closer.

The record either exists or it does not.

Build the infrastructure that produces it before someone else decides what that record should contain.


A practical note

If your team is working on agent governance in practice — not as a policy memo, but as real control logic, audit design, and reviewable implementation — our team at Flamehaven would be glad to help.

No advertising angle.
No commission pitch.
No hype.

Our work is grounded in serious engineering: governed systems, careful code, original control logic, and architecture shaped by sustained implementation work and accumulated operational evidence over the past year.

If you are thinking through governance design, collaboration, or code-level development in this direction, feel free to DM me.

We are open to technical discussion, early collaboration, and helping teams turn governance ideas into working systems.


References

  1. Alex Kim — "The Claude Code Source Leak: fake tools, frustration regexes, undercover mode"
    https://alex000kim.com/posts/2026-03-31-claude-code-source-leak/

  2. Adversa AI — "Critical Vulnerability in Claude Code Emerges Days After Source Leak"
    https://www.securityweek.com/critical-vulnerability-in-claude-code-emerges-days-after-source-leak/

  3. CyberScoop — "OpenAI says prompt injection may never be solved for browser agents like Atlas"
    https://cyberscoop.com/openai-chatgpt-atlas-prompt-injection-browser-agent-security-update-head-of-preparedness/

  4. Nasr, Carlini, Sitawarin et al. — "The Attacker Moves Second: Stronger Adaptive Attacks Bypass Defenses Against LLM Jailbreaks and Prompt Injections"
    https://arxiv.org/abs/2510.09023

  5. Simon Willison — "The lethal trifecta for AI agents"
    https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

  6. Open Policy Agent — CNCF Project
    https://www.openpolicyagent.org

  7. Zhao et al. — "Rethinking Tamper-Evident Logging: A High-Performance, Co-Designed Auditing System"
    https://arxiv.org/abs/2509.03821

  8. Kao et al. — "Constant-Size Cryptographic Evidence Structures for Regulated AI Workflows"
    https://arxiv.org/abs/2511.17118
