Originally published at vibeagentmaking.com
In 1931, classical scholars agreed on a notation that marks every character of a recovered text by where it came from: read off the stone, restored by the editor, uncertain, lost, or deliberately erased. LLM output collapses all five into one confident font.
The Problem Is Old
An epigrapher working on a damaged Roman inscription faces a provenance problem: some letters are legible, some are reconstructed from context, some are uncertain, and some are gaps where the stone is gone. The 1931 Leiden Conventions gave each status a distinct mark — square brackets for reconstruction, underdots for uncertain readings, dashes for known gaps.
The result: any reader of a published inscription can see, character by character, where the editor's knowledge ends and guesswork begins.
The LLM Parallel
A model's output collapses verbatim retrieval, paraphrase, inference, gap-filling, and pure confabulation into one font. There is no bracket. There is no underdot. The reader cannot distinguish "I found this in the training data" from "I made this up to complete the sentence."
The Leiden mapping is almost one-to-one:
- Clear text (read directly from the stone) = verbatim retrieval from source
- [Square brackets] (restored by editor) = inference from context
- Underdotted letters (uncertain reading) = low-confidence generation
- Dashes (lacuna, gap in stone) = known gap, no data
- Double brackets (deliberately erased in antiquity) = redacted or filtered content
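The mapping above can be sketched as a small data structure. This is a minimal illustration, not an established library or standard: the names Provenance, Span, and render are hypothetical, and it assumes some upstream step has already assigned a status to each span of output, which is the hard part. The bracketed dashes for a lacuna and the double brackets for erasure follow common Leiden practice.

```python
from dataclasses import dataclass
from enum import Enum


class Provenance(Enum):
    """Hypothetical status labels, one per Leiden category in the list above."""
    CLEAR = "clear"          # verbatim retrieval from a source
    RESTORED = "restored"    # inference from context
    UNCERTAIN = "uncertain"  # low-confidence generation
    LACUNA = "lacuna"        # known gap, no data
    ERASED = "erased"        # redacted or filtered content


@dataclass
class Span:
    text: str
    status: Provenance


DOT_BELOW = "\u0323"  # Unicode combining dot below, for underdotted letters


def render(spans: list[Span]) -> str:
    """Join spans into one string, marking each by its provenance."""
    parts = []
    for span in spans:
        if span.status is Provenance.CLEAR:
            parts.append(span.text)
        elif span.status is Provenance.RESTORED:
            parts.append(f"[{span.text}]")
        elif span.status is Provenance.UNCERTAIN:
            # Put a dot under every letter; leave spaces and digits alone.
            parts.append(
                "".join(ch + DOT_BELOW if ch.isalpha() else ch for ch in span.text)
            )
        elif span.status is Provenance.LACUNA:
            # The text of a gap is ignored: there is nothing to show.
            parts.append("[- - -]")
        else:  # Provenance.ERASED
            parts.append(f"\u27e6{span.text}\u27e7")
    return " ".join(parts)


if __name__ == "__main__":
    answer = [
        Span("The report was filed", Provenance.CLEAR),
        Span("in March", Provenance.RESTORED),
        Span("by", Provenance.CLEAR),
        Span("Lee", Provenance.UNCERTAIN),
        Span("", Provenance.LACUNA),
    ]
    print(render(answer))
```

Rendering is the easy half. The sketch only shows that once a status exists for each span, carrying it through to the reader costs a few characters per claim.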
Why This Matters Now
RAG pipelines splice multiple sources into one fluent answer. The fluency erases the seams. A reader cannot tell which sentence came from Document A, which was synthesized from Documents B and C, and which the model invented to bridge them.
The apparatus criticus, the footnote apparatus that accompanies a critical edition, is the mechanism textual critics built to make this transparent. It shows the reader the variants, the editorial decisions, and the uncertainties behind the printed text. Modern LLM output has no apparatus. The model is doing eclectic editing, choosing the "best" reading from multiple sources, without showing its work.
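A rough sketch of what an apparatus for a RAG answer might look like, under a strong assumption: the pipeline already knows which retrieved documents support each sentence. The Sentence type, the with_apparatus function, and the document IDs are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    text: str
    # Hypothetical IDs of the retrieved documents that support this sentence.
    sources: list[str] = field(default_factory=list)


def with_apparatus(sentences: list[Sentence]) -> str:
    """Return the answer with numbered markers, followed by one note per sentence."""
    body = []
    notes = []
    for number, sentence in enumerate(sentences, start=1):
        body.append(f"{sentence.text} [{number}]")
        if not sentence.sources:
            # The bridging sentence: nothing retrieved supports it.
            notes.append(f"[{number}] no source; supplied by the model")
        elif len(sentence.sources) == 1:
            notes.append(f"[{number}] from {sentence.sources[0]}")
        else:
            joined = ", ".join(sentence.sources)
            notes.append(f"[{number}] synthesized from {joined}")
    return " ".join(body) + "\n\n" + "\n".join(notes)


if __name__ == "__main__":
    answer = [
        Sentence("The policy changed in 2019.", ["doc-a"]),
        Sentence("Both audits flagged the same gap.", ["doc-b", "doc-c"]),
        Sentence("This likely explains the delay."),
    ]
    print(with_apparatus(answer))
```

The seams become visible again: the third sentence reads as fluently as the first two, but its note shows that nothing retrieved stands behind it.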
The Fix Is 95 Years Old
The honest thing to do with a damaged text is to show the damage. Character by character. Mark plainly where your knowledge ends.
For LLM output, this means inline provenance markers: which claims are grounded in retrieved documents, which are inferred, which are uncertain, and which are gaps the model filled because silence felt worse than guessing.
The notation exists. The discipline exists. The practice of marking uncertainty at the smallest possible unit, not as a disclaimer at the bottom but inline, attached to the claim itself, was worked out nearly a century ago.
The question is whether the industry will adopt it voluntarily or wait until regulators mandate it.