Your Docs Are Lying. Mine Update Themselves.


## The Problem

Documentation doesn’t fail because people are careless.

It fails because of latency.

* Code changes → docs updated later (maybe)
* Systems evolve → docs stay frozen
* New engineers → trust outdated information

Now scale that to reality:

* 33 microservices
* Multiple chains (EVM, Solana, Starknet, XRPL…)
* Independent deploy cycles
* Zero centralised ownership

At that point, documentation isn’t just outdated.

It’s fiction.

What I wanted was simple:

> A knowledge base that watches the codebase and updates itself.


## What Even Is a Knowledge Base?

Before we get into the build, let's get the basics right.

A knowledge base is a structured, queryable collection of information about a system — not code, not logs, but understanding.

It answers questions like:

* What does this service actually do?
* What APIs does it expose?
* How does data flow between components?
* What changed last month — and why?

In a microservices architecture, a knowledge base typically includes:

* **Service docs** — one file per service (purpose, APIs, models, behavior)
* **Architecture doc** — how services interact and depend on each other
* **Sources registry** — tracked repos, branches, last ingested commits
* **Activity log** — a timeline of changes for audits and onboarding

Together, this becomes a single source of truth for both humans and AI agents.

The problem?

It doesn’t stay true for long.


## The Architecture

The system has three core components:

```
┌──────────────┐
│   Watcher    │
└──────┬───────┘
       │ detects changes
       ▼
┌──────────────┐
│ Ingestion AI │
└──────┬───────┘
       │ updates docs
       ▼
┌──────────────┐
│ Knowledge DB │
└──────┬───────┘
       │ PR for review
       ▼
   Engineers
```

### 1. The Watcher

A lightweight server (built with Elysia) runs a cron job every hour.

It reads a SOURCES.md file:


| Service     | Repo | Branch | Last Commit |
| ----------- | ---- | ------ | ----------- |
| core-engine | ...  | main   | 9559715     |
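A row of that table is easy to turn into a record the watcher can work with. A minimal sketch in TypeScript (the post doesn't show the real parsing code, and the repo path here is illustrative; header and separator rows are assumed to be skipped by the caller):

```typescript
// Parse one data row of the SOURCES.md table into a record.
interface SourceEntry {
  service: string;
  repo: string;
  branch: string;
  lastCommit: string; // short hash as stored in SOURCES.md
}

function parseSourceRow(row: string): SourceEntry | null {
  // Split on pipes, trim padding, and drop the empty edge cells
  // produced by the leading/trailing "|".
  const cells = row
    .split("|")
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
  if (cells.length !== 4) return null;
  const [service, repo, branch, lastCommit] = cells;
  return { service, repo, branch, lastCommit };
}

console.log(parseSourceRow("| core-engine | gitea/org/core-engine | main | 9559715 |"));
```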

For each service:

* Fetch latest commit from Gitea
* Compare with stored hash

If changed → mark for ingestion
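The comparison itself is trivial; the only wrinkle is that SOURCES.md stores abbreviated hashes while the remote reports full ones. A sketch (illustrative helper, not the post's actual code):

```typescript
// Decide whether a service needs re-ingestion by comparing the short
// hash stored in SOURCES.md with the latest commit reported by Gitea.
function needsIngestion(storedShortHash: string, latestCommit: string): boolean {
  // SOURCES.md stores abbreviated hashes, so compare by prefix.
  return !latestCommit.startsWith(storedShortHash);
}

console.log(needsIngestion("9559715", "393ca68f00d1")); // changed → true
console.log(needsIngestion("e029062", "e029062aa117")); // up to date → false
```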

Example logs:

```
[cron] ↑ core-engine: 9559715 → 393ca68 (changed)
[cron] ↑ core-daemon: bafe49e → e8b58de (changed)
[cron] = core-comms: e029062 (up to date)
```

---

### 2. The Ingestion Agent (The Brain)

This is where things get interesting.

For each changed repo:

1. Shallow clone
2. Generate a **targeted diff**
   ```bash
   git diff OLD_COMMIT NEW_COMMIT
   ```

3. Pass diff to an AI agent
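The clone-and-diff step can be sketched as command construction. This is a hedged sketch, not the post's implementation: the URL and workdir are placeholders, and fetching arbitrary SHAs only works if the Git server permits it (a plain shallow clone may not contain `OLD_COMMIT`):

```typescript
// Build the shell commands for the shallow clone + targeted diff step.
// --filter=blob:none keeps the clone cheap; the two commits are then
// fetched explicitly so both sides of the diff are present.
function cloneAndDiffCommands(
  repoUrl: string,
  workdir: string,
  oldCommit: string,
  newCommit: string,
): string[] {
  return [
    `git clone --filter=blob:none --no-checkout ${repoUrl} ${workdir}`,
    `git -C ${workdir} fetch origin ${oldCommit} ${newCommit}`,
    `git -C ${workdir} diff ${oldCommit} ${newCommit}`,
  ];
}

const cmds = cloneAndDiffCommands(
  "https://gitea.example/org/core-engine.git",
  "/tmp/ingest/core-engine",
  "9559715",
  "393ca68",
);
console.log(cmds.join("\n"));
```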

---

#### Why Diff-Based?

Instead of reading entire codebases (slow + expensive), the agent sees only:

> **what actually changed**

This reduces:

* token usage (~10x reduction)
* noise
* irrelevant processing

---

#### What the Agent Actually Does

This is the magic part.

It doesn’t just “update docs.”

It **understands changes semantically**:

* Detects new API endpoints
* Identifies modified data models
* Tracks service interaction changes
* Ignores refactors, tests, formatting noise

Then it:

* Updates the specific service doc
* Updates `architecture.md` if flows changed
* Maintains consistency across the system
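The semantic understanding is the AI agent's job, but the "ignore tests and formatting noise" idea can also be pre-filtered cheaply before the diff ever reaches the model. A sketch with purely illustrative path rules:

```typescript
// Drop changed files that are unlikely to affect the docs before the
// diff reaches the agent. These patterns are illustrative only.
const NOISE_PATTERNS: RegExp[] = [
  /(^|\/)tests?\//,       // test directories
  /\.(test|spec)\.\w+$/,  // co-located test files
  /^\.(prettier|eslint)/, // formatter / linter configs
];

function isDocRelevant(changedPath: string): boolean {
  return !NOISE_PATTERNS.some((re) => re.test(changedPath));
}

const changed = [
  "src/api/routes.ts",
  "tests/routes.test.ts",
  "src/models/order.ts",
  ".prettierrc",
];
console.log(changed.filter(isDocRelevant)); // → ["src/api/routes.ts", "src/models/order.ts"]
```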

---

### 3. The Knowledge Base

Everything lives in a dedicated repo (`core-context`).

After processing:

```
core-context/
├── services/
│   ├── core-engine.md
│   └── core-daemon.md
├── architecture.md
├── SOURCES.md
└── ACTIVITY_LOG.md
```


Workflow:

1. Create branch:

   ```
   auto/ingest-2026-04-21
   ```

2. Commit updates

3. Open PR:

   ```
   ingest: update docs (2026-04-21)

   Updated:

   - core-engine: 9559715 → 393ca68
   - core-daemon: bafe49e → e8b58de
   ```

4. Human review → merge
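The branch name, PR title, and body all derive mechanically from the ingestion results. A sketch of that assembly step (function and field names are assumptions, not from the post):

```typescript
// Assemble branch name, PR title, and body from the ingestion results.
interface Update {
  service: string;
  from: string; // old short hash
  to: string;   // new short hash
}

function prMetadata(date: string, updates: Update[]) {
  return {
    branch: `auto/ingest-${date}`,
    title: `ingest: update docs (${date})`,
    body: ["Updated:", ...updates.map((u) => `- ${u.service}: ${u.from} → ${u.to}`)].join("\n"),
  };
}

const pr = prMetadata("2026-04-21", [
  { service: "core-engine", from: "9559715", to: "393ca68" },
  { service: "core-daemon", from: "bafe49e", to: "e8b58de" },
]);
console.log(pr.branch); // → auto/ingest-2026-04-21
console.log(pr.title);  // → ingest: update docs (2026-04-21)
```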

## What It Looks Like in Practice

```
[Deploy happens]
        │
        ▼
[Watcher detects change]
        │
        ▼
[Repo cloned + diff generated]
        │
        ▼
[AI processes changes]
        │
        ▼
[Docs updated automatically]
        │
        ▼
[PR created]
        │
        ▼
[Team reviews + merges]
```

Time from code change → documentation update:

~1 hour


## The Tricky Parts

### Bun Compatibility

The Agent SDK spawns a subprocess.

Bun broke due to missing DNS APIs:

```
TypeError: q.addAddress is not a function
```

Fix: Force execution via Node.js


### Token Limits

Early approach:

“Read entire repo”

Result:

* Massive token usage
* Rate limit issues

Fix: Diff-only ingestion
→ ~10x efficiency improvement


### Environment Propagation

Overriding env in subprocess broke runtime:

* Lost PATH
* Missing configs

Fix: Let SDK inherit environment naturally


## Why This Works

The key insight:

Documentation rot is a latency problem.

| Approach             | Latency    | Result          |
| -------------------- | ---------- | --------------- |
| Manual updates       | Days/weeks | Outdated docs   |
| “We’ll update later” | Infinite   | Dead docs       |
| This system          | ~1 hour    | Always accurate |

## The Real Impact

This isn’t just “cool automation.”

It changes how teams operate:

* New engineers read reality, not stale docs
* No more “tribal knowledge bottlenecks”
* System understanding scales with codebase size
* Documentation becomes something you trust

## Final Thought

I didn’t set out to build a documentation system.

I set out to eliminate the need for discipline in maintaining one.

Because the truth is:

The best documentation system isn’t the one people remember to update.
It’s the one that updates itself.


Built with Elysia, Bun, and the Anthropic Agent SDK. Running across 33 microservices in production.
