How I Stopped Claude Code From Destroying My Codebase: A CLAUDE.md Template That

Leader posted 3 min read

I spent 33 minutes running an experiment where I accepted 10 Claude Code diffs in a row without reading any of them. Test coverage went from 90% to 0%. The codebase literally stopped importing.

After that wake-up call, I built a constraint system that keeps the agent useful without letting it wreck the project. Sharing the full setup here because every post I see about "AI ruined my code" is vibes, not tooling.

The problem in one sentence

AI coding agents are extraordinarily good at producing diffs that look right in isolation, and systematically bad at asking whether the decision fits the shape the project is trying to become.

Three specific failure modes I watched happen in real time:

  1. Test rewriting. The agent modifies your tests to agree with its new code. Green checkmark now means "the agent agrees with itself" — not "somebody verified the contract."
  2. Silent architecture drift. New folders, new import paths, new infrastructure services appear. No human reviews any of it.
  3. Latent import errors. The agent imports a library it never installs. You don't notice because you're busy accepting. Four prompts later, nothing runs.

The fix: CLAUDE.md + review gate

1. Drop this at your repo root as CLAUDE.md

# Agent Operating Rules

## Hard constraints — never violate
- Do not add new infrastructure (message brokers, task queues, cache layers, new services) without explicit user approval in the prompt.
- Do not modify, delete, or rewrite existing tests. If a test fails due to your changes, surface the failure — do not "fix" the test.
- Do not restructure existing modules or move files between directories.
- Do not introduce new top-level dependencies without stating them explicitly in your response.
- Do not touch more than one architectural layer (auth, storage, API, data) per prompt.

## Required behaviors
- After every code change, run the existing test suite. Report failures without fixing them unless asked.
- When adding a dependency, run the install command in the same turn. Never leave an import that points at an uninstalled package.
- When a prompt is ambiguous, ask one clarifying question before writing code.

## Scope discipline
- If a request would require violating any hard constraint, stop and explain why instead of proceeding.
- Prefer the smallest change that satisfies the request. Reject scope creep even if "while I'm here" feels natural.

Claude Code reads this automatically on every session start in the repo.

2. Add a pre-commit hook that blocks accept-storms

# .git/hooks/pre-commit
#!/bin/bash

CHANGED_FILES=$(git diff --cached --name-only | wc -l)
CHANGED_LINES=$(git diff --cached --numstat | awk '{s+=$1+$2} END {print s}')

if [ "$CHANGED_FILES" -gt 10 ] || [ "$CHANGED_LINES" -gt 400 ]; then
    echo "⚠️  Large commit detected: $CHANGED_FILES files, $CHANGED_LINES lines"
    echo "Review each hunk before committing. Run: git diff --cached"
    echo "To override: git commit --no-verify"
    exit 1
fi



chmod +x .git/hooks/pre-commit

This is deliberately annoying. It forces a pause — which is the whole point. When you're tired and accepting diffs on autopilot, this is the only thing between you and a time bomb.

3. Narrow your prompts

The single biggest mitigation isn't tooling — it's prompt scope. Instead of:

"Add authentication to the app"

Use:

"Add a POST /login endpoint in auth/routes.py. Use the existing User
model. Don't touch other files. Don't add new dependencies."

The narrower the scope, the less carelessness you inherit.

What this actually prevents

Running the same 10-feature experiment with CLAUDE.md + hook + narrow prompts:

  • Coverage stayed above 75% (down from 90%, not catastrophic)
  • No rewritten tests — agent flagged failures instead of hiding them
  • No unauthorized infrastructure — agent refused to add Celery without being asked
  • Zero latent import errors — every new dependency got installed same-turn

Why most teams won't do this

The uncomfortable truth: these three files take ten minutes to set up and they make the agent meaningfully slower to work with. Under deadline pressure, teams delete the hook, relax CLAUDE.md, and widen the prompts.

That's fine — just don't tell yourself you're shipping faster. You're shipping a mortgage.

Watch the unconstrained run

I recorded the full 33-minute experiment where I deliberately did none of this — including the moment pytest crashes.

Curious what everyone else's CLAUDE.md looks like. What rules did you learn the hard way?

More Posts

I’m a Senior Dev and I’ve Forgotten How to Think Without a Prompt

Karol Modelskiverified - Mar 19

How I Built a React Portfolio in 7 Days That Landed ₹1.2L in Freelance Work

Dharanidharan - Feb 9

TypeScript Complexity Has Finally Reached the Point of Total Absurdity

Karol Modelskiverified - Apr 23

I Wrote a Script to Fix Audible's Unreadable PDF Filenames

snapsynapse - Apr 20

Why Prompt Engineering Is Just an Expensive Way to Be Incompetent

Karol Modelskiverified - May 21
chevron_left

Related Jobs

View all jobs →

Commenters (This Week)

1 comment
1 comment
1 comment

Contribute meaningful comments to climb the leaderboard and earn badges!