How We Built a 19-Agent AI Dev Team: foxdev and foxagentdev

Paulo Fox | posted May 2 | 3 min read

Introduction

What if your code was audited by 10 specialized AI agents before every merge? What if another team of 9 AI agents could build, deploy, and monitor your infrastructure autonomously?

That's exactly what we built at Fox Digital: foxdev (the audit team) and foxagentdev (the build team). Together, they form a 19-agent AI development workflow that has shipped production-grade code — including a full fiscal API with 464 tests and a 93.4/100 quality score — in days instead of weeks.

foxdev v3.0 — The Audit Team (10 Agents)

foxdev is a team of 10 specialized LLM-powered agents, each responsible for a specific audit domain. Every code change passes through all 10 agents before it can merge.

ag01 Code Review: style consistency, design patterns, anti-patterns, DRY violations
ag02 Test Coverage: finds untested code paths, suggests edge cases, runs mutation testing
ag03 Security: OWASP checks, injection vectors, auth flaws, replay attack surfaces
ag04 Debug: traces error paths, identifies silent failures, validates retry logic
ag05 Refactor: identifies fat services, extracts traits, enforces Single Responsibility
ag06 Docs: PHPDoc coverage, README quality, ADR completeness, changelog accuracy
ag07 Performance: N+1 queries, missing indexes, cache opportunities, query plan optimization
ag08 Compliance: LGPD data handling, fiscal law compliance, data retention rules
ag09 Observability: structured logging, metrics instrumentation, tracing, alerting coverage
ag10 FinOps: cloud cost tracking, resource optimization, billing accuracy

FOXVERIFY Scoring

Each agent produces findings and a domain score. FOXVERIFY consolidates them into a single 0–100 quality score. The threshold for merge is 90.
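As a rough illustration of the consolidation step, here is a minimal Python sketch. The article does not publish the FOXVERIFY formula, so the unweighted mean used here is an assumption, and the function names `consolidate` and `can_merge` are hypothetical. Only the 0-100 range and the merge threshold of 90 come from the text above.

```python
from statistics import mean

MERGE_THRESHOLD = 90.0  # merge threshold stated in the article


def consolidate(domain_scores: dict[str, float]) -> float:
    """Combine per-agent domain scores (0-100) into a single score.

    Assumption: an unweighted mean. The real FOXVERIFY formula may weight
    domains differently.
    """
    if not domain_scores:
        raise ValueError("no domain scores to consolidate")
    for agent, score in domain_scores.items():
        if not 0.0 <= score <= 100.0:
            raise ValueError(f"{agent}: score {score} is outside 0-100")
    return round(mean(domain_scores.values()), 1)


def can_merge(domain_scores: dict[str, float]) -> bool:
    """Return True when the consolidated score meets the merge threshold."""
    return consolidate(domain_scores) >= MERGE_THRESHOLD
```

A weighted variant (for example, giving security or compliance more influence) would only change the body of `consolidate`; the merge gate stays the same.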

Score evolution on FOX NF-e:

Version	Score
v1.0	70.0
v1.1	77.0
v1.4	93.0
v2.0	76.0
v2.1	77.9
v2.2	82.4
v2.3	84.9
v2.4	91.7
v2.5	93.4

We started at 70/100 on day one. After 9 audit-rework iterations over 3 days, we reached 93.4/100 with zero open findings.

foxagentdev — The Build Team (9 Agents + 8 Hooks)

foxagentdev is the autonomous build side. While foxdev audits, foxagentdev writes code, runs tests, fixes issues, and deploys.

The system runs on 9 specialized build agents with 8 lifecycle hooks:

SecurityScan — runs security checks before any code is committed
QualityCheck — pre-commit quality gate
Stop — emergency brake for unexpected states
PostToolUse — cleans up after each tool execution
memory-check — validates FoxMemory context before building
environment-selector — routes tasks to the correct environment (dev/staging/prod)
skill-router — assigns the right skill to the right agent
Plus one additional orchestration hook
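To make the idea of lifecycle hooks concrete, here is a small Python sketch of a hook runner. It is not foxagentdev's implementation: the `HookRunner` class, the context dictionary, and the bodies of the two example hooks are all assumptions made for illustration. Only the hook names `memory-check` and `environment-selector` and the dev/staging/prod environments come from the list above.

```python
from dataclasses import dataclass, field
from typing import Callable

# A hook receives a shared context dict and returns True to continue, False to halt.
Hook = Callable[[dict], bool]


@dataclass
class HookRunner:
    hooks: list[tuple[str, Hook]] = field(default_factory=list)

    def register(self, name: str, hook: Hook) -> None:
        self.hooks.append((name, hook))

    def run(self, context: dict) -> tuple[bool, list[str]]:
        """Run hooks in registration order; stop at the first one that returns False."""
        executed: list[str] = []
        for name, hook in self.hooks:
            executed.append(name)
            if not hook(context):
                return False, executed
        return True, executed


def memory_check(context: dict) -> bool:
    # Hypothetical check: building requires memory context to be loaded.
    return bool(context.get("memory_loaded"))


def environment_selector(context: dict) -> bool:
    # Hypothetical routing: default to dev and reject unknown environments.
    env = context.setdefault("environment", "dev")
    return env in {"dev", "staging", "prod"}


runner = HookRunner()
runner.register("memory-check", memory_check)
runner.register("environment-selector", environment_selector)
```

Calling `runner.run({"memory_loaded": True})` runs both hooks and fills in the default environment, while `runner.run({})` halts at `memory-check`.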

foxagentdev manages 18 skills across the entire ecosystem. Its cron jobs run foxpresence GEO visibility, which scores 100/100 in our internal benchmarks.

The two systems coexist via COEXISTENCE.md — a protocol that prevents foxdev (auditor) and foxagentdev (builder) from conflicting during parallel execution.

The Workflow

Developer writes a DSPy-format prompt (15–25 lines max — structured, not free-form)
foxagentdev builds the feature: writes code, tests, documentation
foxdev audits the result via FOXVERIFY
Score < 90? Automatic rework loop — foxagentdev addresses all findings
Score >= 90? Merge to main
Every merge: FoxMemory saves lessons learned for continuous improvement
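The steps above can be sketched as a simple control loop. This is an illustrative Python outline, not the actual orchestration code: `run_workflow`, its callable parameters, and the `max_iterations` cap are assumptions, and the stub functions stand in for foxagentdev, foxdev, and FoxMemory. The scores fed to the stub are taken from the score table earlier in the article.

```python
from typing import Callable

MERGE_THRESHOLD = 90.0  # merge threshold stated in the article


def run_workflow(
    build: Callable[[list[str]], str],
    audit: Callable[[str], tuple[float, list[str]]],
    save_lessons: Callable[[list[str]], None],
    max_iterations: int = 10,  # assumption: the article does not state a cap
) -> tuple[bool, float, int]:
    """Build, audit, and rework until the score reaches the threshold.

    Returns (merged, final_score, iterations_used).
    """
    findings: list[str] = []
    lessons: list[str] = []
    score = 0.0
    for iteration in range(1, max_iterations + 1):
        artifact = build(findings)         # later passes address prior findings
        score, findings = audit(artifact)  # consolidated score plus open findings
        lessons.extend(findings)
        if score >= MERGE_THRESHOLD:
            save_lessons(lessons)          # record lessons once the merge is allowed
            return True, score, iteration
    return False, score, max_iterations


# Stubs standing in for the real build agents, audit agents, and memory store.
_scores = iter([70.0, 84.9, 93.4])


def fake_build(findings: list[str]) -> str:
    return f"build addressing {len(findings)} findings"


def fake_audit(artifact: str) -> tuple[float, list[str]]:
    score = next(_scores)
    return score, ([] if score >= MERGE_THRESHOLD else ["example finding"])


saved: list[str] = []
result = run_workflow(fake_build, fake_audit, saved.extend)
```

Passing the three roles in as callables keeps the loop itself testable without any LLM calls.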

The key insight is that, in our experience, DSPy-format prompts outperform free-form prompts for agent tasks. Structured prompts reduce ambiguity, which lets agents operate more autonomously and produce more consistent results.
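For readers unfamiliar with structured prompts, here is one possible shape, sketched in plain Python. It does not use the DSPy library, and it is not the team's actual prompt template: the `TaskPrompt` class and its field names are hypothetical, loosely modeled on the way DSPy signatures declare input and output fields. The 15 to 25 line budget comes from the workflow description.

```python
from dataclasses import dataclass, field


@dataclass
class TaskPrompt:
    """Hypothetical structured task prompt; field names are illustrative."""

    objective: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the prompt as labeled sections, one item per line."""
        lines = [f"OBJECTIVE: {self.objective}"]
        sections = (
            ("INPUTS", self.inputs),
            ("OUTPUTS", self.outputs),
            ("CONSTRAINTS", self.constraints),
        )
        for title, items in sections:
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in items)
        return "\n".join(lines)

    def within_length_budget(self, low: int = 15, high: int = 25) -> bool:
        """Check the rendered prompt against the 15-25 line budget."""
        return low <= len(self.render().splitlines()) <= high
```

The point of the fixed sections is that an agent always finds the objective, the expected outputs, and the constraints in the same place, which is harder to guarantee with free-form text.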

Real Results: FOX NF-e Project

Everything in the following list was built using the foxdev + foxagentdev workflow:

464 automated tests, 1,109 assertions — built in 4 days
93.4/100 quality score via FOXVERIFY
0 open findings at ship time
24 MCP tools implemented (Streamable HTTP, JSON-RPC 2.0)
5,571 municipalities enriched with NFSe provider mapping
Full LGPD Art. 18 compliance endpoints
Prometheus metrics + webhook replay protection
Automatic contingency: EPEC + SVC-AN + SVC-RS + FS-DA

Benefits

Code quality goes up every iteration — measurable, not subjective (you have a score)
Security issues caught before production — ag03 runs on every PR
Documentation stays current — ag06 enforces it automatically
Compliance is proactive — ag08 flags LGPD issues before they become violations
New developers onboard faster — everything is documented, tested, and explained
Cost: ~$30–80/month in LLM API calls for the entire audit pipeline

Lessons Learned

Specialized agents outperform generalist agents: in our experience, 10 focused agents beat 1 "do everything" agent
Scoring creates accountability — a number forces honesty about code quality
Memory systems prevent regressions — FoxMemory means mistakes get made once, not repeatedly
Human-in-the-loop for architecture, AI for implementation — the right division of responsibility
DSPy-format prompts are a game changer — structured prompts produce more reliable, reproducible results

Author

Paulo Fox — CEO at Fox Digital. Former contractor for SpaceX (2014–2017) and Google (2019–2022). MIT AI Strategy & GenAI. Creator of FOX NF-e, foxdev, and foxagentdev.
