I built AgentKit because my AI coding agent kept failing — here's what I learned


The frustration that started everything

I'd been using Claude Code and OpenCode daily for months. The models were impressive — genuinely capable of understanding complex codebases, writing solid logic, catching bugs I missed. But there was a pattern I couldn't shake: give it a real task, something with more than two or three steps, and it would fall apart halfway through. Not because it was dumb. Because it had no discipline.

It would skip reading the relevant files and just start writing. It would forget what it decided three prompts ago. It would use the most expensive model for a task that needed a two-line answer. Every session felt like handing the wheel to someone brilliant who had no memory of the drive so far.

Task completion — on anything non-trivial — hovered around 20%. That's not a model problem. That's a structure problem.

What I actually built

AgentKit is an open-source workflow layer that sits on top of AI coding agents and gives them what they're missing: a memory, a plan, and a process they have to follow before touching your code.
It has five core layers:

  • The Intelligent Skill Router classifies every prompt and injects only the relevant skills into context, cutting token usage
    by ~89% per session. The agent stops reading 45,000 tokens of
    documentation it doesn't need and reads the 5,000 it does.

  • The Project Memory Graph is a SQLite-backed knowledge graph that records every file, function, API route, and architectural
    decision across sessions. When you start a new session, AgentKit
    already knows what you were building, what you decided, and why.

  • The Token Budget Intelligence layer automatically routes simple tasks to cheaper models and complex ones to more powerful models.
    Against an all-Sonnet baseline, it cuts costs by around 60%.

  • The Workflow Engine is the most important piece. It enforces a strict Research → Plan → Execute → Review → Ship state machine.
    The agent literally cannot edit a file without an approved plan.
    No shortcuts. No jumping ahead.

  • The Universal Platform Layer means none of this is locked to one tool. AgentKit installs across 11 platforms (Claude Code, OpenCode,
    Hermes, and more) with a single command.
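
To make the Workflow Engine idea concrete, here is a minimal Python sketch of a phase gate like the one described above. It is not AgentKit's actual code: the `Phase`, `Workflow`, and `WorkflowError` names and the `approve_plan` and `edit_file` methods are hypothetical, chosen only to illustrate how a strict Research → Plan → Execute → Review → Ship state machine can refuse file edits until a plan is approved.

```python
from enum import Enum


class Phase(Enum):
    # Hypothetical phase names mirroring the order given in the post.
    RESEARCH = 1
    PLAN = 2
    EXECUTE = 3
    REVIEW = 4
    SHIP = 5


class WorkflowError(Exception):
    """Raised when a step is attempted out of order or without approval."""


class Workflow:
    def __init__(self) -> None:
        self.phase = Phase.RESEARCH
        self.plan_approved = False

    def approve_plan(self) -> None:
        # Approval is only meaningful while a plan is being drafted.
        if self.phase is not Phase.PLAN:
            raise WorkflowError(f"no plan to approve during {self.phase.name}")
        self.plan_approved = True

    def advance(self) -> Phase:
        # Phases move strictly forward, one step at a time.
        if self.phase is Phase.SHIP:
            raise WorkflowError("workflow already shipped")
        if self.phase is Phase.PLAN and not self.plan_approved:
            raise WorkflowError("plan must be approved before EXECUTE")
        self.phase = Phase(self.phase.value + 1)
        return self.phase

    def edit_file(self, path: str) -> str:
        # Edits are legal only in EXECUTE, which is unreachable without approval.
        if self.phase is not Phase.EXECUTE:
            raise WorkflowError(f"cannot edit {path} during {self.phase.name}")
        return f"edited {path}"


wf = Workflow()
try:
    wf.edit_file("app.py")
except WorkflowError as err:
    print(err)  # cannot edit app.py during RESEARCH

wf.advance()        # RESEARCH -> PLAN
wf.approve_plan()   # stands in for the human approving the plan
wf.advance()        # PLAN -> EXECUTE
print(wf.edit_file("app.py"))  # edited app.py
```

The point of the design is that the edit check lives in one place and depends only on the current phase, so there is no code path that reaches a file write while skipping research or planning.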

The moment the benchmark landed

I ran the same tasks with the same model — Gemma 4 31b — with and without AgentKit active. Same prompts, same codebase, same evaluation criteria.

Without AgentKit: ~20% task completion.
With AgentKit: ~80%.

I ran it twice because I didn't believe it the first time. The difference is entirely structural. Planning before execution changes everything. Not the model. The process.

Where it is now

AgentKit is live, open source, and published on npm. We're at v0.5.x — early, but stable enough that developers are cloning it daily and the Skill-Sync Bridge is already auto-generating new skills from session experience.

The project is also starting to get attention from the developer community, including a benchmark writeup that landed on Dev.to.

The preview version (agentkit-preview) goes further — a Telegram bridge for 24/7 remote agent control from your phone, an approval gate system so the agent can never auto-execute without your confirmation, and a self-improving skill library that grows from your own sessions.

Try it

npx agentkit-ai@latest init

GitHub: https://github.com/Ajaysable123/AgentKit
