
BackerLeader · 3 min read

Why Your CTO Can't Tell You How Much AI Code You're Actually Writing

Span's new AI code detector reveals the measurement gap that's keeping engineering leaders in the dark about their AI transformation.


Engineering leaders are flying blind. They're making million-dollar decisions about AI coding tools based on guesswork, self-reported surveys, and vendor claims that don't match reality.

This disconnect became clear when I spoke with J Zac Stein and Henry Liu, co-founders of Span, about their new AI code detector. The tool can identify AI-assisted versus human-written code with over 95% accuracy—and it's revealing some uncomfortable truths about how we measure AI adoption.

The Numbers Don't Add Up

Google claims 25% of new code comes from AI. Microsoft says 30%. But when engineering teams try to verify these numbers internally, they hit a wall. Current measurement methods count lines of code suggested by AI tools, but they don't track what happens next. Was that code modified before commit? Deleted entirely? No one knows.
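
The gap is easy to see with a toy metric. The sketch below is a hypothetical illustration, not any vendor's actual telemetry: it compares an AI-suggested snippet against the code that was finally committed and reports how much of the suggestion survived.

```python
# Hypothetical "survival rate" metric: what fraction of AI-suggested
# lines made it into the commit unchanged? Line-count telemetry alone
# cannot answer this question.
import difflib

def survival_rate(suggested: str, committed: str) -> float:
    """Fraction of suggested lines that appear unchanged in the commit."""
    s_lines = suggested.splitlines()
    c_lines = committed.splitlines()
    matcher = difflib.SequenceMatcher(a=s_lines, b=c_lines)
    # get_matching_blocks() yields runs of identical lines
    # (plus a zero-length terminator block).
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return kept / max(len(s_lines), 1)
```

If a tool reports "30 lines suggested" but two-thirds of them were rewritten before commit, the headline number badly overstates AI's real contribution.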

"Engineering leaders are facing immense pressure to demonstrate the value of AI investments, but they're making decisions based on anecdotal evidence," Henry Liu told me. This pressure comes from boards and executives who want concrete ROI metrics, not developer surveys.

The problem runs deeper than bad data. Many companies run more than a dozen AI coding tools simultaneously: Copilot, Cursor, Claude, and more. Each tool reports different telemetry, and some report none at all. None of them can track the significant share of AI-generated code that developers copy and paste from ChatGPT.

Beyond the Hype

The biggest misconception executives have about AI coding? That faster code writing automatically equals higher productivity. This assumption ignores critical factors like code review burden, technical debt, and defects in production.

"Writing code is not the only thing engineers do," Liu explained. "The longer the company, the less writing code is a bottleneck." Senior engineers spend more time on architecture decisions, coordination, and ensuring teams work on the right problems. A 30% speed increase in coding doesn't translate to 30% overall productivity gains.

This reality creates tension between believers and skeptics within engineering teams. Leadership pushes top-down AI adoption while senior individual contributors remain cautious. Without solid data, these debates rely on anecdotal evidence and personal preferences.

How the Detection Works

Span's solution centers on span-detect-1, a machine learning model trained on millions of code samples with known provenance. The model analyzes patterns in token sequences, syntax quirks, and stylistic regularities to determine AI versus human authorship.
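
For illustration only (Span has not published span-detect-1's internals), a detector in this family might begin with simple stylometric signals like the ones below, extracted per snippet and fed into a trained classifier. The feature names here are assumptions, not Span's actual feature set.

```python
# Toy stylometric feature extractor: an illustrative assumption,
# not Span's actual model. A real detector would feed features like
# these, or raw token sequences, into a trained classifier.
import re

def style_features(source: str) -> dict:
    lines = [ln for ln in source.splitlines() if ln.strip()]
    identifiers = re.findall(r"[A-Za-z_]\w*", source)
    comments = [ln for ln in lines if ln.lstrip().startswith("#")]
    return {
        # AI-generated code tends toward very regular line lengths
        "avg_line_len": sum(map(len, lines)) / max(len(lines), 1),
        # Comment density is a classic stylometric signal
        "comment_ratio": len(comments) / max(len(lines), 1),
        # Vocabulary richness: unique vs. total identifiers
        "token_diversity": len(set(identifiers)) / max(len(identifiers), 1),
    }
```

The hard part, as the team found, is keeping any such model accurate as the code generators themselves keep evolving.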

The technical challenge was significant. As models like ChatGPT improved their reasoning, the team had to grow their training data tenfold to maintain accuracy. Many research papers in this space don't evaluate their models rigorously, leading to inflated accuracy claims that don't hold up in real-world conditions.

Currently supporting Python, TypeScript, and JavaScript, the detector works across all AI coding tools. This tool-agnostic approach helps companies understand their true AI usage regardless of which specific tools their developers prefer.

What Companies Are Learning

Early customers are discovering adoption patterns that challenge assumptions. While senior engineers tend to be more reserved about AI tools, the differences aren't as stark as expected. Teams working on greenfield projects adopt AI more readily than those maintaining legacy codebases, but overall adoption is climbing across all segments.

For companies that have fully embraced AI coding tools, weekly usage rates average around 70%. Some engineers are spending hundreds of dollars daily on token costs. As Liu noted, "Token spend is going to be comparable to payroll spend in the future."

This shift transforms every engineer into a resource allocator. Understanding which AI tools provide the best ROI becomes critical for both productivity and cost management—similar to how companies had to learn cloud cost optimization.

The Bigger Picture

Span's AI detector represents just one piece of a broader developer intelligence platform. The company is building what Liu and Stein call "the tool we wish we'd had in our previous roles"—a comprehensive suite for understanding engineering productivity in the AI era.

Their approach goes beyond simple metrics. Span uses AI to understand developer productivity rather than relying on outdated measures like lines of code. The platform includes qualitative surveys to identify developer headwinds and tailwinds, cost capitalization tools, and upcoming features to compare defect rates between AI and human code.

"This type of tool was not possible three years ago," Liu said. "Span helps you lead your engineering organization through AI transformation with data, not hype."

As AI coding tools continue evolving rapidly, the need for objective measurement becomes more critical. Companies can't optimize what they can't measure. Span's detector provides the ground truth that engineering leaders need to make informed decisions about their AI strategy.

The future of software development is clearly AI-augmented. The question isn't whether to adopt these tools—it's how to do it intelligently, with data instead of hope.
