A new Stanford paper (Tran & Kiela, arXiv 2604.02460) tested single-agent vs multi-agent systems with identical thinking-token budgets, and the multi-agent advantage disappeared.
The hidden variable
Every "multi-agent wins" benchmark you've read...
Oxford Economics published the receipt: AI-related job cuts accounted for only 4.5% of US layoffs in the first 11 months of 2025. Another 245,000 layoffs in the same period were driven by ordinary market and economic conditions, nearly four ti...
On April 9, 2026, Microsoft Research published the New Future of Work Report. Buried inside is a section titled "myths" — three of them, named directly:
1. Counting AI-generated lines of code is a meaningful productivity metric
2. Current tools will...
I spent 33 minutes running an experiment where I accepted 10 Claude Code diffs in a row without reading any of them. Test coverage went from 90% to 0%. The codebase literally stopped importing.
After that wake-up call, I built a constraint system th...
If you're a senior engineer running multiple AI agents and feeling wrecked at the end of every day — you're not slow, and you're not falling behind. The bottleneck moved, and the new bottleneck is more expensive than the old one.
Writing used to be ...
Most devs still use Claude like a search engine — type, read, copy, close.
But the 2026 Pragmatic Engineer survey shows 55% of engineers now use AI
agents, and senior engineers lead at 63.5%. Same Claude, different mode.
Just published a breakdown...
Google shipped Gemini CLI last month. Free tier, open source,
1,000 requests/day. No credit card required.
I compared it against Claude Code on real codebases for two weeks.
Here's what the data shows:
Where Gemini CLI wins:
1M token context win...
Vibe coding is how juniors ship bugs fast.
You describe a feature in natural language, the AI generates code,
you tweak until it works. Fast? Yes. Scalable? No.
At scale, vibe coding gives you:
500 lines of unreviewable code
Features you didn't ...