Your RAG isn't broken. Your chunks are.

Everyone wants to blame the model. "GPT-4 is hallucinating." "Claude isn't following instructions." "The embeddings are bad."

Nine times out of ten, it's the chunks.

I've been auditing RAG pipelines lately and the same patterns keep showing up:

Fixed-size chunking that splits sentences in half. The model gets a fragment that starts with "...and therefore the policy applies only when" and you wonder why the answer is confidently wrong.

No overlap between chunks. Context that spans a paragraph boundary is gone. The model sees the question but not the answer, because the answer lives in the chunk after the one that matched.

Chunking before cleaning. Headers, nav menus, footer junk, and "Click here to subscribe" all get embedded as if they're content. Then they show up in retrieval and pollute the context window.

Embedding the wrong thing. People embed the raw chunk but the user query is phrased completely differently. No wonder cosine similarity is low: they're not semantically close, they just happen to be about the same topic.

Before you blame the model, look at what you're actually retrieving. Log the top 5 chunks for 10 real user queries and read them like a human. If you can't answer the question from those chunks, the model can't either.

The model is a mirror. It reflects the quality of what you feed it.
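If you want something concrete to start from, here is a rough sketch of the first three fixes plus the audit step, in plain Python with no dependencies. Everything in it is an assumption made for illustration: the `BOILERPLATE` patterns, the naive regex sentence splitter, the `max_chars` and `overlap` defaults, and the `retrieve(query, k)` callable that stands in for whatever your vector store exposes. Treat it as a starting point, not a reference implementation.

```python
import re
from typing import Callable, List

# Hypothetical junk patterns; a real pipeline would tune these per source.
BOILERPLATE = re.compile(r"click here to subscribe|all rights reserved", re.IGNORECASE)


def clean(text: str) -> str:
    """Drop empty lines and lines matching known junk BEFORE chunking."""
    kept = [ln.strip() for ln in text.splitlines()
            if ln.strip() and not BOILERPLATE.search(ln)]
    return " ".join(kept)


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_sentences(text: str, max_chars: int = 500, overlap: int = 1) -> List[str]:
    """Pack whole sentences into chunks and repeat the last `overlap`
    sentences at the start of the next chunk.

    A chunk can exceed max_chars when a single sentence (plus the carried
    overlap) is longer than the limit; sentences are never cut in half.
    """
    chunks: List[str] = []
    current: List[str] = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry the tail of the finished chunk into the next one.
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


def audit(queries: List[str],
          retrieve: Callable[[str, int], List[str]],
          k: int = 5) -> str:
    """Build a human-readable report of the top-k chunks for each query.

    `retrieve` is a placeholder for your own retrieval function.
    """
    lines: List[str] = []
    for query in queries:
        lines.append(f"QUERY: {query}")
        for rank, chunk in enumerate(retrieve(query, k), start=1):
            lines.append(f"  [{rank}] {chunk}")
    return "\n".join(lines)
```

Run `clean` first, then `chunk_sentences`, then pass your real retrieval function to `audit` with ten real user queries and read the report yourself. The overlap here is counted in sentences rather than characters so that the carried context is always a complete thought.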
