The numbers look great for CKG, but it feels like a bit of an unfair comparison vs. RAG and GraphRAG.
Did you try hybrid setups or strictly pure systems?
I benchmarked RAG vs GraphRAG vs pre-structured knowledge graphs across 45 domains — here's what happened
3 Comments
@[J.Bruni] Fair question — these are pure system comparisons, intentionally.
The benchmark is designed to isolate the variable: what does structure alone contribute, before you add any hybrid layer? If I tested CKG + RAG fallback against vanilla RAG, any gain could be attributed to the ensemble rather than the structure. Pure comparison is the only way to measure the signal cleanly.
Hybrid setups are a legitimate next step and probably the right production architecture for domains where the graph is incomplete. The benchmark gives you a baseline to know what you're trading away when you add retrieval back in — right now that tradeoff is 4× F1 and 11× tokens.
The methodology section of the paper covers this: https://github.com/Yarmoluk/ckg-benchmark/blob/main/paper/main.pdf
Happy to discuss specific hybrid configurations if you've tried them — curious whether retrieval fallback on low-confidence CKG traversals closes the gap on T1 (entity lookup), which is where CKG is weakest.
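To make the "retrieval fallback on low-confidence CKG traversals" idea concrete, here is a minimal sketch of one possible hybrid configuration. It is not the benchmark's code: the `Answer` type, the `answer_with_fallback` helper, the toy lookup functions, and the 0.5 threshold are all hypothetical, and it assumes the CKG traversal can report some confidence score in [0, 1].

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to lie in [0, 1]; the scoring method is hypothetical
    source: str        # "ckg" or "rag", so results can be attributed per path


def answer_with_fallback(
    question: str,
    ckg_lookup: Callable[[str], Answer],
    rag_lookup: Callable[[str], Answer],
    threshold: float = 0.5,  # arbitrary placeholder, not a tuned value
) -> Answer:
    """Try the graph traversal first; fall back to retrieval if confidence is low."""
    primary = ckg_lookup(question)
    if primary.confidence >= threshold:
        return primary
    return rag_lookup(question)


# Toy stand-ins so the sketch runs end to end; real systems would replace these.
def toy_ckg(question: str) -> Answer:
    known = {"capital of France": "Paris"}
    if question in known:
        return Answer(known[question], 0.9, "ckg")
    return Answer("", 0.0, "ckg")  # entity not in the graph: zero confidence


def toy_rag(question: str) -> Answer:
    return Answer(f"retrieved passage for: {question}", 0.4, "rag")


hit = answer_with_fallback("capital of France", toy_ckg, toy_rag)
miss = answer_with_fallback("capital of Atlantis", toy_ckg, toy_rag)
print(hit.source, miss.source)
```

Recording `source` on every answer keeps the attribution problem visible: in a hybrid run you can still report F1 and token cost separately for questions answered by the graph and questions that fell through to retrieval.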
More From Daniel Yarmoluk