Love the local-first approach. Have you noticed users caring more about privacy or performance?
I Built a Local-First AI Desktop Knowledge Base — Here's What I Learned
16 Comments
@[Next Big Creative] Great question — honestly, it's been both, but for different user profiles. The privacy-first crowd comes in knowing exactly what they want: zero cloud, no telemetry, full control. They're usually handling legal docs, research notes, or internal company data. Performance is secondary for them — they'll wait 2 seconds for a query if they know the data never left their machine.
The performance-first users discover the privacy benefits after the fact. They come for the ~1ms FTS5 latency or the offline Ollama setup, and then realise cloud RAG tools were slowing them down AND sending their data somewhere. That's actually been the more interesting conversion — people who didn't start out caring about privacy, and now do.
So the short answer: privacy brings them in, performance makes them stay.
Gunjan, this is a phenomenal write-up and an incredible masterclass in local-first systems architecture.
What I love most about Knovex is that it completely rejects the 'Digital Attic' trap. Most desktop AI tools just blindly dump messy, uncurated markdown and PDF snippets into an embeddings base and pray that semantic search can figure it out at runtime. All that does is saddle the user with a massive, recurring Prose Tax—burning local compute and ballooning context windows just to re-explain structural history the system should already know.
Your 6-stage normalization pipeline in docnest-ai is the exact right antidote. By investing in structure, section assignment, and table normalization at the ingestion boundary, you've built a true Forensic Ingestor. Pre-paying that precision so that L0 and L1 can resolve 70% of queries at zero token cost is pure engineering maturity.
Did you hit any specific edge cases when normalizing highly irregular table structures into that clean JSON schema before embedding them? That's usually where the deterministic layer gets tested the hardest.
Ken, this genuinely made my week. You've articulated the thesis better than I did — "knowing exactly when not to call an LLM" is the whole bet. The Observer's Tax framing came straight out of watching token bills balloon on queries that were really just "sum this column."
Since you're clearly tuned into this: I split the deterministic extraction layer out as its own open-source engine — DocNest (https://github.com/tailorgunjan93/docnest). Knovex is the desktop app on top, but I pulled the engine apart on purpose so the table-extraction / §-section / zero-token factual path could be reused and inspected independently. Would genuinely value your eyes on where the deterministic/LLM boundary should sit — you're exactly the person I'd want poking holes in it.
@[Gunjan Tailor] huge congratulations on pulling DocNest out as its own open-source engine! That is a massive contribution to the local-first community. Decoupling the deterministic extraction layer from the UI application layer is exactly how we move past monolithic, opaque AI tooling and start building verifiable data pipelines.
I would absolutely love to take a deep dive into the repository and poke at the architecture.
Regarding your question on where that deterministic/LLM boundary should sit, my immediate instinct leans toward a strict Gated Egress model. If we treat the deterministic layer as the absolute authority, the boundary should be defined by three clean criteria:
1. Factual and Aggregative Primacy: If a query can be resolved by a structured relational lookup, an AST structural path, or a mathematical operation (like your table column summation), the LLM should never be invoked. The deterministic code layer executes, returns the clean snippet, and completely bypasses the model. This is your zero-token fast path.
2. The LLM as an Ephemeral Narrator: The only time the boundary shifts to the probabilistic layer is when the user explicitly requests synthesis, semantic cross-referencing, or natural language translation of the data. Even then, the LLM is only handed the evidence bundle that your deterministic code layer has already extracted, validated, and frozen.
3. The Lineage Gate: The moment data crosses from your deterministic engine over to the LLM narrator, a non-repudiable audit trace should be generated. The engine should bind the raw data hash, the extraction metadata, and the exact slice passed to the model into a single receipt so that downstream drift can always be debugged.
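To make the Lineage Gate idea concrete, here is a minimal sketch of what such a receipt could look like. Everything in it is an assumption for illustration: `LineageReceipt`, `make_receipt`, `verify_receipt`, and the metadata keys are hypothetical names, not anything from DocNest. Note also that a bare hash only gives tamper evidence; a genuinely non-repudiable trace would additionally need a signature over the receipt.

```python
import hashlib
import json
from dataclasses import dataclass


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class LineageReceipt:
    source_hash: str       # hash of the raw source document bytes
    extraction_meta: dict  # hypothetical metadata (section ids, extractor name, ...)
    slice_hash: str        # hash of the exact text handed to the model
    receipt_id: str        # hash binding the three fields above together


def make_receipt(raw: bytes, extraction_meta: dict, evidence_slice: str) -> LineageReceipt:
    source_hash = sha256_hex(raw)
    slice_hash = sha256_hex(evidence_slice.encode("utf-8"))
    # Canonical JSON (sorted keys, fixed separators) so identical inputs
    # always serialize to identical bytes and therefore the same receipt_id.
    payload = json.dumps(
        {
            "source_hash": source_hash,
            "extraction_meta": extraction_meta,
            "slice_hash": slice_hash,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    receipt_id = sha256_hex(payload.encode("utf-8"))
    return LineageReceipt(source_hash, extraction_meta, slice_hash, receipt_id)


def verify_receipt(receipt: LineageReceipt, raw: bytes, evidence_slice: str) -> bool:
    """Recompute the receipt from the claimed inputs and compare field by field."""
    return make_receipt(raw, receipt.extraction_meta, evidence_slice) == receipt
```

When debugging drift later, `verify_receipt` tells you whether the source bytes or the slice passed to the model differ from what was recorded at the moment of hand-off.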
I'm going to pull down the docnest repo tonight and look at how you're handling the structural extraction states under the hood. You've built an incredible foundation here. Let's map out exactly where the code ends and the narrator begins.
@[Ken W. Alger] The Gated Egress framing maps almost exactly onto what I landed on. L0 (FTS5 keyword) and L1 (precomputed table extraction) are your Factual Primacy layer: if either resolves the query above the confidence threshold, the LLM never fires. L2 (ANN + LLM) only triggers when L0 and L1 both come back below the threshold or the query is clearly synthesizing across sections. The Lineage Gate piece is where I'd love your eyes most: each answer currently carries its section_id chain, but I haven't formalized a confidence gate on that leg yet. Open an issue on the DocNest repo with your three criteria and let's stress-test the boundary together.
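For anyone following along, the tiered routing described above can be sketched roughly like this. It is an illustration under stated assumptions, not DocNest's actual code: the `Answer` type, the `route` function, the 0.8 threshold, and the three stand-in tiers are all hypothetical, and the "query is clearly synthesizing across sections" trigger is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Answer:
    text: str
    confidence: float       # 0.0 to 1.0, as scored by the tier itself
    tier: str               # which tier produced the answer
    section_ids: List[str]  # provenance chain carried with the answer


# A tier takes the query and returns an Answer, or None if it cannot answer.
Tier = Callable[[str], Optional[Answer]]


def route(query: str, deterministic_tiers: List[Tier], llm_tier: Tier,
          threshold: float = 0.8) -> Optional[Answer]:
    """Try deterministic tiers in order; fall back to the LLM tier only if none clears the threshold."""
    for tier in deterministic_tiers:
        answer = tier(query)
        if answer is not None and answer.confidence >= threshold:
            return answer  # zero-token path: the model is never invoked
    return llm_tier(query)


# --- Hypothetical stand-in tiers, for illustration only ---
TABLE = {"revenue": [10.0, 20.0, 12.5]}  # pretend this was extracted at ingestion


def l0_keyword(query: str) -> Optional[Answer]:
    # Stand-in for a keyword index lookup (FTS5 in the thread above).
    notes = {"ollama": ("Ollama runs models locally.", "sec-2")}
    for term, (text, section) in notes.items():
        if term in query.lower():
            return Answer(text, 0.9, "L0", [section])
    return None


def l1_table(query: str) -> Optional[Answer]:
    # Stand-in for precomputed table extraction: "sum <column>" answered in plain code.
    words = query.lower().split()
    if "sum" in words:
        for column, values in TABLE.items():
            if column in words:
                return Answer(str(sum(values)), 1.0, "L1", ["table-1"])
    return None


def l2_llm(query: str) -> Optional[Answer]:
    # Stand-in for ANN retrieval plus a model call; no real model is invoked here.
    return Answer("(synthesized answer)", 0.5, "L2", [])
```

With these stand-ins, `route("sum revenue", [l0_keyword, l1_table], l2_llm)` is answered by the L1 tier without touching `l2_llm`, while a query neither deterministic tier recognizes falls through to L2.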
Thanks Hussein — that's exactly the bar I'm holding it to. "Privacy-preserving" only counts if it survives real use, so the rule is simple: nothing leaves the machine unless you flip a switch, and even then you see exactly what's sent. The hard part isn't the promise, it's keeping it true as features grow — but that constraint is what keeps the design honest. More to come.
- © 2026 Coder Legion