LLMs don't throw exceptions when they hallucinate. They return 200 OK with a confident wrong answer.
That's the problem I set out to fix. Here's how I built Failure Intelligence Engine (FIE) — a real-time observability layer that detects, classifies, and explains LLM failures before your users notice them.
Why Standard Monitoring Isn't Enough
Latency dashboards, error rates, uptime checks — none of these tell you if your model just hallucinated a drug interaction or contradicted itself three times in a row.
LLMs fail silently. You need a layer that understands what the model said, not just whether it responded.
The 4-Stage Pipeline
Your App → [Prompt Guard] → [Signal Extraction] → [DiagnosticJury] → Dashboard
Stack: FastAPI + LangGraph + MongoDB + Groq + React
Stage 1 — Prompt Guard
Nine detection layers (regex and pattern matching, zero LLM calls) screen every incoming prompt for injection attacks, jailbreaks, and role manipulation. A score ≥ 0.75 triggers an immediate block, and the checks are fast enough to run on every request.
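To make that concrete, here's a minimal sketch of what one scoring layer could look like. The patterns, weights, and max-score aggregation are my own illustration, not FIE's actual rules; only the 0.75 threshold comes from the pipeline above.

```python
import re

# Illustrative patterns and weights; FIE's nine real layers aren't shown here.
INJECTION_PATTERNS = [
    (re.compile(r"ignore (all|previous|prior) instructions", re.I), 0.9),
    (re.compile(r"you are now (in )?(DAN|developer) mode", re.I), 0.85),
    (re.compile(r"reveal your system prompt", re.I), 0.8),
    (re.compile(r"pretend to be", re.I), 0.5),
]

BLOCK_THRESHOLD = 0.75  # score >= 0.75 means immediate block

def guard_score(prompt: str) -> float:
    """Risk score = highest weight among matched patterns (no LLM calls)."""
    return max((w for pat, w in INJECTION_PATTERNS if pat.search(prompt)), default=0.0)

def should_block(prompt: str) -> bool:
    return guard_score(prompt) >= BLOCK_THRESHOLD
```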
Stage 2 — Failure Signal Vector
For prompts that pass the guard, FIE samples the model multiple times and builds a signal vector:
```python
failure_signal = {
    "entropy": 0.82,          # how random are the outputs?
    "agreement_score": 0.31,  # do outputs semantically agree?
    "high_failure_risk": True,
}
```
High entropy + low agreement = the model is guessing.
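As a back-of-the-envelope version, the two metrics could be computed like this. Assumptions: normalized Shannon entropy over the distinct-output distribution, and lexical similarity as a cheap stand-in for the semantic agreement FIE actually uses; the 0.5 risk cutoff is my guess, while the 0.75 entropy threshold comes from the archetype table below.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations
from math import log2

def entropy(outputs: list[str]) -> float:
    """Normalized Shannon entropy: 0 = all samples identical, 1 = all distinct."""
    n = len(outputs)
    if n < 2:
        return 0.0
    counts = Counter(outputs)
    h = -sum((c / n) * log2(c / n) for c in counts.values())
    return h / log2(n)

def agreement(outputs: list[str]) -> float:
    """Mean pairwise string similarity (a lexical stand-in for semantic agreement)."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

samples = ["Paris", "Paris", "Lyon"]
e, a = entropy(samples), agreement(samples)
failure_signal = {
    "entropy": e,
    "agreement_score": a,
    "high_failure_risk": e >= 0.75 or a < 0.5,  # the 0.5 cutoff is my guess
}
```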
Stage 3 — Archetype Classification
Every inference gets a human-readable failure label (a classifier sketch follows the list):
- STABLE — Confident and consistent
- HALLUCINATION_RISK — Entropy ≥ 0.75
- UNSTABLE_OUTPUT — Outputs contradict each other
- LOW_CONFIDENCE — Model hedges consistently
- MODEL_BLIND_SPOT — Adversarial prompt blocked
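A minimal classifier over the signal vector could be a simple cascade. Only the 0.75 entropy threshold comes from the table above; the agreement cutoff and the hedging_rate field are hypothetical stand-ins.

```python
def classify(signal: dict, blocked: bool) -> str:
    """Map a failure-signal vector to an archetype label (illustrative thresholds)."""
    if blocked:
        return "MODEL_BLIND_SPOT"        # adversarial prompt caught by the guard
    if signal["entropy"] >= 0.75:
        return "HALLUCINATION_RISK"
    if signal["agreement_score"] < 0.5:  # hypothetical cutoff
        return "UNSTABLE_OUTPUT"
    if signal.get("hedging_rate", 0.0) > 0.5:  # hypothetical field
        return "LOW_CONFIDENCE"
    return "STABLE"
```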
Stage 4 — DiagnosticJury
Three specialized agents deliberate on every flagged failure:
- AdversarialSpecialist — was this a prompt attack?
- LinguisticAuditor — any factual gaps or contradictions?
- KnowledgeAuditor — does the output contradict known facts?
Their verdicts aggregate into a jury_confidence score and a plain-English failure_summary.
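The aggregation rule isn't spelled out here, so treat this as one plausible shape: average the confidence of agents that flag a failure and join their rationales into the summary. The Verdict type and field names are mine.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    agent: str         # e.g. "AdversarialSpecialist"
    is_failure: bool
    confidence: float  # 0.0 to 1.0
    rationale: str

def aggregate(verdicts: list[Verdict]) -> dict:
    """Combine agent verdicts into jury_confidence and failure_summary (simple-average sketch)."""
    flagged = [v for v in verdicts if v.is_failure]
    confidence = sum(v.confidence for v in flagged) / len(verdicts) if verdicts else 0.0
    summary = "; ".join(f"{v.agent}: {v.rationale}" for v in flagged) or "No failure detected"
    return {"jury_confidence": confidence, "failure_summary": summary}
```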
Wired Together with LangGraph
```python
# Attacks route to the block node; clean prompts continue down the pipeline.
g.add_conditional_edges(
    "prompt_guard",
    route_after_guard,
    {"block": "block", "signal_extract": "signal_extract"},
)
g.add_edge("signal_extract", "jury_deliberate")
```
Blocked prompts short-circuit to END instantly. Safe prompts run the full pipeline. Typed state makes every node's contract explicit and the whole thing debuggable.
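For anyone who wants to run this shape end to end, here's a self-contained sketch. The LangGraph calls (StateGraph, add_node, add_conditional_edges, compile) are the real API; the state fields and node bodies are placeholders I invented, not FIE's implementations.

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END

class PipelineState(TypedDict, total=False):
    prompt: str
    blocked: bool
    failure_signal: dict
    verdict: dict

def prompt_guard(state: PipelineState) -> dict:
    # Stand-in for the real nine-layer guard.
    return {"blocked": "ignore previous instructions" in state["prompt"].lower()}

def route_after_guard(state: PipelineState) -> str:
    return "block" if state["blocked"] else "signal_extract"

g = StateGraph(PipelineState)
g.add_node("prompt_guard", prompt_guard)
g.add_node("block", lambda s: {"verdict": {"failure_summary": "Blocked by guard"}})
g.add_node("signal_extract", lambda s: {"failure_signal": {"entropy": 0.0}})   # placeholder
g.add_node("jury_deliberate", lambda s: {"verdict": {"jury_confidence": 0.0}})  # placeholder
g.set_entry_point("prompt_guard")
g.add_conditional_edges(
    "prompt_guard", route_after_guard,
    {"block": "block", "signal_extract": "signal_extract"},
)
g.add_edge("signal_extract", "jury_deliberate")
g.add_edge("block", END)
g.add_edge("jury_deliberate", END)
app = g.compile()
```

Calling app.invoke({"prompt": "What is the capital of France?"}) then returns the final typed state for whichever path the router chose.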
Monitor Any LLM in 3 Lines
```python
from fie_sdk import FIEClient

client = FIEClient(api_key="fie-your-key")
result = client.monitor(
    prompt="What is the capital of France?",
    model_outputs=["Paris", "Paris", "Lyon"],
)
print(result.archetype)        # UNSTABLE_OUTPUT
print(result.failure_summary)  # "Inconsistent outputs — 2/3 agree"
```
What I Learned
- Watch inputs, not just outputs — the guard layer catches most failures for free
- Single metrics lie, vectors don't — entropy alone is noisy; combine it with agreement and drift
- Multi-agent deliberation catches edge cases — jury disagreement is itself a signal
GitHub: github.com/AyushSingh110/Failure_Intelligence_System
Live: failure-intelligence-system.pages.dev
Install: pip install fie-sdk
Ayush Singh — building open-source LLM infrastructure. GitHub · dev.to