Nextcloud ↔ Django (DRF) ↔ Redis-backed distributed system
Introduction
There’s a particular class of bugs that doesn’t live in your code, your database, or your API.
It lives in the space between them.
Today’s issue looked simple on the surface:
“Unable to parse farm schema response.”
But the root cause wasn’t in the UI, nor in the API.
It was in the cache.
The System Setup
The architecture is fairly typical for a modern distributed backend:
- Nextcloud App (UI + PHP backend)
- Django REST Framework (DRF) API (/api/v1/...)
- Redis (used for caching + token storage)
- NDVI service feeding farm-state data
Redis had recently replaced APCu as the caching backend to support shared worker state and token reuse.
The Symptom
The Nextcloud admin UI failed to load farm schema with:
“Unable to parse farm schema response”
This error implied one thing:
- The frontend expected valid JSON
- It received something else
The Misleading Signals
Initial checks showed everything was “correct”:
✅ Schema endpoint URL was valid
/apps/weather_apis/api/v1/admin/farms/schema
✅ Route existed and matched
occ router:match confirmed it
✅ Redis was healthy
- redis-cli ping → OK
- PHP connection as www-data → OK
✅ Backend logic was intact
Yet the UI still failed.
The Trap
At this point, most debugging paths would focus on:
- fixing the controller
- adjusting serializers
- checking permissions
But none of those were the problem.
The Reality
The issue was cache-induced inconsistency:
- Redis had stored a stale response
- The backend schema had evolved
- The cached payload no longer matched the expected structure
So the system behaved like this:
| Layer | State |
| --- | --- |
| Route | Correct |
| Backend | Correct |
| Cache | Outdated |
| UI | Broken |
The UI wasn’t wrong.
The backend wasn’t wrong.
The cache was lying.
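The failure mode can be reproduced in a few lines. This is a minimal sketch, assuming a read-through cache with no versioning; a plain dict stands in for Redis, and the key `farms:schema`, the field names `columns` and `fields`, and all function names are hypothetical, since the real payloads are not shown here.

```python
import json

# Hypothetical stand-in for Redis: a dict of key -> serialized string.
cache = {}

def backend_schema_v1():
    # Hypothetical old response shape.
    return {"columns": ["name", "area"]}

def backend_schema_v2():
    # Hypothetical new response shape after the backend evolved.
    return {"fields": [{"name": "name"}, {"name": "area"}]}

def get_schema(build):
    """Read-through cache: serve the cached copy if present, else build and store."""
    if "farms:schema" not in cache:
        cache["farms:schema"] = json.dumps(build())
    return json.loads(cache["farms:schema"])

def parse_for_ui(payload):
    """The consumer only understands the new shape."""
    if "fields" not in payload:
        raise ValueError("Unable to parse farm schema response")
    return [field["name"] for field in payload["fields"]]

# 1. The old backend populates the cache.
get_schema(backend_schema_v1)

# 2. The backend is upgraded, but the cache entry survives the deploy.
stale = get_schema(backend_schema_v2)

# 3. The consumer fails even though the new backend code is correct.
try:
    parse_for_ui(stale)
    outcome = "parsed"
except ValueError as exc:
    outcome = str(exc)
```

Every function here does what it was written to do; the failure only appears when the old cache entry meets the new consumer.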
The Fix
The resolution was simple:
- Clear Redis cache keys
- Reload the UI
Immediately:
- Schema endpoint returned valid JSON
- UI parsing succeeded
- Farms admin interface recovered
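Because Redis also holds tokens in this setup, a targeted delete is safer than flushing everything. The sketch below assumes a client that exposes `scan_iter(match=...)` and `delete(*keys)`, which is the shape of the redis-py client; the key names and the `farms:*` pattern are hypothetical, and a small in-memory class stands in for the server so the example runs on its own.

```python
import fnmatch

def clear_keys(client, pattern):
    """Delete every key matching `pattern` and return how many were removed.

    Iterating with SCAN avoids blocking the server the way KEYS can
    on a large keyspace.
    """
    removed = 0
    for key in list(client.scan_iter(match=pattern)):
        removed += client.delete(key)
    return removed


class InMemoryClient:
    """Tiny stand-in so the sketch runs without a Redis server."""

    def __init__(self, data):
        self.data = dict(data)

    def scan_iter(self, match="*"):
        return (k for k in list(self.data) if fnmatch.fnmatchcase(k, match))

    def delete(self, *keys):
        # Mirrors Redis DEL: returns the number of keys actually removed.
        return sum(1 for k in keys if self.data.pop(k, None) is not None)


client = InMemoryClient({
    "farms:schema": "{...stale...}",   # hypothetical cached schema
    "farms:state:42": "{...}",         # hypothetical cached farm state
    "tokens:abc": "secret",            # hypothetical token entry, must survive
})
removed = clear_keys(client, "farms:*")
```

Only the cached farm entries are removed; the token entry is left in place.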
The Deeper Insight
Caches don’t just store data — they store assumptions.
When your system changes:
- API shapes evolve
- serializers update
- authentication rules shift
But your cache continues serving old truths.
Why This Matters
This issue highlights a key principle in distributed systems:
A system can be locally correct but globally inconsistent.
Every component was working as designed.
But together, they produced failure.
Extending the System: Authentication Alignment
Alongside the fix, the farm state endpoint's authentication handling was improved, and tests were expanded to cover:
- API key access
- integration token flows
- proper rejection behavior (403 for unauthenticated)
Result:
- 7/7 tests passing
- Consistent and secure API surface
Practical Takeaways
1. Always Suspect Cache
If you see:
- incorrect response shapes
- parsing errors
- inconsistent behavior across layers
Check the cache first.
2. Introduce Cache Versioning
Avoid collisions between old and new schemas:
farms:schema:v1
farms:schema:v2
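One way to build such keys is to derive them from a version constant that is bumped whenever the response shape changes. This is a minimal sketch; `SCHEMA_VERSION` and `schema_cache_key` are hypothetical names, and a dict stands in for Redis.

```python
SCHEMA_VERSION = 2  # hypothetical constant, bumped on every shape change

def schema_cache_key(version=SCHEMA_VERSION):
    """Build a key such as 'farms:schema:v2'."""
    return f"farms:schema:v{version}"

# Entry left over from the previous release.
cache = {"farms:schema:v1": '{"columns": ["name"]}'}

# The new release looks up only its own key, so the stale v1 entry
# becomes a MISS instead of a wrongly shaped HIT.
hit = cache.get(schema_cache_key())
```

Old keys are no longer read, but they still occupy memory until they expire or are deleted, so versioning works best together with a TTL.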
3. Use TTLs for Dynamic Data
Cached responses from schema and metadata endpoints should not live forever.
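Redis supports expiry natively (for example through the `EX` option of `SET`). To show the behavior without a server, the sketch below uses a small in-memory cache with an injectable clock; the class name, the key, and the 300-second TTL are all arbitrary choices for illustration.

```python
import time

class TTLCache:
    """In-memory stand-in for a cache with per-entry expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (self._clock() + ttl_seconds, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: behave like a MISS
            return None
        return value


# A fake clock makes the expiry visible without sleeping.
now = [0.0]
cache = TTLCache(clock=lambda: now[0])
cache.set("farms:schema:v2", '{"fields": []}', ttl_seconds=300)

fresh = cache.get("farms:schema:v2")    # still within the TTL
now[0] = 301.0
expired = cache.get("farms:schema:v2")  # past the TTL, treated as a MISS
```

With a TTL in place, a stale entry heals itself after a bounded delay even if nobody remembers to clear it.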
4. Add Observability
Track:
- cache HIT / MISS
- response origin (cache vs source)
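Both signals can come from a single wrapper around the cache lookup. The sketch below is one possible shape, assuming a dict-like cache; the logger name, the `stats` counter, and `cached_fetch` are hypothetical.

```python
import logging

logger = logging.getLogger("farms.cache")  # hypothetical logger name
stats = {"HIT": 0, "MISS": 0}

def cached_fetch(cache, key, load):
    """Return (value, origin), where origin is 'cache' or 'source'."""
    if key in cache:
        stats["HIT"] += 1
        origin = "cache"
        value = cache[key]
    else:
        stats["MISS"] += 1
        origin = "source"
        value = cache[key] = load()
    logger.info("cache lookup key=%s origin=%s", key, origin)
    return value, origin

cache = {}
first = cached_fetch(cache, "farms:schema:v2", lambda: {"fields": []})
second = cached_fetch(cache, "farms:schema:v2", lambda: {"fields": []})
```

Returning the origin alongside the value lets the caller surface it, for instance in a response header, so a stale HIT is visible from the client side.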
5. Design Defensive APIs
Ensure endpoints always return valid JSON, even in failure scenarios.
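A framework-agnostic sketch of the idea: wrap the handler so that any exception is converted into a JSON error body instead of an HTML error page or an empty response. The decorator name, the handler, and the error fields are hypothetical; in DRF a custom exception handler would play a similar role.

```python
import json

def json_safe(handler):
    """Wrap a handler so the caller always receives (status, valid JSON text)."""
    def wrapper(*args, **kwargs):
        try:
            return 200, json.dumps(handler(*args, **kwargs))
        except Exception as exc:  # broad on purpose: last line of defence
            body = {"error": type(exc).__name__, "detail": str(exc)}
            return 500, json.dumps(body)
    return wrapper

@json_safe
def farm_schema(broken=False):
    # Hypothetical handler; `broken` simulates an internal failure.
    if broken:
        raise RuntimeError("schema backend unavailable")
    return {"fields": []}

ok_status, ok_body = farm_schema()
err_status, err_body = farm_schema(broken=True)
```

In both cases the body parses as JSON, so the client can show a meaningful error rather than a generic parse failure.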
Final Thought
This wasn’t a UI bug.
It wasn’t an API bug.
It wasn’t even a Redis bug.
It was a system truth mismatch.
And those are the bugs that matter most—because solving them means you’re no longer just writing code.
You’re understanding the system.
“In distributed systems, the hardest bugs aren’t where things break.
They’re where things appear to work.”