What I Learned Building a GDPR Compliance Tool (And What I'd Do Differently)

Jun 8 • 5 min read

A few months ago I had an idea that seemed simple enough. Build a tool that scans a website, checks it against the key GDPR requirements, and spits out a report. How hard could it be?

Turns out, pretty hard. Not because the individual pieces were complicated, but because every decision you make early on has a way of coming back to bite you later. This is an honest account of how ClearlyCompliant came together, the wrong turns I took, and what I'd change if I started over today.

The Idea

I kept noticing the same thing on small business websites. Cookie banners that didn't actually block cookies. Contact forms with no privacy policy link. Privacy policies copied from a template that mentioned GDPR once and said nothing useful.

The businesses weren't ignoring compliance out of malice. They just had no easy way to know what they were getting wrong. Enterprise compliance tools cost hundreds per month. Legal consultants cost more. Most small businesses either crossed their fingers or paid someone to tell them they were probably fine.

There was a gap there. A simple, affordable scan that told you specifically what was wrong and why it mattered. One-off fee, no subscription, no jargon.

So I built it.

The First Mistake: Overcomplicating the Architecture

My first instinct was to do this properly. Task queue, worker processes, the works. I spent the better part of a week getting Celery and Redis set up before I stopped and asked myself whether I actually needed any of it.

The answer was no. For the scale I was targeting, Python's built-in threading module was completely sufficient. One less service to run, one less thing to monitor, one less thing to break at 2am.
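As a rough illustration of how little is needed, here is a minimal sketch of running a scan in a background thread. The names run_scan, start_scan, and the in-memory job store are hypothetical and not taken from the ClearlyCompliant code; the scan itself is a placeholder.

```python
import threading
import uuid

# Hypothetical in-memory job store: job_id -> {"status": ..., "result": ...}
_jobs = {}
_jobs_lock = threading.Lock()


def run_scan(url):
    """Placeholder for the real scan; returns a dummy result for illustration."""
    return {"url": url, "checks_run": 0}


def _worker(job_id, url):
    # Run the scan off the request thread and record the outcome.
    try:
        update = {"status": "done", "result": run_scan(url)}
    except Exception as exc:
        # Record the failure so the thread does not die silently.
        update = {"status": "failed", "error": str(exc)}
    with _jobs_lock:
        _jobs[job_id].update(update)


def start_scan(url):
    """Start a scan in a background thread and return (job_id, thread)."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        _jobs[job_id] = {"status": "running"}
    thread = threading.Thread(target=_worker, args=(job_id, url), daemon=True)
    thread.start()
    return job_id, thread


def get_job(job_id):
    """Return a copy of the job record so callers cannot mutate the store."""
    with _jobs_lock:
        return dict(_jobs[job_id])
```

The trade-off in a sketch like this is that jobs live in process memory, so a restart loses anything in flight. That is the kind of specific problem that would justify reaching for a real queue later.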

The lesson I keep relearning: start with the simplest thing that works and add complexity only when you have a specific problem that requires it. Celery is a great tool. I didn't need it.

The Second Mistake: Underestimating PDF Generation

I assumed PDF generation would be the easy part. Grab WeasyPrint, write some HTML and CSS, done.

WeasyPrint is genuinely great at producing well-designed PDFs from HTML. The problem is that it relies on native libraries (Pango, which on Windows typically means installing a GTK runtime), and that made my Windows development environment a nightmare. Hours of debugging later, I still couldn't get it working reliably.

I switched to ReportLab. It's lower level: you build the document programmatically rather than styling HTML, and the learning curve is steeper. But it's pure Python, it installed in seconds, and it works the same everywhere. The reports look professional and I have precise control over every element.

If I started over I'd go straight to ReportLab. WeasyPrint would have been fine on Linux, but I wasn't developing on Linux, and fighting your tooling is a tax on every hour you spend building.

The Part That Actually Worked: Using AI for Policy Analysis

Most of the GDPR checks are deterministic. Does a cookie banner exist? Is HTTPS enforced? Are there security headers? You're looking for specific things and either they're there or they aren't.
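To make "deterministic" concrete, here is a small sketch of two such checks working on data you already have after fetching the page. The list of expected headers and the function names are assumptions for illustration; the article does not say which headers the tool looks for.

```python
# Headers this sketch looks for; the real tool's list may differ (assumption).
EXPECTED_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
    "Referrer-Policy",
)


def check_security_headers(response_headers):
    """Return {header_name: True/False} for each expected header.

    HTTP header names are case-insensitive, so compare in lowercase.
    """
    present = {name.lower() for name in response_headers}
    return {name: name.lower() in present for name in EXPECTED_HEADERS}


def enforces_https(final_url):
    """True if the URL reached after following redirects uses https."""
    return final_url.lower().startswith("https://")
```

Each check is a pure function over the response, so the result is either there or it isn't, and it is easy to unit test without touching the network.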

Privacy policy analysis is different. A privacy policy is a natural language document and the question isn't just whether one exists but whether it actually covers what it's supposed to cover. Does it mention retention periods? Does it explain the lawful basis for processing? Does it tell users how to make a complaint to the ICO (the UK's Information Commissioner's Office)?

You can't answer those questions with a regex.

I used the Claude API (Haiku model) to analyse the policy content against a structured prompt listing the required GDPR elements. It evaluates each one and returns a PRESENT, PARTIAL, or MISSING status with a one-sentence explanation. The results feed directly into the PDF report alongside the deterministic checks.
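The prompt-and-parse side of this can be sketched without the API call itself. Everything below is an assumption made for illustration: the element list is a three-item subset, the pipe-separated line format is one plausible way to keep output parseable, and the function names are hypothetical. The actual prompt and output format are not given here.

```python
import re

# Hypothetical subset of required elements; the real prompt's list is not shown in the article.
REQUIRED_ELEMENTS = [
    "Retention periods",
    "Lawful basis for processing",
    "Right to complain to the ICO",
]

VALID_STATUSES = {"PRESENT", "PARTIAL", "MISSING"}

# Assumed line format: "<element> | <STATUS> | <one-sentence explanation>"
LINE_RE = re.compile(r"^\s*(.+?)\s*\|\s*([A-Za-z]+)\s*\|\s*(.+?)\s*$")


def build_prompt(policy_text):
    """Build a structured prompt asking for one fixed-format line per element."""
    elements = "\n".join(f"- {name}" for name in REQUIRED_ELEMENTS)
    return (
        "For each element below, state whether the privacy policy covers it.\n"
        "Answer with one line per element, formatted exactly as:\n"
        "<element> | PRESENT, PARTIAL or MISSING | <one-sentence explanation>\n\n"
        f"Elements:\n{elements}\n\nPolicy:\n{policy_text}"
    )


def parse_response(text):
    """Parse the model's reply into {element: (status, explanation)}."""
    findings = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # ignore any chatter outside the expected format
        element, status, explanation = match.groups()
        status = status.upper()
        if status in VALID_STATUSES:
            findings[element] = (status, explanation)
    # Elements the model skipped are marked UNKNOWN rather than guessed.
    for name in REQUIRED_ELEMENTS:
        findings.setdefault(name, ("UNKNOWN", "No parseable answer returned."))
    return findings
```

Marking skipped elements as UNKNOWN instead of MISSING keeps a parsing failure from showing up in the report as a compliance failure.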

This was the part of the build I was least sure about going in and it ended up being one of the cleanest pieces of the whole system. The AI handles the ambiguity well and the structured prompt keeps the output consistent enough to parse reliably.

The Part I Got Wrong: Scope Creep Avoidance

Early on I decided the MVP would not include remediation guidance. The report would tell you what was wrong but not how to fix it. Ship first, add that later.

That was the right call for getting to market. But "later" has a way of becoming "never" when you're already moving on to the next thing. The most consistent piece of feedback from early users has been that they want to know how to fix the issues the report flags, not just what they are.

It's on the roadmap. But if I had my time again I'd have built at least a basic version of it into the initial release. The guidance doesn't need to be comprehensive. Even a short paragraph per finding explaining what action to take would have meaningfully improved the product from day one.

The Part Nobody Tells You About: Finding the Privacy Policy URL

This sounds trivial. It wasn't.

The first version asked users to paste in their privacy policy URL manually. Most didn't know it offhand and had to go find it, and some just left the form. Friction kills conversions.

So I built auto-detection. Crawl the page, look for links matching common privacy policy patterns in the href or link text, fall back to trying common paths like /privacy-policy or /privacy if nothing matches.
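A minimal sketch of that detection logic using only the standard library might look like this. The regex, the class and function names, and the decision to return the first match are assumptions; only the two fallback paths come from the description above. Fetching the page and probing the fallback URLs are left out.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed pattern; the fallback paths are the two named in the article.
PRIVACY_RE = re.compile(r"privacy|data[-_ ]protection", re.IGNORECASE)
FALLBACK_PATHS = ("/privacy-policy", "/privacy")


class _LinkCollector(HTMLParser):
    """Collects (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an anchor that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def find_privacy_link(html, base_url):
    """Return the first link whose href or text looks like a privacy policy, else None."""
    collector = _LinkCollector()
    collector.feed(html)
    for href, text in collector.links:
        if PRIVACY_RE.search(href) or PRIVACY_RE.search(text):
            return urljoin(base_url, href)
    return None


def fallback_candidates(base_url):
    """URLs to probe when no link matched."""
    return [urljoin(base_url, path) for path in FALLBACK_PATHS]
```

Because urljoin leaves absolute hrefs untouched, links to a subdomain or a third-party host come back as-is. A footer rendered by JavaScript will not appear in the raw HTML at all, so a parser like this cannot see it.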

It works well for maybe 85% of sites. The edge cases are genuinely weird. Sites that host their privacy policy on a subdomain. Sites that use JavaScript to render the footer where the link lives. Sites that link to a third-party policy hosted on a completely different domain.

I keep improving it. But it was a reminder that "obvious" features often have long tails of edge cases that eat time you didn't budget for.

On Pricing

I went back and forth on this. Monthly subscription felt wrong for a compliance check that you might only need once or twice a year. Freemium felt like it would attract users who'd never convert. I settled on a one-off payment of £29.99.

It's early days but the conversion rate has been reasonable and support overhead is low because there are no ongoing billing issues to deal with. One-off pricing also removes the hesitation that comes with committing to yet another recurring charge.

I think it was the right call for this product. I might add a bulk option for agencies who want to run scans across multiple client sites, but the core model stays one-off.

What I'd Do Differently

Start with ReportLab, not WeasyPrint. Don't fight your tooling.

Build with threading first, not Celery. Add infrastructure when you have a specific reason to, not because it's the "proper" way.

Include at least basic remediation guidance from day one. Users want to know what to do, not just what's wrong.

Spend more time on the privacy policy auto-detection before launch. The edge cases are frustrating enough that they show up in feedback more than any other issue.

Write the content marketing pieces earlier. The product was live for a couple of weeks before I started writing about it. Every week without content is a week without potential organic traffic.

Where It Is Now

ClearlyCompliant is live at clearlycompliant.co.uk. It runs 23 GDPR checks across cookie consent, privacy policy content, forms, security headers, and third-party scripts, analyses the policy with AI, and delivers a PDF report by email. The whole scan takes a few minutes.

If you're building something in a regulated space I'd be interested to hear how you've handled the compliance side of it. Drop a comment below.

Joe Seabrook
