The Problem
Most language models accidentally memorise private data, even if it appears only once in their training data.
A phone number buried in a blog post, a home address mentioned in a forum, a unique email ID in a dataset… once a model sees it, it can quietly store it.
And no amount of fine-tuning can reliably “erase” it later.
The Solution
Google introduces VaultGemma, the first open-weights language model designed from day one not to memorise personal information.
What did they do differently?
Instead of applying differential privacy only during fine-tuning (by which point sensitive data may already be memorised), the team trained the entire 1B-parameter model from scratch with differential privacy - see the sketch after the list below.
That means:
- Every training example has a strictly limited impact
- Unique examples become statistically invisible
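To make those two bullets concrete, here is a minimal sketch of the clip-and-noise recipe behind differentially private training (DP-SGD), written in plain NumPy against a toy linear model. The dataset, clip_norm, noise_multiplier, and learning rate are all invented for illustration; VaultGemma's real training stack is vastly larger, but the core idea is the same: clip each example's gradient, then add calibrated noise.

```python
import numpy as np

# Toy DP-SGD illustration, not VaultGemma's actual training code.
rng = np.random.default_rng(0)

# Tiny synthetic dataset: y = 2*x + noise, fit with a 1-parameter linear model.
x = rng.normal(size=(256, 1))
y = 2.0 * x[:, 0] + 0.1 * rng.normal(size=256)
w = np.zeros(1)

clip_norm = 1.0          # hard cap on each example's gradient norm
noise_multiplier = 1.1   # scales the Gaussian noise added each step
lr = 0.1
batch_size = 32

for step in range(200):
    idx = rng.choice(len(x), size=batch_size, replace=False)
    xb, yb = x[idx], y[idx]

    # Per-example gradients of squared error: 2 * (w.x_i - y_i) * x_i
    residuals = xb @ w - yb
    per_example_grads = 2.0 * residuals[:, None] * xb

    # 1) Clip each example's gradient so no single example can move the
    #    model by more than clip_norm ("strictly limited impact").
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / (norms + 1e-12))

    # 2) Add Gaussian noise calibrated to the clip norm, so the summed
    #    update reveals almost nothing about any one example.
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=w.shape)
    noisy_grad = (clipped.sum(axis=0) + noise) / batch_size

    w -= lr * noisy_grad

print("learned weight:", w)  # should land close to 2.0 despite the noise
```

Because the noise is calibrated to the clip norm, adding or removing any single training example barely changes the distribution of updates, which is exactly the property the bullets above describe.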
The result?
Across 1 million memorisation tests, VaultGemma reproduced zero training examples verbatim, while comparable models trained without these guarantees did.
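How might such a test look in practice? Here is a hedged sketch (not Google's published evaluation code): feed the model a prefix taken from a training document and check whether it completes it with the true continuation verbatim. The toy documents, prefix/suffix lengths, and the stand-in "model" below are all invented for illustration.

```python
# Toy memorisation check: does the model reproduce a training document's
# continuation when prompted with its prefix?
TRAINING_DOCS = [
    "Jane Doe's phone number is 555-0142 and she lives at 12 Example Lane.",
    "The quick brown fox jumps over the lazy dog near the riverbank at dawn.",
]

def toy_memorising_model(prefix: str, max_chars: int) -> str:
    """Stand-in for a model that has memorised its training data verbatim."""
    for doc in TRAINING_DOCS:
        if doc.startswith(prefix):
            return doc[len(prefix):len(prefix) + max_chars]
    return " " * max_chars  # unrelated filler when nothing in memory matches

def is_memorised(generate, doc: str, prefix_len: int = 30, suffix_len: int = 30) -> bool:
    """True if `generate` reproduces the document's continuation verbatim."""
    prefix = doc[:prefix_len]
    true_suffix = doc[prefix_len:prefix_len + suffix_len]
    return generate(prefix, max_chars=suffix_len).startswith(true_suffix)

hits = sum(is_memorised(toy_memorising_model, doc) for doc in TRAINING_DOCS)
print(f"memorised {hits} of {len(TRAINING_DOCS)} sampled documents")
```

The idea is that a model trained with differential privacy should essentially never pass this check for data it saw only once, which is what the result above reports.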
Performance lands around GPT-2 level - decent, considering the strong privacy guarantees.
Yes, there’s still work ahead (such as handling private information that appears multiple times), but this is a huge step toward safer, more trustworthy AI systems.
The future of responsible AI may not be the model that remembers everything,
but the one that remembers only what truly matters.