Multimodal AI has a hidden problem.

Images → one tokenizer

Videos → another

3D → completely different setup

And it gets worse:

  • Models that generate visuals don’t really understand them

  • Models that understand visuals can’t generate them well

So instead of one intelligent system,

we end up with a stack of disconnected capabilities.

Apple is taking a very different approach with a new model: AToken.

Instead of adding more pieces, it removes them.

  • One tokenizer

  • One encoder

  • Works across images, videos, and 3D

The core idea:

Treat all visual data in a unified format.

Images → (x, y)

Videos → (t, x, y)

3D → (x, y, z)

Everything becomes part of a single 4D token space: (t, x, y, z).
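
To make the shared coordinate idea concrete, here is a minimal sketch in Python. It is not Apple's implementation: the function names are hypothetical, and filling unused axes with 0 is an assumption made only for illustration. It shows how image patches, video patches, and 3D voxels can all end up as the same kind of (t, x, y, z) tuple.

```python
from itertools import product

# Hypothetical helpers: each maps one modality onto shared (t, x, y, z) coordinates.
# Assumption: axes a modality does not use are set to 0.

def image_coords(width, height):
    """An image is a grid of patches at (x, y); t and z stay 0."""
    return [(0, x, y, 0) for y, x in product(range(height), range(width))]

def video_coords(frames, width, height):
    """A video adds a time axis: patches at (t, x, y); z stays 0."""
    return [(t, x, y, 0)
            for t, y, x in product(range(frames), range(height), range(width))]

def voxel_coords(occupied):
    """A 3D asset is a set of occupied voxels at (x, y, z); t stays 0."""
    return [(0, x, y, z) for (x, y, z) in occupied]

img = image_coords(width=2, height=2)
vid = video_coords(frames=3, width=2, height=2)
obj = voxel_coords([(0, 0, 0), (1, 0, 1)])

# Every modality now has the same token layout, so one encoder could consume all of them.
all_coords = img + vid + obj
print(len(img), len(vid), len(obj))  # 4 12 2
```

Because every entry has the same four-field shape, a single model can attend over any mix of them without a per-modality tokenizer.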

So the same model can:

  • Understand

  • Generate

  • Reconstruct

Across all formats.

And the real unlock:

Data leverage.

We have massive image datasets.

But very limited video and 3D data.

With a shared model:

→ Learning transfers across modalities

→ Less data needed overall

→ Faster capability growth

This is much like what happened with LLMs.

One tokenizer → text, code, conversations, everything.

Now we’re seeing the same shift in vision.

From:

“different models for different media”

To:

one model that understands the visual world.

Nikhilesh is an entrepreneur, teacher and tech nerd
He is an IIT Kharagpur alumnus. He is also a Goo...