Breaking the Language Barrier in Healthcare—One Whisper at a Time


Ever sat in a Bangladeshi clinic and heard someone say something like:

“Apni aagei bolen chilen na je chest-e ekta tightness feel hocche?”

(Roughly: “Didn’t you say earlier that you were feeling a tightness in the chest?”)

That right there is code-switching—slipping between Bengali and English mid-sentence, like it’s no big deal. It happens all the time, especially in doctor-patient conversations. But while humans handle this effortlessly, most language models? They get confused faster than a junior doctor on their first ER shift.

And in healthcare, that’s a problem. You can’t afford to misinterpret symptoms because a model panicked at the word "tightness" sandwiched between two Bengali clauses.

So I decided to do something about it.


Enter: MediBeng Whisper Tiny

I built a small but mighty tool: a fine-tuned version of OpenAI’s Whisper Tiny model trained specifically to handle Bengali-English code-switched medical speech and translate it into clean English.

Yep, that means this model can listen to real mixed-language clinical convos and output clear, usable English—perfect for documentation, analysis, or even aiding with diagnosis.
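
Trying it out takes just a few lines with the Hugging Face transformers pipeline. Here’s a minimal sketch; the model ID and audio filename are placeholders, so grab the real ID from the links at the end of this post:

```python
# Minimal inference sketch using the transformers ASR pipeline.
# MODEL_ID is a placeholder -- use the actual Hub ID from the repo links below.
from transformers import pipeline

MODEL_ID = "your-username/medibeng-whisper-tiny"  # placeholder

asr = pipeline("automatic-speech-recognition", model=MODEL_ID)

# Whisper's "translate" task maps input speech (any language) to English text
result = asr("clinic_sample.wav", generate_kwargs={"task": "translate"})
print(result["text"])
```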


Why I Did This

In real-world clinics across Bangladesh (and frankly, anywhere multilinguals exist), people constantly mix languages. It’s second nature. But no existing ASR model was cutting it when it came to understanding this hybrid speech, let alone translating it. The goal was simple:

Input: Bengali-English speech
Output: Clear English transcription

And for this? I had to train Whisper to stop freaking out every time it heard a Bengali sentence ending with an English medical term.


How I Did It (a.k.a. The Nerdy Bit)

  1. Built a Custom Dataset – MediBeng

I couldn’t find a dataset that captured the quirky, authentic blend of Bengali and English that you hear in clinics. So I built my own: MediBeng.

It’s a synthetic but realistic dataset of code-switched speech based on how real people actually talk in clinical settings.
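
The dataset is on the Hugging Face Hub (linked at the end). Loading it looks roughly like this; the dataset ID and column names here are assumptions on my part, so double-check the dataset card:

```python
# Sketch of loading MediBeng with the `datasets` library.
# The dataset ID and column names are assumptions -- check the dataset card.
from datasets import load_dataset, Audio

ds = load_dataset("your-username/MediBeng", split="train")  # placeholder ID

# Whisper expects 16 kHz audio, so resample on the fly
ds = ds.cast_column("audio", Audio(sampling_rate=16000))

sample = ds[0]
print(sample["transcription"])  # code-switched Bengali-English source text
print(sample["translation"])    # clean English target
```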


  2. Fine-Tuned Whisper Tiny

I used MediBeng to fine-tune Whisper Tiny, teaching it how to make sense of bilingual speech. That means adjusting the model's brain (aka weights) to understand when someone says:

“Take korar por ekdom bhalo feel kori nai.”

(Roughly: “I didn’t feel good at all after taking it.”)

...they probably meant something worth documenting.
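
Under the hood, this follows the standard Whisper fine-tuning recipe with transformers’ Seq2SeqTrainer. Here’s a condensed sketch: the hyperparameters are illustrative rather than the exact ones I used, and the dataset ID is the same placeholder as above.

```python
# Condensed Whisper fine-tuning sketch (the standard transformers recipe).
# Hyperparameters are illustrative, not the exact ones used for MediBeng.
from dataclasses import dataclass
from typing import Any, Dict, List

import torch
from datasets import load_dataset, Audio
from transformers import (WhisperProcessor, WhisperForConditionalGeneration,
                          Seq2SeqTrainingArguments, Seq2SeqTrainer)

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")

ds = load_dataset("your-username/MediBeng", split="train")  # placeholder ID
ds = ds.cast_column("audio", Audio(sampling_rate=16000))

def prepare(batch):
    # Log-Mel spectrogram in, English target token IDs out
    audio = batch["audio"]
    batch["input_features"] = processor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    batch["labels"] = processor.tokenizer(batch["translation"]).input_ids
    return batch

train_ds = ds.map(prepare, remove_columns=ds.column_names)

@dataclass
class DataCollatorSpeechSeq2SeqWithPadding:
    processor: Any

    def __call__(self, features: List[Dict[str, Any]]) -> Dict[str, torch.Tensor]:
        # Pad audio features and label IDs separately, then mask padding in the loss
        batch = self.processor.feature_extractor.pad(
            [{"input_features": f["input_features"]} for f in features],
            return_tensors="pt",
        )
        labels_batch = self.processor.tokenizer.pad(
            [{"input_ids": f["labels"]} for f in features], return_tensors="pt"
        )
        batch["labels"] = labels_batch["input_ids"].masked_fill(
            labels_batch["attention_mask"].ne(1), -100  # -100 is ignored by the loss
        )
        return batch

args = Seq2SeqTrainingArguments(
    output_dir="medibeng-whisper-tiny",
    per_device_train_batch_size=16,
    learning_rate=1e-5,
    warmup_steps=500,
    max_steps=4000,
    fp16=torch.cuda.is_available(),
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    data_collator=DataCollatorSpeechSeq2SeqWithPadding(processor),
)
trainer.train()
```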


  3. Tested It on Real Conversations

After fine-tuning, I ran the model on real examples—and the improvement was obvious. It could now transcribe and translate code-switched speech much more accurately, giving output that was actually usable for patient records.
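
If you want to quantify that improvement yourself, you can score the base and fine-tuned checkpoints on a held-out split with word error rate (WER). A sketch, again with placeholder IDs and an assumed `test` split:

```python
# Sketch: compare fine-tuned vs. base Whisper Tiny using word error rate (WER).
# Model/dataset IDs and the "test" split name are assumptions.
import evaluate
from datasets import load_dataset, Audio
from transformers import pipeline

wer = evaluate.load("wer")
test_ds = load_dataset("your-username/MediBeng", split="test")
test_ds = test_ds.cast_column("audio", Audio(sampling_rate=16000))

for model_id in ["openai/whisper-tiny", "your-username/medibeng-whisper-tiny"]:
    asr = pipeline("automatic-speech-recognition", model=model_id)
    preds = [
        # Arrays are already 16 kHz thanks to cast_column above
        asr(ex["audio"]["array"], generate_kwargs={"task": "translate"})["text"]
        for ex in test_ds
    ]
    refs = [ex["translation"] for ex in test_ds]
    print(model_id, "WER:", wer.compute(predictions=preds, references=refs))
```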


Why This Actually Matters

This isn’t just a fun project or an academic exercise—it solves a real problem. Healthcare professionals are already overwhelmed. Asking them to decode bilingual gibberish in transcripts just makes their job harder.

With MediBeng Whisper Tiny, we’re talking:

Fewer transcription errors

Better documentation

Faster decision-making

Happier doctors (and probably patients too)


The Big Picture

This project proves that even lightweight models like Whisper Tiny can be smart—if you give them the right data. It also shows the massive potential of domain-specific, culturally aware AI tools.

Imagine tailoring models like this for other regions: Hindi-English, Arabic-French, Spanish-English (Spanglish). You get the idea.

I started with Bengali-English medical speech because that’s where the pain was. But this blueprint? It’s open-source, replicable, and ready to be extended by anyone who wants to build smarter speech models for multilingual communities.

Check out the repo to dive into the code or replicate this yourself:

MediBeng Whisper Tiny Model

MediBeng Dataset

GitHub Repository

TL;DR

People code-switch. A lot. Especially in clinics.

ASR models weren’t handling it well.

So I trained Whisper Tiny on Bengali-English medical speech.

It now translates mixed speech into clean English.

It’s free and open for you to build on.

If you’re working in multilingual healthcare tech—or just love solving messy real-world AI problems—this one’s for you.

Let’s make speech models speak our language(s).

Comments

Really impressive work—addressing code-switching in healthcare speech is such a crucial yet overlooked challenge. How well does MediBeng Whisper Tiny handle less common dialects or slang within Bengali-English conversations?

MediBeng Whisper Tiny does a decent job with common Bengali-English mix, especially when the English words are clear and standard. But for less common dialects or local slang, it still struggles a bit. It's better than the base Whisper Tiny in handling those, but not perfect yet. There’s definitely room to improve its ear for regional flavor.

Thanks for your response. H2/H3 headings and some bold formatting would make readers happy. :-)

Thanks for your feedback. I'll keep that in mind. :)
