namakoo

@namakoo

池澤允彦

Building IDFU — verified-code DPO training data for LLM fine-tuning
640 Points · 6 Badges · 1 Connection · 1 Follower · 1 Following

About

Solo developer with 10+ years of experience in Android development and the broader developer ecosystem. Currently building IDFU on Hugging Face — a verified-code DPO training data family covering specialty packs ($9), main releases ($49), and a benchmark-aligned DPO Pair Pack ($99).

Recent benchmark: +3.46 ± 0.35 pp on HumanEval (Qwen2.5-Coder-3B, 3 seeds, executed in a Docker sandbox).

Language & Tools

Python, Docker, PyTorch, Transformers, TRL DPOTrainer,
Qwen2.5-Coder, HuggingFace Hub, ChromaDB, SQLite,
Stripe, Android, LoRA, Quantization
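TRL's DPOTrainer, listed above, consumes preference data as records with `prompt`, `chosen`, and `rejected` fields, typically shipped as JSON Lines. A minimal sketch of one such record; the prompt and code snippets are illustrative, not taken from the IDFU dataset:

```python
import json

# One DPO preference pair in the column layout TRL's DPOTrainer expects:
# a shared prompt, a preferred ("chosen") completion, and a dispreferred
# ("rejected") one. Content here is illustrative only.
pair = {
    "prompt": "Write a Python function that returns the nth Fibonacci number.",
    "chosen": (
        "def fib(n):\n"
        "    a, b = 0, 1\n"
        "    for _ in range(n):\n"
        "        a, b = b, a + b\n"
        "    return a\n"
    ),
    "rejected": (
        "def fib(n):\n"
        "    # no base case: recurses forever\n"
        "    return fib(n - 1) + fib(n - 2)\n"
    ),
}

# Serialize one pair per line (JSON Lines), then read it back.
line = json.dumps(pair)
record = json.loads(line)
print(sorted(record))  # → ['chosen', 'prompt', 'rejected']
```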

Currently Exploring

DPO training data curation, LLM fine-tuning at 4-bit
quantization, on-device LLM agents, Python failure
pattern taxonomy
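Of the topics above, 4-bit quantization is the easiest to sketch: weights are mapped onto 16 integer levels with a shared scale and mapped back, trading precision for memory. This toy symmetric quantizer is a conceptual sketch only; production 4-bit schemes (e.g. NF4 in bitsandbytes) use block-wise scales and non-uniform levels:

```python
def quantize_4bit(weights):
    """Toy symmetric 4-bit quantization: floats -> integers in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # avoid a zero scale
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.9]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)

print(q)  # → [2, -7, 0, 5]
# Reconstruction error is bounded by half a quantization step.
assert all(abs(a - b) <= scale / 2 + 1e-9 for a, b in zip(weights, restored))
```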

Achievements

+3.46 ± 0.35 pp HumanEval improvement (3 seeds, Qwen2.5-Coder-3B) using 500 curated DPO pairs.
Built and shipped IDFU — a verified-code DPO dataset family — as a solo developer, with 8 products across 5 price tiers on Hugging Face.
Featured contributor on CoderLegion.
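The "+3.46 ± 0.35 pp over 3 seeds" figure is an aggregate of per-seed benchmark deltas. A sketch of how such a number is computed, using hypothetical per-seed values (the real per-seed results are not published here) and the sample standard deviation for the ± term (the profile does not state whether ± denotes stdev or standard error):

```python
from statistics import mean, stdev

# Hypothetical per-seed HumanEval deltas in percentage points; only the
# aggregate is published for IDFU, not the per-seed numbers.
deltas = [3.1, 3.5, 3.8]

mu = mean(deltas)
sigma = stdev(deltas)  # sample standard deviation across the 3 seeds
print(f"+{mu:.2f} ± {sigma:.2f} pp ({len(deltas)} seeds)")  # → +3.47 ± 0.35 pp (3 seeds)
```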

Fun Fact

Started in custom ROM development a decade ago, ended up curating failure patterns for code LLMs. The throughline is the same: figuring out exactly where things break.

Random Dev Quote

Most of the work in DPO training data is on the rejected side.
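That quote reflects how verified-code pairs are built: candidate completions are executed against tests, a verified pass becomes the chosen side and a verified failure the rejected side, so curation effort concentrates on producing plausible failures. A minimal in-process sketch; the actual IDFU pipeline runs candidates in a Docker sandbox, whereas this version uses `exec` directly, which is unsafe for untrusted code:

```python
def verify(candidate_src, test_src):
    """Execute a candidate solution against a test; True if it passes."""
    ns = {}
    try:
        exec(candidate_src, ns)  # define the candidate function
        exec(test_src, ns)       # run the assertions against it
        return True
    except Exception:
        return False

TEST = "assert add(2, 3) == 5 and add(-1, 1) == 0"

candidates = [
    "def add(a, b):\n    return a + b",  # correct
    "def add(a, b):\n    return a - b",  # wrong operator
]

chosen = [c for c in candidates if verify(c, TEST)]
rejected = [c for c in candidates if not verify(c, TEST)]
print(len(chosen), len(rejected))  # → 1 1
# Pairing each verified pass with a verified failure on the same prompt
# yields one (chosen, rejected) DPO pair.
```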
Joined: May 2 (2 days ago)
Extra privileges: Editing any comment
Datasets: huggingface.co/datasets/namakoo/idfu-verified-code
dev.to: dev.to/namakoo
Location: Japan
Website: https://huggingface.co/datasets/namakoo/idfu-verified-code
Activity

(User Activities contribution calendar)

Active in these Groups:

AI (856 members)
Open Source (815 members)
Python Dev (439 members)