Posts by namakoo

@namakoo

池澤允彦

Building IDFU — verified-code DPO training data for LLM fine-tuning
1.1k Points | 8 Badges | 1 Connections | 1 Followers | 1 Following

namakoo in Articles 11 min read
Over 36 hours we ran four DPO training iterations against Qwen2.5-Coder-7B-Instruct, trying to push HumanEval pass@1 above the base model's 87.20%. The first three iterations failed in different ways: -9.15pp, -1.22pp, two NO-GO calls. The fourth rec...
namakoo in Articles 6 min read
In the previous post (https://dev.to/namakoo/-curating-python-failures-for-dpo-notes-from-the-rejected-side-2ff0), I described the curation philosophy for IDFU's rejected-side dataset: why I avoid synthetic bug generation, why stub detection matters, w...