What I Learned Building a Cybersecurity LLM From Scratch: 30,000 Steps, 3 Data Rewrites, and One Capacity Ceiling
I started building GhostLM thinking the hard part would be the transformer architecture. I was wrong. The architecture took one day. T...
I've been building GhostLM for the past few months: a decoder-only transformer trained entirely from scratch in PyTorch on cybersecurity data. No pretrained weights, no Hugging Face wrappers. Every component hand-written.
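For readers who haven't written one by hand, here is a minimal sketch of what a single decoder-only block can look like in plain PyTorch. This is not GhostLM's actual code: the class names (`CausalSelfAttention`, `DecoderBlock`), the pre-norm GPT-style layout, and the sizes (`d_model=64`, `n_heads=4`) are all assumptions chosen for illustration.

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class CausalSelfAttention(nn.Module):
    """Multi-head self-attention where each token sees only itself and earlier tokens."""

    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        assert d_model % n_heads == 0, "d_model must be divisible by n_heads"
        self.n_heads = n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused query/key/value projection
        self.proj = nn.Linear(d_model, d_model)     # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        head_dim = C // self.n_heads
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (B, T, C) -> (B, n_heads, T, head_dim)
        q, k, v = (
            t.reshape(B, T, self.n_heads, head_dim).transpose(1, 2)
            for t in (q, k, v)
        )
        scores = q @ k.transpose(-2, -1) / math.sqrt(head_dim)
        # Lower-triangular mask: position i may attend to positions <= i only.
        mask = torch.tril(torch.ones(T, T, dtype=torch.bool, device=x.device))
        scores = scores.masked_fill(~mask, float("-inf"))
        out = F.softmax(scores, dim=-1) @ v
        # (B, n_heads, T, head_dim) -> (B, T, C)
        out = out.transpose(1, 2).reshape(B, T, C)
        return self.proj(out)


class DecoderBlock(nn.Module):
    """Pre-norm block: attention and MLP, each wrapped in a residual connection."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = CausalSelfAttention(d_model, n_heads)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x + self.attn(self.ln1(x))
        x = x + self.mlp(self.ln2(x))
        return x


if __name__ == "__main__":
    block = DecoderBlock()
    tokens = torch.randn(2, 8, 64)  # (batch, sequence length, d_model)
    print(block(tokens).shape)
```

A full model would stack several of these blocks between a token/position embedding and a final projection to vocabulary logits. The explicit `torch.tril` mask is used here instead of a fused attention kernel purely to keep the causal logic visible.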
Phase 1 looked promising. Af...