by cold_harbor 24 days ago
LoRA won't fix the tokenization problem: it adapts the weights, but the vocab stays the same. Norwegian on a typical English-heavy BPE vocab uses 1.5-2x more tokens per word, and that compounds into real inference cost, not just a quality hit.
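
If you want to check this on your own setup, tokens per word is easy to measure. Rough sketch below. `toy_tokenize` and `TOY_VOCAB` are made up for illustration (greedy longest-match over a tiny English-only vocab, not real BPE), so the numbers it prints say nothing about any actual model. Swap in your real tokenizer's encode function to get a meaningful ratio.

```python
from typing import Callable, List


def tokens_per_word(tokenize: Callable[[str], List[str]], text: str) -> float:
    """Average number of tokens per whitespace-separated word."""
    words = text.split()
    if not words:
        raise ValueError("text has no words")
    return sum(len(tokenize(w)) for w in words) / len(words)


# Hypothetical toy vocab standing in for an English-heavy subword vocabulary.
TOY_VOCAB = {"the", "weather", "is", "nice", "to", "day", "er", "i"}


def toy_tokenize(word: str) -> List[str]:
    """Greedy longest-match against TOY_VOCAB, falling back to single characters."""
    tokens: List[str] = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first; a single character always matches.
        for j in range(len(word), i, -1):
            if word[i:j] in TOY_VOCAB or j == i + 1:
                tokens.append(word[i:j])
                i = j
                break
    return tokens


english = "the weather is nice today"
norwegian = "været er fint i dag"

en = tokens_per_word(toy_tokenize, english)
nb = tokens_per_word(toy_tokenize, norwegian)

# With per-token pricing or a fixed token budget, the same text costs roughly
# nb / en times as much to process.
print(f"en: {en:.2f} tokens/word, nb: {nb:.2f} tokens/word, ratio: {nb / en:.2f}x")
```

Run it over a decent sample of real text in both languages, not one sentence, before trusting the ratio.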