Hacker News
by fzimmermann89, 289 days ago
Also, for autocomplete, I think a small LLM trained from scratch should already work well. Have you tried the TinyStories (also only 3 GB) or nanoGPT speedrun setups, without any fancy loss terms etc., as a baseline?