by superkuh 559 days ago
You're not wrong, but technically llama.cpp does have training (both raw model training and fine-tuning), and it's been around for a long time. Back around the ggml->gguf switch, I used llama.cpp to train a tiny 0.9B llama 1 through the early, fast part of the loss reduction on 3GB of IRC logs with 64 tokens of context over about a month. It eventually produced some gpt2-like IRC lines within its very short context.

Would anyone choose llama.cpp's training tools to do serious work? No. Do they exist and work? Yes.