| HN Mirror

	by superkuh 605 days ago
	You're not wrong, but technically llama.cpp does have training (both raw model and fine-tuning). And it's been around for a long time. Back around the ggml->gguf switch I used llama.cpp to train a tiny 0.9B llama 1 through the early fast parts of the loss reduction on 3GB of IRC logs with 64 tokens of context over about a month. It eventually produced some gpt2-like IRC lines within its very short context. Would anyone choose llama.cpp's training tools to do serious work? No. Do they exist and work? Yes.