HN Mirror

by ilaksh 895 days ago

For training from scratch, try a small model like nanoGPT (https://github.com/karpathy/nanoGPT) or TinyLlama, perhaps with quantization.

Fine-tuning is very doable. The hard part is building a novel dataset of input-output pairs. As an experiment, you might just combine datasets you find on Hugging Face.
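To make the "combine datasets" idea concrete, here is a rough sketch of merging records from differently shaped sources into one set of input-output pairs. It uses only the standard library and assumes the records are already loaded as lists of dicts. The source names (`source_a`, `source_b`) and their column names are hypothetical placeholders, not real datasets.

```python
import json
import random

# Hypothetical mapping: each source names its input/output columns differently.
FIELD_MAPS = {
    "source_a": ("instruction", "response"),
    "source_b": ("question", "answer"),
}

def normalize(records, source, field_maps=FIELD_MAPS):
    """Map one source's records onto a shared {"input", "output"} schema."""
    in_key, out_key = field_maps[source]
    pairs = []
    for rec in records:
        text_in = str(rec.get(in_key) or "").strip()
        text_out = str(rec.get(out_key) or "").strip()
        if text_in and text_out:  # skip rows missing either side
            pairs.append({"input": text_in, "output": text_out, "source": source})
    return pairs

def combine(datasets, seed=0):
    """Normalize every source, drop exact duplicate pairs, then shuffle."""
    seen = set()
    merged = []
    for source, records in datasets.items():
        for pair in normalize(records, source):
            key = (pair["input"], pair["output"])
            if key not in seen:
                seen.add(key)
                merged.append(pair)
    random.Random(seed).shuffle(merged)  # fixed seed keeps the order reproducible
    return merged

def to_jsonl(pairs):
    """Serialize pairs as JSON Lines, one example per line."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)
```

JSON Lines is a common interchange format for fine-tuning data, but the exact field names a given service expects vary, so check its docs before uploading.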

replicate.com has a dead-simple fine-tuning API.

Predibase is also an easy-to-use option. But again, for something custom you need a dataset with hundreds of examples. Normally people use GPT-4 to generate the dataset, as long as OpenAI doesn't block them.
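A minimal sketch of what that generation loop can look like, with the model call left as a plug-in. `generate_fn` is a hypothetical stand-in for whatever client you use (it takes a prompt string and returns the reply text), and the prompt template and JSON reply format are assumptions for illustration, not any provider's API.

```python
import json

# Assumed prompt format; adjust to whatever works for your model.
PROMPT_TEMPLATE = (
    "Write one training example about: {topic}\n"
    'Reply with JSON only: {{"input": "...", "output": "..."}}'
)

def parse_pair(raw):
    """Return a cleaned {"input", "output"} dict, or None if the reply is unusable."""
    try:
        obj = json.loads(raw)
    except (TypeError, ValueError):  # not a string, or not valid JSON
        return None
    if not isinstance(obj, dict):
        return None
    text_in, text_out = obj.get("input"), obj.get("output")
    if not (isinstance(text_in, str) and isinstance(text_out, str)):
        return None
    text_in, text_out = text_in.strip(), text_out.strip()
    if not text_in or not text_out:
        return None
    return {"input": text_in, "output": text_out}

def build_dataset(topics, generate_fn, per_topic=3):
    """Call generate_fn(prompt) per_topic times per topic; keep valid pairs with unique inputs."""
    seen, pairs = set(), []
    for topic in topics:
        prompt = PROMPT_TEMPLATE.format(topic=topic)
        for _ in range(per_topic):
            pair = parse_pair(generate_fn(prompt))
            if pair and pair["input"] not in seen:
                seen.add(pair["input"])
                pairs.append(pair)
    return pairs
```

To reach hundreds of examples you would pass a long topic list or raise `per_topic`. Expect some replies to be discarded as malformed or duplicate, so generate more than you need.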