by rajansaini 1173 days ago
Read the InstructGPT paper and take a look at Alpaca. You just need instruction fine-tuning.

Contrary to what the other commenters are saying, RLHF, while powerful, isn't the only way to get an LLM to follow instructions.
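
For anyone wondering what instruction fine-tuning data looks like in practice, here's a rough sketch. The template wording is only loosely modeled on Alpaca-style prompts, and `PROMPT_TEMPLATE` and `build_example` are names I made up for illustration, not anything from the paper or the Alpaca repo. The idea is just that each (instruction, response) pair becomes one plain supervised training string:

```python
# Hypothetical prompt template, loosely modeled on Alpaca-style prompts.
# The exact wording is an assumption for illustration only.
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_example(instruction: str, response: str) -> dict:
    """Turn one (instruction, response) pair into a supervised training example."""
    prompt = PROMPT_TEMPLATE.format(instruction=instruction)
    return {
        "prompt": prompt,            # what the model is conditioned on
        "completion": response,      # what the model should learn to produce
        "text": prompt + response,   # full sequence fed to the trainer
    }


example = build_example("Name the capital of France.", "Paris.")
print(example["text"])
```

You'd then fine-tune the base model on many such strings with the ordinary next-token loss, typically computed only on the completion part. No reward model or RL loop is involved at this stage.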