by iliane5 1197 days ago

Agreed, there is way too much hype about the actual capabilities of the LLaMA models. However, instruction tuning alone makes Alpaca much more usable than the base model, and to be fair, even some versions of the "tiny" 7B can do small talk relatively well.

> Using GPT to generate training data for fine-tuning seems to produce the best results, but even so, GPT4-x-Alpaca 30B is still clearly inferior to the real thing.

Distillation is interesting, and it does seem to make the models adopt ChatGPT's style, but I'm dubious that having LLMs generate entire datasets or copy/pasting ShareGPT is going to give you that great of a dataset. The whole point of RLHF is getting the human feedback to make the model better. OpenAI's dataset/RLHF work seems to be working wonders for them and will continue to give them a huge advantage (especially now that they're getting hundreds of millions of conversations of people doing all sorts of things with ChatGPT).