by mk67 1120 days ago
Seems interesting as it runs counter to the "common knowledge" that fine-tuning large LMs needs a lot of data and RLHF for good results.

The absolute results aren't extremely strong, most likely (I suspect) because the base model just isn't competitive with GPT4 at the moment, but the relative results seem very impactful. Maybe fine-tuning a large LM for specific tasks is more practical than previously thought?

1 comment

In human learning at least, you need a good teacher who can give you a self-consistent and correct basis, and then you build on that. If you learn randomly, your understanding will be "blurry" and you then have to spend time unlearning the bad lessons. I have personal experience with this. I definitely like the message of this result, if it is true.