Hacker News
by junipertea 1248 days ago
They also did reinforcement learning on top of a frozen, already-trained model. That is considerably more than just attaching a UI: a model with only a UI on top would just finish sentences rather than answer questions.
https://huggingface.co/blog/rlhf