by junipertea 1248 days ago
They also did reinforcement learning on top of a frozen, already-trained model. That is considerably more than just attaching a UI: with only a UI, the model would just finish sentences rather than answer questions. https://huggingface.co/blog/rlhf