Hacker News
by int_19h 1183 days ago
The bit about "fine-tuned on the Alpaca dataset" refers to precisely that. But, yeah, no RLHF so far, although some people are already working on it.