Hacker News
by kashifr 1177 days ago
All the steps involved in training a LLaMA model to answer questions on Stack Exchange data with RLHF.
1 comment

You could of course use your own question-and-answer data to fine-tune the model using the same process. I wonder if anyone has tried that yet, for instance fine-tuning LLaMA to answer support queries for their company?