by henry_pulver
1055 days ago
Not seen a great explainer on this yet. You'd need either access to the model weights or a fine-tuning API. Then the user data you need to collect depends on which fine-tuning approach you use: RLHF requires multiple outputs for a single query, with feedback on which one is better, whereas instruction fine-tuning needs great input-output pairs to train on. You could ask for the user's feedback after running the LLM to pick out good training data.
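To make the difference in data shape concrete, here's a rough sketch of how logged user feedback might be turned into the two kinds of training records. Everything in it is an assumption for illustration: the log fields (`thumbs_up`, `candidates`, `picked`), the record types, and the function names are hypothetical, and a real fine-tuning API will have its own required format.

```python
from dataclasses import dataclass


@dataclass
class SFTExample:
    """One input-output pair for instruction fine-tuning (hypothetical shape)."""
    prompt: str
    completion: str


@dataclass
class PreferenceExample:
    """One pairwise comparison for RLHF-style training (hypothetical shape)."""
    prompt: str
    chosen: str
    rejected: str


def sft_from_feedback(logs):
    """Keep only the interactions the user marked as good.

    Assumes each log is a dict with 'prompt', 'output' and 'thumbs_up' keys.
    """
    return [
        SFTExample(log["prompt"], log["output"])
        for log in logs
        if log.get("thumbs_up")
    ]


def preferences_from_feedback(logs):
    """Turn 'user picked one of several outputs' into chosen/rejected pairs.

    Assumes each log is a dict with 'prompt', 'candidates' (list of outputs
    for the same query) and 'picked' (index of the output the user preferred).
    """
    pairs = []
    for log in logs:
        chosen = log["candidates"][log["picked"]]
        for i, candidate in enumerate(log["candidates"]):
            if i != log["picked"]:
                pairs.append(PreferenceExample(log["prompt"], chosen, candidate))
    return pairs
```

So the instruction fine-tuning path only needs one output per query plus a quality signal, while the preference path needs you to actually show the user several outputs for the same query and record which one they picked.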