by float-trip
924 days ago
Thanks for writing this up. Rather than zeroing out the loss for the prompt, did you also try using a weighted loss with Axolotl? At one point, Microsoft's GPT-3 docs suggested this was beneficial when the responses are short (like your "Cut in."). Domain adaptation over subreddits/forums before finetuning may help as well.
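To make the zeroing vs. weighting distinction concrete, here's a rough sketch in plain Python. It is not Axolotl's API or the method from the write-up; `weighted_sequence_loss` and `prompt_weight` are names made up for illustration, and it assumes you already have a per-token cross-entropy loss plus a flag marking which tokens belong to the prompt.

```python
def weighted_sequence_loss(token_losses, is_prompt, prompt_weight=0.1):
    """Weighted mean of per-token losses.

    token_losses  -- per-token cross-entropy values (floats)
    is_prompt     -- booleans, True where the token belongs to the prompt
    prompt_weight -- hypothetical knob: 0.0 masks the prompt entirely,
                     1.0 trains on prompt and response equally
    """
    if len(token_losses) != len(is_prompt):
        raise ValueError("token_losses and is_prompt must have the same length")

    # Response tokens always count fully; prompt tokens are down-weighted.
    weights = [prompt_weight if p else 1.0 for p in is_prompt]
    total_weight = sum(weights)
    if total_weight == 0:
        return 0.0  # nothing to train on (e.g. all prompt with weight 0)

    # Normalize by the summed weights so the scale stays comparable
    # across different prompt_weight settings.
    weighted_sum = sum(w * loss for w, loss in zip(weights, token_losses))
    return weighted_sum / total_weight


# Three prompt tokens followed by a single response token.
losses = [2.0, 2.0, 2.0, 1.0]
prompt_mask = [True, True, True, False]

masked = weighted_sequence_loss(losses, prompt_mask, prompt_weight=0.0)
partial = weighted_sequence_loss(losses, prompt_mask, prompt_weight=0.5)
```

With `prompt_weight=0.0` this reduces to the usual prompt masking, where only the response token contributes. Anything above zero lets the prompt tokens contribute a little, which is the idea behind trying it when the response is only a token or two long.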
|
This is really smart, I didn't think about this! Will add it to my list of things to try, great idea!
> Domain adaptation over subreddits/forums before finetuning may help as well.
I was thinking about this too (along with transcribing draft YouTube videos). I'd definitely be curious how much this helps.