by factorymoo
780 days ago
This might be an unfair statement, but it really feels like the authors of all these blogs don't know why. They copy/paste from each other (you often see the same errors in multiple notebooks/blogs), and I have a feeling no one really deeply understands what they're doing.
Spoiler alert: fine-tunes won't be better until the data quality is better than that of Meta's instruction fine-tune. Give it a few weeks.
Why does [dolphin-l3-8B] perform substantially worse in some tests?
Essentially, it's trained like this:
And not like this: https://huggingface.co/cognitivecomputations/dolphin-2.9-lla...