Hacker News
by unraveller 780 days ago
Found my answer as to why, thanks to the issues in the latest dolphin fine-tune. They do these kinds of fine-tunes mainly to reduce refusal rates and increase intelligence. This time, as I suspected, they did a knee-jerk rerun of the same old data, just for lols, to see where open source is at.

Spoiler alert: fine-tunes won't be better until their data quality is better than that of Meta's instruction fine-tune. Give it a few weeks.

Why does [dolphin-l3-8B] perform substantially worse in some tests?

Essentially, it's trained like this:

  LLama-3-8B-base_model --> LLama-3-8B-Instruct
  LLama-3-8B-base_model --> dolphin-2.9-llama3-8B

And not like this:

  LLama-3-8B-Instruct --> dolphin-2.9-llama3-8B

https://huggingface.co/cognitivecomputations/dolphin-2.9-lla...
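
To make the lineage point concrete, here is a rough sketch in Python. The `PARENT` map and both helper functions are made up for illustration; they only restate the two arrows from the quote above and are not anything from the model repo.

```python
# Hypothetical lineage map (illustration only): child model -> model it was tuned from.
# The entries restate the arrows quoted above.
PARENT = {
    "LLama-3-8B-Instruct": "LLama-3-8B-base_model",
    "dolphin-2.9-llama3-8B": "LLama-3-8B-base_model",
}


def ancestors(model, parent=PARENT):
    """Return the models that `model` was derived from, nearest first."""
    chain = []
    while model in parent:
        model = parent[model]
        chain.append(model)
    return chain


def inherits_from(model, candidate, parent=PARENT):
    """True if `candidate` appears anywhere in the lineage of `model`."""
    return candidate in ancestors(model, parent)


# dolphin and Instruct are siblings off the base model, so dolphin
# does not pick up anything from Meta's instruction fine-tune.
print(inherits_from("dolphin-2.9-llama3-8B", "LLama-3-8B-Instruct"))
print(inherits_from("dolphin-2.9-llama3-8B", "LLama-3-8B-base_model"))
```

The first check comes out false and the second true, which is the whole point: dolphin only competes with Instruct, it doesn't build on it.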