| HN Mirror


	by SparkyMcUnicorn 1024 days ago
	The community has fine-tuned some really good Llama models (much better than llama-chat), but I get what you're saying. I've been testing the best-performing models on the Hugging Face leaderboard lately. Some of them are really impressive, and others are so bad that I second-guess the prompt format, or whether the benchmarked model is actually the same one I'm testing.

1 comments

breadsniffer01 1024 days ago

Which models were really bad?

SparkyMcUnicorn 1024 days ago

I was keeping track of the good ones, and don't have many notes on the bad ones.

I do remember testing "LoKuS" last week, and it was quite terrible (it sometimes gave completely off-topic answers). It scored as one of the highest 13B models on the leaderboard (~65 average), but it appears to have been removed since.
