Hacker News
by SparkyMcUnicorn 1024 days ago
The community has fine-tuned some really good llama models (much better than llama-chat), but I get what you're saying.

I've been testing the best-performing models on the Hugging Face leaderboard lately. Some of them are really impressive, and others are so bad that I second-guess the prompt format, or whether the benchmarked model is actually the same one I'm testing.

1 comment

Which models were really bad?
I was keeping track of the good ones, and don't have many notes on the bad ones.

I do remember testing "LoKuS" last week and it was quite terrible (it sometimes gave completely off-topic answers). It scored as one of the highest 13B models on the leaderboard (~65 average), but it appears to have been removed since.