by sp332 806 days ago
Could you recommend one or a few in particular?
1 comment

The current best open-weights model is probably Cohere Command-R+. Its memory requirements are quite high, though.
I really want to see some benchmarks with performance weighted by energy use. I think Mistral 7B would lead on performance per watt by a huge margin. On many zero-shot classification tasks, I get the same performance from Mistral as from bigger models.
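To make "performance weighted by energy use" concrete, here is a rough sketch of one possible metric: correct answers per kilojoule consumed during an evaluation run. The function name, the choice of metric, and every number below are hypothetical placeholders for illustration, not measurements of any real model.

```python
def correct_per_kilojoule(n_correct: int, energy_joules: float) -> float:
    """Energy-weighted score: correct answers per kilojoule of energy used.

    This is one assumed way to weight performance by energy; a real
    benchmark might normalize differently (e.g. per token or per watt-hour).
    """
    if energy_joules <= 0:
        raise ValueError("energy_joules must be positive")
    return n_correct / (energy_joules / 1000.0)


# Made-up inputs: (correct answers out of 1000, joules consumed in the run).
runs = {
    "small-model": (800, 2_000.0),
    "large-model": (850, 20_000.0),
}

scores = {name: correct_per_kilojoule(c, e) for name, (c, e) in runs.items()}

# Rank models from most to least energy-efficient under this metric.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.1f} correct per kJ")
```

With these placeholder inputs, the larger model scores slightly higher on raw accuracy but far lower per unit of energy, which is the kind of gap an energy-weighted benchmark would surface.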