by Mathnerd314 795 days ago
No, the 8x22 (~140B) is more powerful than the 70B; it had higher eval scores according to the blog. But the 70B was based on Llama, whereas the 8x22 and 7B were Mixtral/Mistral-based.