Hacker News
by xiaoyu2006 4 days ago
This open-source model is close to SOTA with only a 700B-total/40B-active MoE. Truly efficient.