Hacker News
by brandall10 988 days ago
I believe it's actually 8 x 220B. Just want to make it clear it's not simply a MoE GPT-3.5.