by devit 850 days ago
I think the conjecture is that GPT-4 is a mixture of experts in which each expert has 220B parameters, for a total of 1.76T parameters (which works out to 8 experts).
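For anyone who wants to check the arithmetic, here is a quick sketch. It assumes the total is simply the number of experts times the per-expert size, with no parameters shared between experts; that is my simplification, not something the conjecture spells out. Both input figures are the rumored ones from the comment, and the variable names are just for illustration.

```python
# Rumored figures from the comment; neither is a confirmed number.
PARAMS_PER_EXPERT = 220 * 10**9    # 220B parameters per expert
TOTAL_PARAMS = 1760 * 10**9        # 1.76T parameters in total

# Assumption: total = experts * params_per_expert (nothing shared across experts).
implied_experts, remainder = divmod(TOTAL_PARAMS, PARAMS_PER_EXPERT)

print(f"implied experts: {implied_experts}, remainder: {remainder}")
```

If shared layers (for example attention) were counted once rather than per expert, the same total would imply a different split, so treat the expert count as a consequence of that assumption.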