Hacker News
by _nalply 1073 days ago
I remember reading somewhere that GPT4 is not a single model but several models whose parameter counts are reported as a single sum. Perhaps it's quite doable, but at lower speeds?
1 comment

The article linked here describes GPT4 as a mixture of experts, which is exactly what you’re describing.