by refulgentis
1057 days ago
Read this as if I'm smiling and shaking my head. I'm not upset. I call it a quixotic quest because there's little chance of correcting it, given how far it has spread, how few people understand the nuts and bolts, and, by far the biggest factor IMHO, confirmation bias.
You cited geohot as an expert on OpenAI[1]. To cast doubt on Altman having denied it, you fixated on the number of parameters and cited a Verge link to a chart from a random tweet about 100 trillion parameters, noting that it didn't show Sam Altman and that Altman wasn't asked about 100 trillion parameters specifically. And even if he was, what does that have to do with mixture of experts?
My comment flipped from 3 to -2 within 30 minutes of you posting this. "A lie gets halfway around the world before the truth has a chance to get its pants on." - Churchill
[1] Never worked at OpenAI, has no notable domain expertise, and was a Twitter intern in 2022.
|
2022/11/11: A viral tweet claims GPT-4 will have "100 trillion parameters."[1] At this point, there were no rumors about mixture of experts.
2023/01/16: In an interview, Sam Altman mentions he saw the tweet and it was "complete bullshit."[2]
2023/06/20: geohot and the lead of PyTorch, two people who would be expected to have relevant connections, claim that GPT-4 is an 8 x 220B mixture of experts model.[3]
These are two separate, unconnected rumors. One was denied by Sam Altman and was never plausible in the first place. The other was never denied and is highly plausible. You are conflating them by claiming, without any source, that there was "a clear denial from OpenAI's CEO" that "GPT4 is a trillion parameter mixture of experts model."
[1] https://twitter.com/andrewsteinwold/status/15948895625260277...
[2] https://youtu.be/ebjkD1Om4uw?t=313
[3] https://twitter.com/soumithchintala/status/16712671501017210...