by refulgentis
482 days ago
At some point, I have to say to myself: "I do know things." I'm not even sure what the alternative theory would be: no one stepped up to dispute OpenAI's claim that it is, and X.ai is always eager to slap OpenAI around. Let's say Grok is also a pretraining scale experiment. And they're scared to announce they're mogging OpenAI on inference cost because (some assertion X, which we give ourselves the charity of not having to state to make an argument). What's your theory? Steelmanning my guess: the price is high because OpenAI thinks they can drive people to Model A, at 50x the cost of Model B. Hmm... while publicly proclaiming it's not worth it, and even providing benchmarks showing Model B gets the same scores 50x cheaper? That doesn't seem reasonable.
|
It seems this may be an older model that they chose not to release at the time, and are only releasing now because they feel pressure to ship something after recent releases by DeepSeek, Grok, Google and Anthropic. Perhaps they did some post-training to "polish the turd" and give it the better personality that seems to be one of its few improvements.
Hard to say why it's so expensive - because it's big and expensive to serve, or for some marketing/PR reason. It seems that many sources are confirming that the benefits of scaling up pre-training (more data, bigger model) are falling off, so maybe this is what you get when you scale up GPT 4.0 by a factor of 10 - bigger, more expensive, and not significantly better. Cost to serve could also be high because, not intending to release it, they never put in the effort to optimize it.