|
|
|
|
|
by p1esk
843 days ago
|
|
Scale is all we need for transformers (so far). It might also be all we need for ape brains. It’s not all we need for whatever elephant or dolphin brains evolved from. When this stops being the case for transformers, we will need something else. I’m just pointing out it’s not the case yet. |
|
If you really think it is all scale, train a 7T ResNet MLP based model for NLP. If scale is all you need, make a LLM without DPO or RLHF. If scale is all you need, make SD3 with a GAN. Or what about a VAE, Normalizing Flow, HMM? Do it with different optimizers. Do it with gradient free methods. Do it with different loss functions.
The bitter lesson wasn't "scale is all you need." That's just a misinterpretation.
Edit: It's fine to disagree. We can compete on the ideas and methods. That's good for our community. So continue down yours, and I'll continue down mine. All I ask is that since your camp is more popular, you don't block those on my side. If you're right, we'll get to AGI soon. If we're right, we still might. But if we're right and you all block us, we'll get another AI winter in between. If we're right and you all don't block us, we can progress without skipping a beat. Just don't put all your eggs in one basket. It isn't good for anyone.