by nicklecompte
814 days ago
His point is that "evolution by selection" also rewards transformers for being easy to implement with modern linear algebra libraries and cheap to scale on current silicon. Both are engineering details with no direct relationship to their innate efficacy at learning (though indirectly they mean you can scale up the training data to compensate for less efficient learning).
Theoretical efficacy doesn’t guarantee real-world efficacy.
I accept that this is self-reinforcing, but I favor real gains today over potentially larger gains in a potentially achievable future.
I also think we are learning practical lessons at the periphery of every application of AI, and those lessons will still apply if a mold-breaking solution becomes compelling.