Hacker News
by musebox35 199 days ago
If you dig through older ML/vision papers, you will see that, formulation-wise, they actually did, but they lacked the data, the compute, and the mechanistic machinery provided by the transformer architecture. The wheels of progress turn slowly and require many rotations to finally get somewhere.