Hacker News
by behnamoh 166 days ago
sure, but this stuff is only obvious post hoc. so many people have tried to "justify" the attention mechanism according to their area of expertise, but none of them came up with it first; ML engineers with ML thinking did.