by zwaps 822 days ago
FFNs are easy to analyze on sequential data because they treat each element independently, so there is no relationship between the elements to account for. Hence, these networks are a frequent target of simplified analyses. However, on their own they also never led to the exciting models we now call AI.

Instead, real models are an eclectic mix of attention or other sequential mixers, gates, FFNs, norms and positional tomfoolery.
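
To make the difference concrete, here is a toy sketch in plain Python. The weights, inputs and function names are all made up for illustration, and the attention here has no learned projections, so take it as a sketch of the idea rather than how any real model is built. It perturbs the first token and checks which output positions move.

```python
import math

def ffn(token, w1, w2):
    """Two-layer ReLU feed-forward net applied to a single token vector."""
    hidden = [max(0.0, sum(w * x for w, x in zip(row, token))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def positionwise_ffn(seq, w1, w2):
    """Apply the same FFN to every position independently (no mixing)."""
    return [ffn(tok, w1, w2) for tok in seq]

def self_attention(seq):
    """Unparameterized dot-product self-attention: every output mixes all positions."""
    out = []
    for q in seq:
        scores = [sum(a * b for a, b in zip(q, k)) for k in seq]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[i] for w, v in zip(weights, seq))
                    for i in range(len(q))])
    return out

def changed_positions(a, b):
    """Indices where the two output sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Hypothetical toy weights and a 3-token sequence of 2-d vectors.
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.5]]
W2 = [[1.0, 0.0, -1.0], [0.5, 1.0, 0.0]]
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
perturbed = [[2.0, -1.0]] + seq[1:]  # change only the first token

ffn_changed = changed_positions(positionwise_ffn(seq, W1, W2),
                                positionwise_ffn(perturbed, W1, W2))
attn_changed = changed_positions(self_attention(seq), self_attention(perturbed))

print(ffn_changed)   # [0]
print(attn_changed)  # [0, 1, 2]
```

With the FFN only position 0 changes, while with attention every position changes. That cross-position coupling is the part a per-element analysis never sees.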

In other words, everything that makes AI models great is what these analyses usually skip, all while wildly claiming generalized insights about how AI really works.

There are a dozen papers like that every few months.