by gammalost 824 days ago
What do you mean by "modern NN" that differs from the basic FNN?

How do you train a modern NN if not through backpropagation?

1 comment

FNNs are easy to handle for sequential data because they model no relationship between the elements. Hence, these NNs are a frequent target of simplified analyses. However, they also never led to the exciting models that we now call AI.
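To make "no relationship between the elements" concrete, here is a rough sketch under one reading of that claim: the same small feed-forward net is applied to every element of a sequence on its own. The sizes, weights and the `ffn` helper are made up for illustration, not taken from any particular paper or model.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_hidden, seq_len = 4, 8, 5  # hypothetical sizes

# Hypothetical weights for a two-layer feed-forward net.
W1 = rng.normal(size=(d_model, d_hidden))
W2 = rng.normal(size=(d_hidden, d_model))


def ffn(x):
    """Apply the same two-layer ReLU net to each row (sequence element) of x."""
    return np.maximum(x @ W1, 0.0) @ W2


x = rng.normal(size=(seq_len, d_model))
y = ffn(x)

# Perturb only the first element of the sequence.
x_perturbed = x.copy()
x_perturbed[0] += 1.0
y_perturbed = ffn(x_perturbed)

# Every other position is processed independently, so its output is unaffected.
others_unchanged = np.allclose(y[1:], y_perturbed[1:])
print("outputs at other positions unchanged:", others_unchanged)
```

Nothing at position 0 can influence position 3 here, which is what makes this setup tractable to analyze and also what limits it on sequences.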

Instead, real models are an eclectic mix of attention or other sequence mixers, gates, FFNs, norms and positional tomfoolery.
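For a feel of what that mix looks like in code, here is a toy single-head block in NumPy. Everything in it is an assumption chosen for illustration: the causal mask, the pre-norm residual layout, the sigmoid-gated FFN, the sinusoidal positions, the sizes and the random weights. Real models differ in all of these details.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 6, 8  # hypothetical sizes


def layer_norm(x, eps=1e-5):
    """Normalize each row to zero mean and unit variance (no learned scale)."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def sinusoidal_positions(n, dim):
    """Fixed sin/cos position encoding; one of many positional schemes."""
    pos = np.arange(n)[:, None]
    i = np.arange(dim // 2)[None, :]
    angles = pos / (10000 ** (2 * i / dim))
    pe = np.zeros((n, dim))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe


# Hypothetical random weights; a trained model would learn these.
Wq, Wk, Wv, Wo = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(4))
W_gate, W_up = (rng.normal(scale=d ** -0.5, size=(d, 4 * d)) for _ in range(2))
W_down = rng.normal(scale=(4 * d) ** -0.5, size=(4 * d, d))


def attention(x):
    """Single-head causal self-attention: the part that mixes positions."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(d)
    future = np.triu(np.ones((len(x), len(x)), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)  # hide later positions
    return softmax(scores) @ v @ Wo


def gated_ffn(x):
    """Per-position FFN with a sigmoid gate on the hidden layer."""
    gate = 1.0 / (1.0 + np.exp(-(x @ W_gate)))
    return (gate * (x @ W_up)) @ W_down


def block(x):
    """Pre-norm residual block: norm, attention, norm, gated FFN."""
    x = x + attention(layer_norm(x))
    x = x + gated_ffn(layer_norm(x))
    return x


tokens = rng.normal(size=(seq_len, d))
x = tokens + sinusoidal_positions(seq_len, d)
out = block(x)

# With the causal mask, changing the last token leaves earlier outputs alone.
x_changed = x.copy()
x_changed[-1, 0] += 1.0
earlier_unchanged = np.allclose(out[:-1], block(x_changed)[:-1])
print(out.shape, earlier_unchanged)
```

Unlike a plain per-element FNN, the attention step lets each position read from the ones before it, and that interaction is exactly what an FNN-only analysis has no way to capture.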

In other words, everything that makes AI models great is what these analyses usually skip, all while wildly claiming generalized insights about how AI really works.

There’s a dozen papers like that every few months.