by smonn_
1149 days ago
There are plenty of interesting neural network designs out there, but they're being overshadowed by transformers due to their recent success. I personally think the main reason transformers work so well is that they actually step away from the multilayer perceptron stuff and introduce some structure and, in a way, sparsity.
|
Lots of caveats there, of course. First off, I don't know much about the neurology; I just have an amateur interest in second language acquisition research that sometimes brings me into contact with this sort of thing. On the ANN side, which is closer to my actual wheelhouse, we have no way of knowing whether the underlying mechanism is all that close, and I'm guessing it isn't, since ANNs don't work much like brains. Nor does it need to be. Intuitively, though, there's still something promising about an ANN architecture that's vaguely capable of mimicking the behavior of modules in an existing system (human brains) that's well known to be capable of doing the job. I'm not super wild about the bidirectional recurrent layers, either, because they impose some restrictions that clearly aren't great, such as the hard limit on input size, et cetera. But it still strikes me as another big step in a good direction.