by mumblemumble
1149 days ago
Also, multi-head attention strikes me as about as close as anything I've seen to how language semantics seems to work in human brains. Lots of caveats there, of course. First off, I don't know much about the neurology; I just have an amateur interest in second language acquisition research that sometimes brings me into contact with this sort of thing. On the ANN side, which is closer to my actual wheelhouse, we have no way of knowing whether the mechanism is all that close, and I'm guessing it isn't, since ANNs don't work much like brains. Nor does it need to be. But, intuitively, there's still something promising about an ANN architecture that's vaguely capable of mimicking the behavior of modules in an existing system (human brains) that's well known to be capable of doing the job. I'm not super wild about the bidirectional recurrent layers, either, because they impose some restrictions that clearly aren't great, such as the hard limit on input size, et cetera. But it still strikes me as another big step in a good direction.