|
|
|
|
|
by hedgehog
64 days ago
|
|
Structurally a transformer model is so unrelated to the shape of the brain there's no reason to think they'd have many similarities. It's also pretty well established that the brain doesn't do anything resembling wholesale SGD (which to spell it is evidence that it doesn't learn in the same way). |
|
Substrate dissimilarities will mask computational similarities. Attention surfaces affinities between nearby tokens; dendrites strengthen and weaken connections to surrounding neurons according to correlations in firing rates. Not all that dissimilar.