by crotchfire
848 days ago
Actually it is related. Transformers are just networks that learn to program the weights of other networks [1]. In the successful cases the programmed network has been quite primitive -- merely a key-value store -- in order to ensure that you can backpropagate errors from the programmed network's outputs all the way to the programmer network's inputs. The present work extends this idea to a different kind of programmed network: a convolutional image-processing network. There are many more breakthroughs to be achieved along this line of research -- it is a rich vein to mine. I believe our best shot at getting neural networks to do discrete math and symbolic logic, and to write nontrivial computer programs, will result from this line of research.

[1] https://arxiv.org/abs/2102.11174
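To make the "key-value store" point concrete, here is a minimal sketch of a fast-weight memory in plain Python. It is an illustration, not the method from [1]: the keys, values, and query are hard-coded here, whereas in a real model a "programmer" network would produce them from its input, and the sketch leaves out the feature map and normalization a practical linear-attention layer would use. All function names are made up for this example.

```python
def outer(v, k):
    """Outer product v k^T as a list of rows."""
    return [[vi * kj for kj in k] for vi in v]

def write(W, k, v):
    """Additive fast-weight update: W <- W + v k^T (stores v under key k)."""
    return [[w + d for w, d in zip(row_w, row_d)]
            for row_w, row_d in zip(W, outer(v, k))]

def read(W, q):
    """Query the programmed network: a single matrix-vector product W q."""
    return [sum(w * x for w, x in zip(row, q)) for row in W]

def linear_attention(keys, values, q):
    """Unnormalized linear attention: sum_i (k_i . q) * v_i."""
    out = [0.0] * len(values[0])
    for k, v in zip(keys, values):
        score = sum(a * b for a, b in zip(k, q))
        out = [o + score * vi for o, vi in zip(out, v)]
    return out

# Hypothetical data: two orthonormal keys, each paired with a 2-d value.
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
values = [[2.0, 5.0], [-1.0, 3.0]]

# "Program" the fast weights by writing each key-value pair.
W = [[0.0] * 3 for _ in range(2)]
for k, v in zip(keys, values):
    W = write(W, k, v)

query = [0.0, 1.0, 0.0]      # matches the second key
retrieved = read(W, query)   # recovers the second value
```

With orthonormal keys, reading with a stored key returns exactly the value written under it, and the result matches what unnormalized linear attention computes over the same pairs. Every step is a sum or a product, which is why errors at the read-out can be backpropagated through the writes to whatever network generated the keys and values.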