|
|
|
|
|
by yobbo
1101 days ago
|
|
To experiment with SGD and back-propagation with 4096x4096 32-bit matrices, you would need a machine with hundreds of megabytes of ram in the 90s. In terms of software, you would need to be comfortable with C/C++ or maybe Fortran to be able to experiment quickly enough to land on effective hyper parameters. Probably too many low-probability events chained together. But I think they discovered most of the interesting things that small networks can do? For example, TD-Gammon from 1992: https://en.wikipedia.org/wiki/TD-Gammon . |
|