by yobbo 1101 days ago
To experiment with SGD and back-propagation on 4096x4096 32-bit matrices (64 MiB each), you would have needed a machine with hundreds of megabytes of RAM in the 90s. On the software side, you would have needed to be comfortable with C/C++ or maybe Fortran to experiment quickly enough to land on effective hyperparameters.
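
Rough arithmetic behind the "hundreds of megabytes" figure, as a back-of-the-envelope sketch. The helper names, the two-layer example, and the assumption that each layer holds one weight matrix plus one gradient matrix are mine, not a description of any particular 90s setup; activations and optimizer state would add more on top.

```python
MIB = 1024 * 1024

def matrix_bytes(rows, cols, bytes_per_element=4):
    """Size in bytes of a dense rows x cols matrix (4 bytes = one 32-bit float)."""
    return rows * cols * bytes_per_element

def training_mib(layers, copies_per_layer=2, rows=4096, cols=4096):
    # Assumption: each layer keeps one weight matrix and one gradient matrix.
    return layers * copies_per_layer * matrix_bytes(rows, cols) / MIB

print(matrix_bytes(4096, 4096) / MIB)  # 64.0 MiB for a single matrix
print(training_mib(layers=2))          # 256.0 MiB for two such layers
```

So even a two-layer network of that size would already sit in the hundreds of megabytes.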

Probably too many low-probability events chained together.

But I think researchers at the time discovered most of the interesting things that small networks can do? For example, TD-Gammon from 1992: https://en.wikipedia.org/wiki/TD-Gammon