by lucubratory
1155 days ago
I don't know anything about SNNs, so I think I'm who you're asking. Something I'm really interested in is whether there's any possibility of transferring training from normal NNs to the sort of physically embodied things you're discussing. Like how RWKV trains its weights like it's a transformer but then acts on them like it's an RNN, would it be possible to do training with the sort of large deployed NNs that are all the rage right now, but then somehow instantiate those weights in the hardware you're discussing? I'm guessing it's non-viable as-is because of the discrete nature of SNN function, and rounding up or down probably doesn't work, but I would be interested in anything you have to say on it. Also, I read years ago about a project that was similarly instantiating NNs physically, but it was using the optical properties of layered plates to perform the equivalent of weights. Do you know anything about that? I don't think it was discrete (I can't see why it would be, operating on light), but I'd be interested in anything you have to say about that too.
Regarding RWKV, someone actually trained a "SpikeGPT": https://arxiv.org/abs/2302.13939 . That's a neat insight, which will be great for porting these models onto energy-efficient devices. But the learning problem is still the most interesting open question to me. If we crack that, we can scale down GPT-like models by several orders of magnitude, since we can "re-learn" subproblems instead of "hardcoding" a silly number of permutations, like the present models do. Neuromorphic hardware (brains included) lends itself incredibly well to learning. We just don't know how to exploit that yet.
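On the "rounding up or down" worry in the question: one idea that often comes up for moving trained weights onto spiking hardware is rate coding, where a neuron's firing rate over time stands in for a continuous activation. The toy sketch below is only an illustration of that idea and not how SpikeGPT works. The function names, the weights, the threshold, and the step count are all made up for the example. It compares a plain ReLU neuron with an integrate-and-fire neuron that reuses the same weights.

```python
def relu_neuron(weights, inputs):
    """Conventional (non-spiking) neuron: ReLU of the weighted sum."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return max(0.0, total)


def if_neuron_rate(weights, inputs, steps=1000, threshold=1.0):
    """Integrate-and-fire neuron driven by a constant input current.

    Reuses the same weights as the ReLU neuron and returns the
    fraction of time steps on which the neuron emitted a spike.
    """
    current = sum(w * x for w, x in zip(weights, inputs))
    membrane = 0.0
    spikes = 0
    for _ in range(steps):
        membrane += current          # integrate the input
        if membrane >= threshold:    # fire when the threshold is crossed
            spikes += 1
            membrane -= threshold    # "soft reset": keep the remainder
    return spikes / steps


# Hypothetical weights and inputs, chosen so the weighted sum is 0.3.
weights = [0.5, -0.2, 0.4]
inputs = [0.6, 0.5, 0.25]

print(relu_neuron(weights, inputs))
print(if_neuron_rate(weights, inputs))
```

For weighted sums between 0 and the threshold, the spike rate tracks the ReLU output closely, and the gap shrinks as the number of steps grows. Negative sums never fire, which matches ReLU's zero region, while sums above the threshold saturate at one spike per step, so a real conversion would need some rescaling of weights or thresholds.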
Regarding the optical layers, are you referring to optical chips like this one: https://www.nature.com/articles/s41467-020-20719-7 ? That would be an example of using optics to implement your stateful transfer functions (https://en.wikipedia.org/wiki/Optical_neural_network), but there are several other incredibly promising technologies, such as memristors (https://en.wikipedia.org/wiki/Memristor), quantum materials (https://arxiv.org/abs/2204.01832), and even biologically based chips (https://en.wikipedia.org/wiki/Wetware_computer). My take on this is that these technologies exploit different principles of physics to "compute" in some way. But I like to think that our computational theories and principles are independent of the implementation substrates.
There's still a long way to go, but practically speaking, I'm convinced this kind of hardware will have profound consequences for the way that we compute today. We're talking at least 3 orders of magnitude in compute. Imagine ChatGPT running 1000 times as fast. It's ridiculous.