Could a network be trained in software on powerful, expensive hardware and then programmed onto some kind of neural FPGA that uses these memristors, for use in power/space-constrained systems?
Short answer: yes.
Basically there are two options for future ReRAM (memristive) learning systems: in-situ/on-chip learning, in which all learning rules are locally derived and enforced, and ex-situ learning, in which we do what you suggested: import weights from more computationally/power-expensive substrates. There is probably abundant promise in both approaches moving forward.
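To make the ex-situ option concrete, here is a minimal sketch in plain Python of importing pre-trained weights onto a crossbar. It is not taken from either paper cited in this thread. It assumes each signed weight is stored as a differential pair of conductances, that devices can only be programmed to a limited number of evenly spaced states, and that the read-out is ideal (Ohm's law per device, Kirchhoff's current law per column). The values G_MIN, G_MAX and LEVELS, and the function names, are hypothetical and chosen only for illustration.

```python
# Hypothetical device parameters, chosen only for illustration.
G_MIN = 1e-6   # siemens, lowest programmable conductance (assumed)
G_MAX = 1e-4   # siemens, highest programmable conductance (assumed)
LEVELS = 16    # number of distinct programmable states (assumed)


def quantize(g):
    """Snap a conductance to the nearest of LEVELS evenly spaced states."""
    step = (G_MAX - G_MIN) / (LEVELS - 1)
    k = round((g - G_MIN) / step)
    k = max(0, min(LEVELS - 1, k))
    return G_MIN + k * step


def map_weights(weights):
    """Map signed weights to (G_plus, G_minus) pairs, one pair per weight."""
    w_max = max(abs(w) for row in weights for w in row) or 1.0
    scale = (G_MAX - G_MIN) / w_max  # siemens per unit of weight
    g_plus = [[quantize(G_MIN + scale * max(w, 0.0)) for w in row]
              for row in weights]
    g_minus = [[quantize(G_MIN + scale * max(-w, 0.0)) for w in row]
               for row in weights]
    return g_plus, g_minus, scale


def crossbar_forward(voltages, g_plus, g_minus, scale):
    """Differential column currents, rescaled back to weight units."""
    n_out = len(g_plus[0])
    out = []
    for j in range(n_out):
        # Each device contributes I = V * G; a column sums its currents.
        i_j = sum(v * (g_plus[i][j] - g_minus[i][j])
                  for i, v in enumerate(voltages))
        out.append(i_j / scale)
    return out


# Weights "trained elsewhere": rows are inputs, columns are outputs.
weights = [[0.5, -1.0], [0.25, 0.75]]
g_plus, g_minus, scale = map_weights(weights)

x = [1.0, 2.0]
ideal = [sum(x[i] * weights[i][j] for i in range(2)) for j in range(2)]
approx = crossbar_forward(x, g_plus, g_minus, scale)
print("ideal: ", ideal)
print("approx:", approx)
```

The gap between the two printed vectors comes only from the limited number of conductance states. A real array would also have to deal with device variability, drift and wire resistance, none of which is modeled here.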
I recommend looking at some recent papers by the Strukov group [1] as well as my own [2] to see the limitations of these approaches. The Strukov paper skirts around the issue to a certain degree, but they admit in the supplementary material that scaling is not favorable with their approach. Our work takes the 'neural FPGA' approach quite literally. But their approach may, with some improvements, do rather well for an on-chip backprop implementation. Let's see what they do next.
Lastly, as far as hybrid approaches go, there is a really nice recent IBM paper which talks about deep neural net acceleration with ReRAM. If you're really curious let me know and I'll try to dig it up.
[1] http://www.nature.com/nature/journal/v521/n7550/abs/nature14...
[2] http://www.nature.com/articles/srep31932
For a simple multi-layer feed-forward network, couldn't you just use "classical" components, seeing as the parameters of the network would never change (having been trained beforehand)? I.e., would you actually need memristors?
Glad to hear you find it exciting. I do too. It's a really hot field at the moment, and I mean that in the good, not bad, way ;) Lots of groups are working in parallel on somewhat orthogonal design and architecture issues, with a variety of different devices under consideration, but a common basis set is emerging ;)
So, here's the paper I mentioned above. I think it is very methodical and inventive, and definitely one of the best yet at considering the confluence of DNNs and memristive (ReRAM) devices. A quick search revealed this was already on HN.
https://arxiv.org/abs/1603.07341
So, I already mentioned the iconic Strukov paper above, and my own, which is really quite similar to Strukov's in learning strategy/philosophy, except that we used entirely chemical and 'slow' devices, which may be quite interesting for brain emulation. (Remember, the brain operates in the ms, or millisecond, regime and not the nanosecond regime.)
Here's another article I stumbled upon a few days ago, which looks quite promising and brings us into the territory of a more unsupervised/probabilistic algorithm for learning.
http://www.nature.com/articles/ncomms12611