Hacker News new | ask | show | jobs
by tejask 4113 days ago
there are many ways to parametrize the decoder. One of the ways is to constrain it to output an explicit mesh or volumetric representation and express the rendering pipeline so that it's differentiable. The encoder will then effectively learn an "inference algorithm" to get the best output. A feedforward neural network is not enough and recurrent computations will eventually be necessary.
1 comments

Can you explain a bit more why the recurrent network structure becomes necessary at some point? Is that because reversing a CNN naturally means rendering by (de)convolution?
In order to approximately learn a "real" graphics engine with support for basic physics, just feed-forward computation might not be sufficient. A more natural way to learn graphics/physics might be to learn the temporal structure more explicitly. On the other hand, it might also be interesting to just add temporal convolution-deconvolution structure in the existing model. This is work in progress though.