by stfwn 1909 days ago
Conventional methods of rendering 3D objects and spaces rely on specifying geometry and material properties in some format. You then simulate a viewpoint using that info and physics simulations.

A NeRF takes over both the role of the file format and part of the rendering in the form of a neural network. You feed in a world coordinate and a viewpoint and you get an RGB tuple and density out of it. If you interrogate the NeRF enough you can render any traditional 2D or 3D image out of it by combining all the datapoints.
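To make "combining all the datapoints" concrete, here is a minimal sketch of how the samples along one camera ray might be composited into a pixel, using the standard volume-rendering quadrature. `toy_field` is a hypothetical stand-in for a trained network (a hard-coded red ball), and `render_ray`, the sample count, and the near/far bounds are illustrative choices, not anything taken from the paper.

```python
import math

def toy_field(x, y, z, view_dir):
    """Hypothetical stand-in for a trained NeRF: a red ball of radius 1 at the origin.

    Returns (rgb, density). A real NeRF would be a neural network and its
    colour would generally depend on view_dir; this toy ignores it.
    """
    r = math.sqrt(x * x + y * y + z * z)
    density = 5.0 if r < 1.0 else 0.0
    return (1.0, 0.0, 0.0), density

def render_ray(field, origin, direction, near, far, n_samples=64):
    """Composite colour along one ray by querying the field at evenly spaced samples."""
    step = (far - near) / n_samples
    transmittance = 1.0          # fraction of light not yet absorbed
    color = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        t = near + (i + 0.5) * step                      # midpoint of the i-th segment
        point = tuple(o + t * d for o, d in zip(origin, direction))
        rgb, sigma = field(*point, direction)
        alpha = 1.0 - math.exp(-sigma * step)            # opacity of this segment
        weight = transmittance * alpha
        for c in range(3):
            color[c] += weight * rgb[c]
        transmittance *= 1.0 - alpha
    return tuple(color), 1.0 - transmittance             # (pixel colour, total opacity)

# One ray through the ball, one that misses it.
hit_color, hit_opacity = render_ray(toy_field, (0.0, 0.0, -3.0), (0.0, 0.0, 1.0), 0.0, 6.0)
miss_color, miss_opacity = render_ray(toy_field, (2.0, 0.0, -3.0), (0.0, 0.0, 1.0), 0.0, 6.0)
```

Every pixel costs `n_samples` queries of the field, which for a real NeRF means that many network evaluations per pixel.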

One theoretical benefit is that a NeRF is a continuous function, so the resolution is only limited by the capacity of the neural network. Another cool thing is that a NeRF is trained on pictures (with info about where they were taken from), so if you train a NeRF successfully in high-res it’s like scanning an object. A major practical challenge is that it is (was?) pretty frickin’ slow to work with. I wrote a more elaborate comment about it on the previous NeRF improvement post [1]. There I closed with:

> It would be amazing to have NeRF-based graphics engines that can make up spaces out of layers of NeRFs, all probed in real-time.

Here they’ve taken a major step in that direction by speeding up the rendering 3000X.

[1]: https://news.ycombinator.com/item?id=25300283


This technique isn't actually speeding up the NeRF rendering algorithm.

It bakes the NeRF back to a semi-discrete representation (Octree of Spherical Harmonics voxels) which can render near-identical results at interactive speeds.

The baked data is much larger than the original NeRF model (2 GB vs 5 MB), but it can be downsampled to 30-100 MB with little loss in quality.
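As a rough illustration of what a "spherical harmonics voxel" stores, here is a sketch of looking up a view-dependent colour from per-channel SH coefficients. It only goes up to degree 1 (4 coefficients per channel), and both the basis sign convention and the sigmoid squashing are assumptions made for the example; `sh_basis_deg1` and `voxel_color` are hypothetical names, not the paper's API.

```python
import math

SH_C0 = 0.28209479177387814   # 1 / (2 * sqrt(pi)), the constant degree-0 term
SH_C1 = 0.4886025119029199    # sqrt(3 / (4 * pi)), scale of the degree-1 terms

def sh_basis_deg1(d):
    """Real spherical harmonics up to degree 1 for a unit direction d = (x, y, z).

    Sign conventions differ between codebases; this one is an assumption.
    """
    x, y, z = d
    return [SH_C0, SH_C1 * y, SH_C1 * z, SH_C1 * x]

def voxel_color(coeffs, view_dir):
    """Evaluate a voxel's colour for one viewing direction.

    coeffs: three lists (R, G, B) of 4 SH coefficients each.
    """
    norm = math.sqrt(sum(c * c for c in view_dir))
    d = tuple(c / norm for c in view_dir)
    basis = sh_basis_deg1(d)
    rgb = []
    for channel in coeffs:
        raw = sum(k * b for k, b in zip(channel, basis))
        rgb.append(1.0 / (1.0 + math.exp(-raw)))   # squash to (0, 1); assumed here
    return tuple(rgb)

# A voxel whose red channel gets brighter when viewed along +z.
coeffs = [
    [0.0, 0.0, 2.0, 0.0],   # R: depends on the z component of the view direction
    [0.0, 0.0, 0.0, 0.0],   # G: constant
    [0.0, 0.0, 0.0, 0.0],   # B: constant
]
from_front = voxel_color(coeffs, (0.0, 0.0, 1.0))
from_back = voxel_color(coeffs, (0.0, 0.0, -1.0))
```

The lookup is a dot product per channel instead of a network evaluation, which is where the speed comes from, and storing coefficients for every voxel is why the baked data gets so much bigger than the network weights.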

So if I understand right, for the real-time version, rather than querying the NeRF to compute the frame pixels on the fly, they instead use the NeRF to pre-generate 3D voxel data representing the scene, which can then be rendered in real time using more traditional voxel rendering?

Yes and no.

This preserves the exact lighting equation that the NeRF learned, while traditional voxel rendering is limited to traditional lighting equations.

You would have a hard time voxelizing a NeRF, because you can't extract a traditional lighting equation out of it.

I think this is sensitive to Hinton's work on capsules, which I believe is a more reprojectible primitive. Maybe you can coax a voxel
The point about spherical harmonics hits home. You could sample the different harmonics with a probabilistic scattering to construct a probability distribution for a signed distance function render, and use half the very solid render pipeline