| My understanding of this paper (please comment/correct based on your understanding):

* Instead of using a single latent feature per implicit surface, the authors propose using a "volume" of latent features per surface. This allows the NNs to better capture geometric detail while remaining relatively shallow. The result is a neural SDF that is more accurate and faster to compute. Contrary to the claim of another comment, the neural SDF alone is not the interesting part of this paper: the prior work section points to at least three other papers that have explored the idea of representing an SDF with a neural net: Park et al.'s DeepSDF (https://arxiv.org/pdf/1901.05103.pdf), Mescheder et al.'s Occupancy Networks (http://www.cvlibs.net/publications/Mescheder2019CVPR.pdf), and Chen et al.'s Learning Implicit Fields (https://arxiv.org/pdf/1812.02822.pdf). All very interesting papers.

* When I say a "volume" of latent features, I specifically mean a voxel grid where the corners of each voxel hold latent features, and any position X has a corresponding feature Z, which is simply the trilinear interpolation of the features at the corners of the voxel containing X. As the authors mention, they try to keep this sparse by leaving any voxel that does not contain the surface "empty". The authors use an octree to create L different feature volumes. As L becomes larger, the resolution of the feature volume increases, which means that more fine-grained details can be encoded as features.

Finally, the authors describe a rendering procedure that makes use of their LOD model (still need to read this part more thoroughly).

Some additional thoughts:

Why are SDFs useful at all? One comment suggests this is a form of "compression", but meshes have a far smaller memory footprint and are computationally less expensive to render.
Ray tracing meshes is extremely fast, largely because it is a primitive operation in graphics and so much time and energy has been invested in making it faster with acceleration structures such as BVHs.

So are SDFs actually useful? Yes. Triangle or polygon meshes are great when you have them, but they are terribly challenging to work with for reconstruction tasks. For instance, you effectively have to pause occasionally during reconstruction to fix up your mesh so that it isn't complete garbage (triangles with small angles, self-intersections, extremely lopsided side lengths, etc.). SDFs support arbitrary topology painlessly, which is why they show up so much in reconstruction/computer vision.

So why do we need neural nets to represent them? I think the primary reason to use a neural network to represent a signed distance function is that it's a more efficient representation than storing the SDF in some sort of grid structure (maybe someone else has more thoughts on this?). As a side benefit, it can simplify any sort of differentiable rendering, since the surface is already represented in a manner that is naturally differentiable via back-propagation. |
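To make the "volume of latent features" idea concrete, here is a minimal sketch of the trilinear lookup described in the comment above, for a single voxel. It is not the authors' implementation: the function name `trilinear_feature`, the dictionary layout of the corner features, and the toy 2-D features are all assumptions for illustration, and the small network that would decode the blended feature into a distance is left out.

```python
from itertools import product


def trilinear_feature(corner_features, x, y, z):
    """Blend the 8 corner feature vectors of one unit voxel.

    corner_features maps (i, j, k), with i, j, k in {0, 1}, to a feature
    vector (list of floats). (x, y, z) are local coordinates in [0, 1].
    This layout is a hypothetical choice made for the sketch.
    """
    dim = len(corner_features[(0, 0, 0)])
    blended = [0.0] * dim
    for i, j, k in product((0, 1), repeat=3):
        # A corner's weight is the product of the per-axis linear weights.
        w = (x if i else 1.0 - x) * (y if j else 1.0 - y) * (z if k else 1.0 - z)
        for d in range(dim):
            blended[d] += w * corner_features[(i, j, k)][d]
    return blended


# Toy voxel: each corner stores the 2-D feature [i + j + k, 1.0].
corners = {
    (i, j, k): [float(i + j + k), 1.0]
    for i, j, k in product((0, 1), repeat=3)
}

center = trilinear_feature(corners, 0.5, 0.5, 0.5)   # equal blend of all corners
at_corner = trilinear_feature(corners, 1.0, 0.0, 0.0)  # reproduces corner (1, 0, 0)
```

At a corner the lookup returns that corner's feature exactly, and the eight weights always sum to one, so the blended feature varies continuously as the query point moves through the voxel.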
Meshes and Bézier patches are boundary representations that carry no information about the volume they enclose. Imagine you'd like to cut an object out of smoke or clouds. SDFs enable you to cut out arbitrary volumes from any material. This is more realistic than skinning a mesh object with textures, especially with translucent objects.
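As a rough sketch of the "cut an object out of smoke" idea: the sign of the SDF tells you whether a point is inside the shape, so it can mask any volumetric field. The names `smoke_density` and `carve` are made up for this example, and the density function is just a stand-in for a real volumetric material.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, positive outside."""
    return math.dist(p, center) - radius


def smoke_density(p):
    """Hypothetical stand-in for a volumetric material such as smoke."""
    x, y, z = p
    return 0.5 + 0.5 * math.sin(3.0 * x) * math.cos(3.0 * y) * math.cos(3.0 * z)


def carve(density_fn, sdf_fn):
    """Keep the material only where the SDF reports 'inside' (distance <= 0)."""
    def carved(p):
        return density_fn(p) if sdf_fn(p) <= 0.0 else 0.0
    return carved


smoke_ball = carve(smoke_density, sphere_sdf)
inside_value = smoke_ball((0.0, 0.0, 0.0))   # inside the sphere: material kept
outside_value = smoke_ball((2.0, 0.0, 0.0))  # outside the sphere: material removed
```

Any SDF can be passed in place of `sphere_sdf`, which is the point: the shape and the material it is cut from stay independent.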
> So why do we need neural nets to represent them?
You don't. I didn't notice any render times, but "interactive frame rates" suggests they would be lacking.
Most SDF primitives and their compositions are not analytic (because they use abs, fract, etc.). Differentiation by finite differences is most common; few people use automatic differentiation. I don't quite follow how backprop would produce the surface gradient, but I doubt it would be faster than these methods.
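For reference, the finite-difference approach mentioned above can be sketched as follows, using a box SDF built from abs and max as the test shape. The helper names `fd_gradient` and `normalize` and the step size `eps` are choices made for this example, not taken from any particular renderer.

```python
import math


def box_sdf(p, half=(1.0, 1.0, 1.0)):
    """Exact SDF of an axis-aligned box centred at the origin (uses abs and max)."""
    q = [abs(c) - h for c, h in zip(p, half)]
    outside = math.sqrt(sum(max(c, 0.0) ** 2 for c in q))
    inside = min(max(q), 0.0)
    return outside + inside


def fd_gradient(sdf, p, eps=1e-4):
    """Central-difference estimate of the SDF gradient at p (6 SDF evaluations)."""
    grad = []
    for axis in range(3):
        hi = list(p)
        lo = list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append((sdf(hi) - sdf(lo)) / (2.0 * eps))
    return grad


def normalize(v):
    """Scale v to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


# Point on the +x face of the box; the normal should point along +x.
normal = normalize(fd_gradient(box_sdf, (1.0, 0.2, -0.3)))
```

Each gradient costs six extra SDF evaluations, which is the baseline any automatic-differentiation approach would have to beat.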