by w-m
817 days ago
Great project, it's nice to see so many people building stuff with Gaussian Splatting. Just the other day I went through the whole list of known viewers on MrNeRF's awesome 3DGS resources to find one that runs on a MacBook. I'm working on compressing 3D scenes by sorting Gaussians into 2D grids [0], and I wanted a native viewer that I could use for experiments on the go, perhaps as an alternative backend to the CUDA one in my colleague's exploratory Python viewer [1]. VulkanSplatting was the only one I could get to compile and run on my Intel MacBook. Unfortunately the feeble Intel GPU isn't able to display even the Lego scene at an interactive framerate. Do you think there's performance headroom, and that it will become possible in the future, or should I give up trying to run this on an Intel MBP?
[0]: https://fraunhoferhhi.github.io/Self-Organizing-Gaussians/
[1]: https://github.com/Florian-Barthel/gaussian_viewer
|
In my Apple Silicon benchmarks, the main bottleneck is the parallel radix sort that sorts the Gaussians by tile and depth. I used some shaders from a sorting library, but it lags behind SOTA parallel sort algorithms in performance. I think fixing this would give a 1.5x overall performance boost and maybe 3x on MacBooks. Also, the wave size isn't tuned for different GPUs.
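To make the sort step concrete, here is a minimal CPU sketch in Python of the usual approach: pack the tile index into the high bits and the depth bits into the low bits of a 64-bit key, then run an LSD radix sort over the keys. The names `make_key` and `radix_sort` and the 8-bit digit width are assumptions for illustration, not taken from VulkanSplatting or the sorting library it uses.

```python
import struct


def make_key(tile_id: int, depth: float) -> int:
    """Pack tile index (high 32 bits) and depth (low 32 bits) into one key."""
    # For non-negative float32 values, the IEEE 754 bit pattern orders the
    # same way as the float itself, so it can be compared as an unsigned int.
    depth_bits = struct.unpack("<I", struct.pack("<f", depth))[0]
    return (tile_id << 32) | depth_bits


def radix_sort(pairs, key_bits=64, digit_bits=8):
    """Stable LSD radix sort of (key, value) pairs, one digit per pass."""
    mask = (1 << digit_bits) - 1
    for shift in range(0, key_bits, digit_bits):
        buckets = [[] for _ in range(1 << digit_bits)]
        for key, value in pairs:
            buckets[(key >> shift) & mask].append((key, value))
        # Concatenating buckets in order keeps the sort stable.
        pairs = [p for bucket in buckets for p in bucket]
    return pairs


# (tile_id, depth, gaussian_index) for a few made-up Gaussian/tile overlaps
instances = [(1, 2.5, 0), (0, 7.0, 1), (1, 0.5, 2), (0, 1.25, 3)]
pairs = [(make_key(t, d), g) for t, d, g in instances]
sorted_ids = [g for _, g in radix_sort(pairs)]
print(sorted_ids)  # [3, 1, 2, 0]: tile 0 front to back, then tile 1
```

A GPU version replaces the bucket lists with per-workgroup histograms and prefix sums for each digit pass, which is where the wave size and the choice of sort algorithm start to matter.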
Another area of improvement is better management of the shared memory. Right now, we just let the driver manage it as the L1 cache. However, we could manage it manually and group Gaussian retrievals together for the same tile. This is what the official implementation does.
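As a rough illustration of why manual batching can help, the sketch below counts global-memory reads for a single tile under the two access patterns. It is a toy CPU model, not GPU code: the 16x16 tile with one thread per pixel, the batch size, the Gaussian count, and the function names are all assumptions for illustration.

```python
TILE_PIXELS = 16 * 16  # one thread per pixel (assumed tile size)


def reads_per_thread_fetch(num_gaussians: int) -> int:
    """Every thread reads every Gaussian of the tile from global memory."""
    return TILE_PIXELS * num_gaussians


def reads_batched_fetch(num_gaussians: int, batch: int = TILE_PIXELS) -> int:
    """Threads cooperatively load one batch at a time into shared memory
    (one Gaussian per thread), then all threads blend from the shared copy."""
    global_reads = 0
    for start in range(0, num_gaussians, batch):
        # Each Gaussian in the batch is fetched from global memory once.
        global_reads += min(batch, num_gaussians - start)
    return global_reads


n = 1000  # Gaussians overlapping the tile (made-up number)
print(reads_per_thread_fetch(n), reads_batched_fetch(n))  # 256000 1000
```

In practice the L1 cache absorbs many of the repeated reads in the first pattern, so the real gain is far smaller than the raw ratio suggests. The point is only that explicit batching makes the reuse guaranteed rather than left to the cache.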
Although 3DGS is the first radiance field with SOTA quality that runs in real-time, I think it's still quite heavy. Due to the explicit representation of the scene, a lot of operations are memory bound. If you can't get an interactive frame rate right now, it's unlikely the improvements will make a material difference.
Hopefully that's where your work on compression comes in and solves the problem :)