by CuriousCosmic 672 days ago
> This all is well and good when you are just using for a pretty visualization, but it appears gaussians have the same weakness as point clouds processed with structure from motion, in that you need lots of camera angles to get quality surface reconstruction accuracy.

The paper actually suggests the opposite: Gaussian splats outperform point clouds and other methods when given the same amount of data. And not just by a little bit, but ridiculously so.

Their Gaussian-splatting-based SLAM variants with RGB-D and RGB (no depth) camera input both outperform essentially everything else and are SOTA (state-of-the-art) for the field. RGB-D obviously outperforms RGB, but RGB-only input with Gaussian splatting still performs comparably to or beats the competition, even when the competition is using depth data.

And not just that: their metrics outperform everything else except systems operating on literal ground truth data, and even then they come within a few percent of those ground-truth models.

And importantly, where most other models run at ~0.2-3 fps, this model runs hundreds to thousands of times faster, at an average of 769 fps. While higher fps doesn't mean much past a certain point, this means you can do SLAM on much weaker hardware while still guaranteeing a WCET (worst-case execution time) below the frame time.
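
For a rough sense of that headroom, here's a back-of-the-envelope sketch. The 769 fps and 0.2-3 fps figures are the ones quoted above; the 30 Hz camera rate is purely my assumption for illustration, and the helper names are made up. Keep in mind that an average fps is not a worst-case bound by itself, so this only shows how much slack there is on average.

```python
import math

def frame_time_ms(fps: float) -> float:
    """Average time per frame in milliseconds at a given throughput."""
    return 1000.0 / fps

def orders_of_magnitude(fast_fps: float, slow_fps: float) -> float:
    """Speedup of fast_fps over slow_fps, expressed as a base-10 exponent."""
    return math.log10(fast_fps / slow_fps)

REPORTED_FPS = 769.0        # average quoted in the comment
BASELINE_FPS = (0.2, 3.0)   # range quoted for most other systems
SENSOR_FPS = 30.0           # ASSUMPTION: a 30 Hz camera, not from the paper

budget_ms = frame_time_ms(SENSOR_FPS)   # time available per camera frame
avg_ms = frame_time_ms(REPORTED_FPS)    # average time the system needs per frame
headroom = budget_ms / avg_ms           # how many times slower hardware could be, on average

speedups = [orders_of_magnitude(REPORTED_FPS, b) for b in BASELINE_FPS]

print(f"budget per frame: {budget_ms:.2f} ms")
print(f"average cost per frame: {avg_ms:.2f} ms")
print(f"average headroom: {headroom:.1f}x")
print(f"speedup vs. baselines: 10^{speedups[1]:.1f} to 10^{speedups[0]:.1f}")
```

Under that 30 Hz assumption the average cost per frame is a small fraction of the budget, which is the slack you would spend on weaker hardware or on absorbing worst-case spikes.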

So this actually is a massive advancement in the SOTA, since Gaussians let you very quickly and cheaply approximate a lot of information in a form you can efficiently compare and refine against the current sensor inputs.

1 comment

I will believe this when I can actually measure scenes from Gaussians accurately (I have tried multiple papers' worth of experiments with dismal results). No one in the reality capture industry uses splats for anything other than visualization of water- and sky-heavy scenes, because this is where a Gaussian splat actually renders nicely. I look forward to the advancements that NeRF and GS will bring, but for now there is no foundational reason why they can extrapolate any more data than COLMAP or GLOMAP when the input data is the major factor in defining scene details.