by ArthurBrussee 599 days ago
The input to this is two things: images and camera poses. The camera poses tell you where each camera was in 3D space (and some of its properties).
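
To make "camera pose" a bit more concrete, here's a rough sketch of what one might hold and how it maps a 3D point to a pixel. This assumes a plain pinhole model with a world-to-camera rotation and translation; the `CameraPose` class, its field names, and `project` are made up for illustration and aren't taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class CameraPose:
    # Hypothetical layout: world-to-camera rotation as a 3x3 row-major matrix.
    rotation: list
    # World-to-camera translation (x, y, z).
    translation: tuple
    # "Some of its properties": focal lengths and principal point, in pixels.
    fx: float
    fy: float
    cx: float
    cy: float


def project(pose, point):
    """Project a 3D world point to pixel coordinates (pinhole model)."""
    # Move the point into the camera's frame: p_cam = R * p + t
    x, y, z = (
        sum(pose.rotation[r][c] * point[c] for c in range(3)) + pose.translation[r]
        for r in range(3)
    )
    if z <= 0:
        return None  # the point is behind the camera
    # Perspective divide, then shift by the principal point.
    return (pose.fx * x / z + pose.cx, pose.fy * y / z + pose.cy)


# A camera at the origin looking down +z, with assumed intrinsics.
pose = CameraPose(
    rotation=[[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    translation=(0, 0, 0),
    fx=500.0, fy=500.0, cx=320.0, cy=240.0,
)
print(project(pose, (1.0, 0.0, 2.0)))  # (570.0, 240.0)
```

Training is then roughly a matter of adjusting the 3D model until what each camera "sees" through a projection like this matches the corresponding photo.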

The training takes this information and builds a 3D model out of it that visually matches all your photos.

Sadly, COLMAP can still be quite expensive and a hassle: on the order of half an hour, as opposed to seconds. There are modern alternatives like https://lpanaf.github.io/eccv24_glomap/, or even deep-learning-based systems like https://github.com/naver/dust3r

This is definitely still a big blocker to adoption. The goal is to get to a more all-in-one system. The splatting optimization can also help align cameras, if they don't start out entirely random, so any system to quickly provide a good "initial guess" will help here. At least for mobile devices, initialization from ARCore / ARKit poses should be enough.

Keep an eye out :)


If you're capturing on a mobile device, why not use Scaniverse? It's about as all-in-one as it gets - you just scan and it'll generate a .ply after a minute or two of processing. They'll host the splat for you in the cloud if you want too.

For me, at least, I want to own all my data, and not give any away without explicit permission. So, even in the case of Scaniverse, I'm reluctant.

But I'm just an artist trying to read and learn, and haven't gotten around to actually figuring out how to do all this on my MacBook Pro M1 yet ^-^