Hacker News
by modeless 725 days ago
I think collision detection is solvable. And the scanning process should be no harder than 3D modeling to the same quality level. Probably much easier, honestly. Modeling is labor intensive. I'm not sure why you say "there’s no scanner available that provides both good 3-D information and good photo realistic textures" because these new techniques don't use "scanners", all you need is regular cameras. The 3D information is inferred.

Lighting is the big issue, IMO. As soon as you want any kind of interactivity besides moving the camera you need dynamic lighting. The problem is you're going to have to mix the captured absolutely perfect real-world lighting with extremely approximate real-time computed lighting (which will be much worse than offline-rendered path tracing, which still wouldn't match real-world quality). It's going to look awful. At least, until someone figures out a revolutionary neural relighting system. We are pretty far from that today.

Scale is another issue. Two issues, really: rendering and storage. There's already a lot of research into scaling up rendering to large and detailed scenes, but I wouldn't say it's solved yet. And once you have rendering, storage will be the next issue. These scans will be massive and we'll need some very effective compression to be able to distribute large scenes to users.


You are correct; most of these new techniques are using a camera. In my line of work I consider a camera sensor a scanner of sorts, as we do a lot of photogrammetry and “scan” with a 45MP full frame. The inferred 3D from cameras is pretty bad when it comes to accuracy, especially from dimly lit areas or where you dip into a closet or closed space that doesn’t have a good structural tie back to the main space you are trying to recreate in 3D. Laser scanners are far preferable to tie your photo pose estimation to, and most serious reality capture for video games is done with both a camera and a $40,000+ LiDAR scanner. Have you ever tried to scan every corner of a house with only a traditional DSLR or point-and-shoot camera? I have, and the results are pretty bad from a 3D standpoint without a ton of post-processing.

The collision detection problem is heavily related to having clean 3D, as mentioned above. My company is doing development on computing collision on reality capture right now in a clean way and I would be interested in any thoughts you have. We are chunking collision on the dataset at a fixed distance from the player character (can’t go too fast in a vehicle or it will outpace the collision and fall through the floor) and have a tunable LOD that influences collision resolution.
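
To make the chunking idea concrete, here is a minimal sketch of what "collision at a fixed distance from the player" could look like on a flat grid. This is not our actual implementation; the function names, the square-chunk layout, the "radius divided by build time" speed bound, and the "cell size doubles per LOD level" rule are all assumptions made up for illustration.

```python
import math


def chunks_in_range(player_xz, radius, chunk_size):
    """Return the (i, j) indices of square chunks that overlap a circle
    of `radius` around the player (hypothetical grid layout)."""
    px, pz = player_xz
    i_min = math.floor((px - radius) / chunk_size)
    i_max = math.floor((px + radius) / chunk_size)
    j_min = math.floor((pz - radius) / chunk_size)
    j_max = math.floor((pz + radius) / chunk_size)
    active = set()
    for i in range(i_min, i_max + 1):
        for j in range(j_min, j_max + 1):
            # Closest point of this chunk's square to the player.
            cx = min(max(px, i * chunk_size), (i + 1) * chunk_size)
            cz = min(max(pz, j * chunk_size), (j + 1) * chunk_size)
            if (cx - px) ** 2 + (cz - pz) ** 2 <= radius ** 2:
                active.add((i, j))
    return active


def max_safe_speed(radius, build_seconds):
    """Assumed bound: a chunk starts building when it enters the radius,
    so the player must take at least `build_seconds` to cover `radius`."""
    return radius / build_seconds


def collision_cell_size(base_cell, lod):
    """Assumed LOD rule: each LOD level doubles the collision cell size,
    trading collision resolution for cheaper chunks."""
    return base_cell * (2 ** lod)


# Usage: chunks to keep collision for, and the speed cap that goes with them.
active = chunks_in_range((0.5, 0.5), radius=0.6, chunk_size=1.0)
speed_cap = max_safe_speed(radius=50.0, build_seconds=0.5)
```

The speed cap is the "can’t go too fast in a vehicle" constraint written down: a larger radius or a faster chunk build raises it, and a coarser LOD is one way to get the faster build.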

Have you looked into SkyeBrowse for video to 3D? Seems like it’s able to generate interior 3D textures pretty quickly.
My iPhone and my Apple Vision Pro both have lidar scanners, fwiw.

Frankly I’m surprised that I can’t easily make crude 3D models of spaces with a simple app presently. It seems well within the capabilities of the hardware and software.

Those LiDAR sensors on phones and VR headsets are low resolution and mainly used to improve the photos and depth information from the camera. That's a different objective than mapping a space, which is mainly being disrupted by improvements from the self-driving car and ADAS industries.
Magic Room for the AVP does a good enough job. Seems the low resolution issue can be augmented/improved by repeated/closer scans.
I feel like the lighting part will become "easy" once we're able to greatly simplify the geometry and correlate it across multiple "passes" through the same space at different times.

In other words, if you've got a consistent 3D geometric map of the house with textures, then you can do a pass in the morning with only daylight, midday only daylight, late afternoon only daylight, and then one at night with artificial light.

If you're dealing with textures that map onto identical geometries (and assume no objects move during the day), it seems like it ought to be relatively straightforward to train AIs to produce a flat unlit texture version, especially since you can train them on easily generated raytraced renderings. There might even be straight-up statistical methods to do it.

So I think it's not the lighting itself that is the biggest problem -- it's having the clean consistent geometries in the first place.
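
As a rough idea of what a "straight-up statistical" version might look like, here is a minimal sketch. It assumes a lot: purely diffuse (Lambertian) surfaces, one color channel, and that the incoming light at each texel is already known for every pass (say, estimated from the geometry). Under those assumptions each observation is roughly albedo times irradiance, and the unlit texture falls out of a per-texel least-squares fit. The function name and data layout are made up for illustration.

```python
def estimate_albedo(observations, irradiances):
    """Per-texel least-squares albedo under an assumed model
    I_t = albedo * E_t for each lighting pass t.

    observations[t][k] is the captured value of texel k in pass t.
    irradiances[t][k] is the (assumed known) light hitting texel k in pass t.
    Minimizing sum_t (I_t - a * E_t)^2 gives a = sum(I_t * E_t) / sum(E_t^2).
    """
    n_texels = len(observations[0])
    albedo = []
    for k in range(n_texels):
        num = sum(obs[k] * irr[k] for obs, irr in zip(observations, irradiances))
        den = sum(irr[k] ** 2 for irr in irradiances)
        # A texel that never receives light cannot be recovered; return 0.
        albedo.append(num / den if den > 0 else 0.0)
    return albedo


# Usage: four passes (morning, midday, afternoon, night) over three texels.
true_albedo = [0.2, 0.8, 0.5]
irradiances = [
    [1.0, 0.3, 0.0],
    [2.0, 1.5, 0.0],
    [0.7, 0.9, 0.0],
    [0.1, 0.4, 0.0],
]
observations = [[a * e for a, e in zip(true_albedo, row)] for row in irradiances]
flat_texture = estimate_albedo(observations, irradiances)
```

The third texel is never lit in any pass, so nothing can be recovered for it. Knowing the irradiance is the hard part in practice, which is another reason the clean geometry matters.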

There’s some exciting research on recovering light sources https://dorverbin.github.io/eclipse/
Neat! Yeah, we'll need a lot more stuff like this.