

by zaroth 2426 days ago

The videos showing the algorithm in practice are really nice demos.

I’m curious how big of a step forward this is from the previous state of the art, and at what computational cost.

Also curious if the technique scales well to multiple cameras with overlapping fields of view. That is to say, I assume accuracy can be increased through sensor fusion in the basic sense of averaging errors. But actually molding a cohesive 3D view of a 360° environment, and understanding that an object at the edge of one camera's frame is the same object seen from a different perspective at the edge of another camera's frame, seems like a much harder problem.
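To make the "averaging errors" part concrete, here is a minimal sketch of inverse-variance weighted fusion. It is not anything from the linked work. It assumes each camera's depth error is independent, unbiased, and roughly Gaussian, and that the hard part (knowing both cameras are looking at the same object) is already solved. The function name `fuse_depths` and the `(depth_m, sigma_m)` input format are made up for illustration.

```python
import math


def fuse_depths(estimates):
    """Fuse independent depth estimates of one object by inverse-variance weighting.

    estimates: list of (depth_m, sigma_m) pairs, one per camera (hypothetical format).
    Returns (fused_depth_m, fused_sigma_m).
    Assumes errors are independent and unbiased; correlated errors would
    make the fused sigma optimistic.
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    # More certain cameras (smaller sigma) get proportionally more weight.
    weights = [1.0 / (sigma ** 2) for _, sigma in estimates]
    total = sum(weights)
    fused = sum(w * depth for w, (depth, _) in zip(weights, estimates)) / total
    # Variance of the weighted mean is the reciprocal of the summed weights.
    fused_sigma = math.sqrt(1.0 / total)
    return fused, fused_sigma
```

With two equally noisy cameras, `fuse_depths([(10.0, 1.0), (12.0, 1.0)])` gives a depth of 11.0 m with sigma of about 0.71 m, i.e. the usual 1/sqrt(N) improvement. That is about all plain averaging buys you; it says nothing about associating objects across views.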

Obviously this seems like it should be extremely useful for AutoPilot. Compared to the relatively inaccurate positions shown for adjacent cars on the AutoPilot guidance display we have today, this seems like a big step forward.

I think it’s interesting how the RNN is identifying specific types of objects and then depth mapping them. I assume it can’t just depth map the whole image without that first classification step? I’m thinking of the Smart Summon application, where depth mapping everything around you is pretty crucial and is obviously not entirely working at this point.