by gtmtg
2857 days ago
I don't think it's exactly what you're talking about (I'm sure there are other works much closer to what you have in mind, I just can't recall them off the top of my head), but you might find PoseNet (https://www.cv-foundation.org/openaccess/content_iccv_2015/p...) interesting. It's not explicitly 3D, but it estimates where in a large-scale scene a picture was taken using an end-to-end convolutional network.

With that said, I think there's still a ton of merit in classical geometric approaches like ICP: there's a real, geometric basis to why they work. Convolutional networks can demonstrate some pretty amazing results, but they're still mostly "black boxes" to us, and a consequence of this is that it's hard to understand why they work and predict when they'll fail. This blog post (by the PoseNet author, actually) articulates the viewpoint well: https://alexgkendall.com/computer_vision/have_we_forgotten_a....

One recent research direction that I personally find really fascinating is designing deep learning architectures around real geometric properties, e.g. as in Skydio's deep stereo work: https://arxiv.org/pdf/1703.04309.pdf
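
To make the "geometric basis" point about ICP a bit more concrete, here's a minimal sketch of point-to-point ICP in NumPy. It isn't taken from any of the linked work; the function names (`best_rigid_transform`, `icp`) are just my own, and it assumes small point clouds (brute-force nearest neighbours), a decent initial alignment, and no outliers. Each iteration matches every source point to its closest target point, then solves for the least-squares rigid transform in closed form via an SVD (the Kabsch method), so every step has a clear geometric meaning.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that R @ src_i + t ~= dst_i.

    src and dst are (N, D) arrays with row i of src corresponding to row i of dst.
    Uses the SVD-based (Kabsch) closed-form solution.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so we get a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0] * (src.shape[1] - 1) + [d])
    R = Vt.T @ D @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(src, dst, iters=20):
    """Hypothetical minimal ICP: returns (R, t, aligned_src) aligning src to dst."""
    dim = src.shape[1]
    R_total = np.eye(dim)
    t_total = np.zeros(dim)
    cur = src.copy()
    for _ in range(iters):
        # Brute-force nearest neighbour in dst for every current source point.
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
        matched = dst[d2.argmin(axis=1)]
        # Closed-form rigid update for the current correspondences.
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
        # Accumulate: x -> R @ (R_total @ x + t_total) + t.
        R_total = R @ R_total
        t_total = R @ t_total + t
    return R_total, t_total, cur
```

The failure modes are easy to reason about here: if the initial guess is bad, the nearest-neighbour matches are wrong and the closed-form step converges to a local minimum. That kind of predictability is what's harder to get from an end-to-end network.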
[1] Web-browser demo: https://storage.googleapis.com/tfjs-models/demos/posenet/cam...
[2] GitHub: https://github.com/tensorflow/tfjs-models/tree/master/posene...