by soulofmischief
288 days ago
Increasing the fidelity and richness of training data does not go against the bitter lesson. The model can still learn 3D representations on its own from stereo captures, and stereo captures give it richer, more connected data to learn from than monocular captures do. This is unarguable. By restricting it to monocular images, you're needlessly making things harder: you force the model to also learn to estimate depth from a single view, and you rob it of a channel for error-correction when real-world data is faulty.
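To make the "richer data" point concrete, here is a minimal sketch of the geometry a rectified stereo pair exposes directly, assuming an ideal pinhole model. The function name and the rig numbers (700 px focal length, 0.12 m baseline) are hypothetical, picked only for illustration.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a point seen in a rectified stereo pair: Z = f * B / d.

    focal_px     -- focal length in pixels (assumed equal for both cameras)
    baseline_m   -- distance between the two camera centres, in metres
    disparity_px -- horizontal pixel shift of the point between the two views
    """
    if disparity_px <= 0:
        # Zero or negative disparity means the match is invalid or at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: a point shifted by 42 px sits roughly 2 m away.
z = depth_from_disparity(focal_px=700.0, baseline_m=0.12, disparity_px=42.0)
print(f"{z:.2f} m")
```

A monocular image has no disparity term at all, so the same depth has to be inferred from learned priors such as object size and perspective. With stereo, the second view acts as an independent measurement that a bad pixel or a bad guess can be checked against.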