| HN Mirror


	by mscharrer 1943 days ago
	Isn't the observer model explicitly built into NeRF architectures?

1 comments

mjburgess 1943 days ago

It is; I've just skim-read a paper.

Though one trivial way to do it, with NNs in any case, is just to project forward from a range of observer models and guess the observer parameters from them.
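A minimal sketch of that "project forward and guess" idea, under loud assumptions: the scene geometry is known, the observer has a single unknown parameter (a yaw angle), and a brute-force search stands in for whatever a trained network would do. The names `project`, `guess_yaw` and `depth_offset` are made up for illustration and do not come from any NeRF codebase.

```python
import math

def project(points, yaw, focal=1.0, depth_offset=5.0):
    """Pinhole projection of 3D points for a camera rotated by `yaw` about the y-axis."""
    c, s = math.cos(yaw), math.sin(yaw)
    image = []
    for x, y, z in points:
        # Rotate the scene into the camera frame, then place it in front of the camera.
        xc = c * x + s * z
        zc = -s * x + c * z + depth_offset
        image.append((focal * xc / zc, focal * y / zc))
    return image

def reprojection_error(a, b):
    """Sum of squared distances between two lists of 2D image points."""
    return sum((u1 - u2) ** 2 + (v1 - v2) ** 2 for (u1, v1), (u2, v2) in zip(a, b))

def guess_yaw(points, observed, candidates):
    """Project forward from every candidate observer and keep the best match."""
    return min(
        candidates,
        key=lambda yaw: reprojection_error(project(points, yaw), observed),
    )

# Hypothetical scene and a "photo" taken at an angle the estimator is not told.
scene = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.5), (-1.0, -0.5, 1.0), (0.5, 0.5, -1.0)]
true_yaw = math.radians(20)
observed = project(scene, true_yaw)

# The "range of observer models": yaw from -90 to 90 degrees in 5-degree steps.
candidates = [math.radians(d) for d in range(-90, 91, 5)]
estimated_yaw = guess_yaw(scene, observed, candidates)
print(round(math.degrees(estimated_yaw)))
```

This recovers where the camera was, but nothing about why the observer chose that angle.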

This is still the wrong sense of generalisation. What can't be guessed is why a person took consecutive pictures at given angles, etc.

Such information is necessary to resolve deep ambiguities in cases where your observer model will fail.

E.g., yesterday I looked out my window and thought I saw two people; it was actually one person with a shadow and a bag.

I moved my eyes/head/body in such a way as to fit a variety of models, and I was able to 'read the scene' in the end.

That is comprehension.

jmmcd 1943 days ago

And there's no reason we couldn't have a deep learning system where the input data (images) included time-stamps and movement vectors, and it could be good both at easy image classification, and at choosing particular "head movements" like those you performed, to help resolve ambiguous cases.

Further food for thought: these ambiguous cases seem (do you agree?) to be very rare.

mjburgess 1942 days ago

Ambiguity is the norm; it isn't rare. Almost all visual input, i.e., light, is ambiguous. We (animals) draw on the history of our prior geometrical-light experiences (i.e., walking around) and use environmental cues to resolve ambiguity.

That billions (or trillions) of images are needed to approximate what we can do from a single instance is, I think, a good guide to the magnitude of the problem.

Google the "Ames room" -- that "illusion" is how we are always seeing.
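A rough sketch of why a single image is geometrically ambiguous, assuming an idealised pinhole camera at the origin: any point can slide along its own viewing ray without changing where it lands in the image, which is the trick the Ames room relies on. The coordinates and the helper name `slide_along_ray` are made up for illustration.

```python
def project(point, focal=1.0):
    """Pinhole projection of a 3D point onto the image plane."""
    x, y, z = point
    return (focal * x / z, focal * y / z)

def slide_along_ray(point, scale):
    """Move a point along the ray from the camera centre through that point."""
    x, y, z = point
    return (scale * x, scale * y, scale * z)

# Four corners of a flat, square back wall 4 units from the camera.
room_a = [(-1.0, -1.0, 4.0), (1.0, -1.0, 4.0), (1.0, 1.0, 4.0), (-1.0, 1.0, 4.0)]

# A distorted "room": each corner pushed to a different depth along its own ray.
scales = [1.0, 2.0, 1.5, 0.5]
room_b = [slide_along_ray(p, s) for p, s in zip(room_a, scales)]

image_a = [project(p) for p in room_a]
image_b = [project(p) for p in room_b]

# Different 3D shapes, same picture (up to floating-point error).
same_image = all(
    abs(ua - ub) < 1e-12 and abs(va - vb) < 1e-12
    for (ua, va), (ub, vb) in zip(image_a, image_b)
)
print(same_image, room_a == room_b)
```

Telling the two rooms apart takes either a second viewpoint or a prior about what rooms usually look like.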

jmmcd 1940 days ago

> Ambiguity is the norm

Right. But as you say, usually our priors are good enough. The cases where we stop, double-take, and deliberately look from another angle are rare.
