by mrguyorama 1024 days ago
Our eyeballs are not cameras and have way more depth info from their function than just two arrays of pixels that you can derive parallax from, and all the claims that "humans only use their eyes" fundamentally ignore all the other parts we use, up to and including an intrinsic simulation of physics in our brain.

Yes, sure. Cameras and biological light sensing have different tradeoffs. My layperson's understanding is that the eye-brain neuron pathway bandwidth is not theoretically sufficient for what we perceive, and so our brain is effectively running an ongoing simulation of the future a few milliseconds ahead of now and correcting based on sensory input.

The book "An Immense World: How Animal Senses Reveal the Hidden Realms Around Us" by Ed Yong [0] is really great for understanding how sensory input informs but isn't the same as a mental model of the world built into the operations of a living thing.

Likewise, ADAS and similar systems do not operate simply on what is sensed at any particular moment. Even ahead of things like being blinded by a sunset, there are occlusions when one object moves behind another and cannot be directly detected but can be inferred by an object model that predicts future positions given the earlier known velocity and acceleration. [1]

0. https://www.amazon.com/Immense-World-Animal-Senses-Reveal-eb...

1. Visual SLAM in dynamic environments based on object detection https://www.sciencedirect.com/science/article/pii/S221491472...
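
To make the occlusion point concrete, here is a minimal sketch of that kind of object model. It assumes plain constant-acceleration kinematics in 2D; real trackers typically wrap something like this in a filter with uncertainty estimates. The names TrackedObject and predict are hypothetical, not taken from the linked paper or any ADAS stack.

```python
from dataclasses import dataclass


@dataclass
class TrackedObject:
    # Last known state before the object was occluded (SI units assumed).
    x: float
    y: float
    vx: float
    vy: float
    ax: float
    ay: float


def predict(obj: TrackedObject, dt: float) -> TrackedObject:
    """Extrapolate the state dt seconds ahead, assuming constant acceleration."""
    return TrackedObject(
        x=obj.x + obj.vx * dt + 0.5 * obj.ax * dt ** 2,
        y=obj.y + obj.vy * dt + 0.5 * obj.ay * dt ** 2,
        vx=obj.vx + obj.ax * dt,
        vy=obj.vy + obj.ay * dt,
        ax=obj.ax,
        ay=obj.ay,
    )


# A car last seen at the origin, moving at 10 m/s and braking at 2 m/s^2,
# then hidden behind a truck for 2 seconds.
car = TrackedObject(x=0.0, y=0.0, vx=10.0, vy=0.0, ax=-2.0, ay=0.0)
where_it_probably_is = predict(car, 2.0)
```

The longer the occlusion lasts, the less the constant-acceleration assumption holds, which is why the prediction gets corrected as soon as the object is sensed again.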

More than that, I mean eyes have more data than just what light is hitting their retinas. The work that the brain and neurons do to aim and focus your eyes at a distant object essentially solves several math problems that give you very direct distance info. Your brain knows that, if the angular deviation of your eyes away from parallel is X to aim at an object, then it is ~Y distance away. It also knows that these muscles have to flex this much to focus on that object, which ALSO provides depth info to your brain. Solid state image sensors cannot provide either of those datasets.

These two processes are actually why VR can be difficult on the eyes, because while the main way your brain senses depth is the parallax (the classic "binocular vision" way people think of), the sense of focus is telling your brain that everything is right in front of your eyes.
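
For the "angular deviation X means distance ~Y" part, a rough sketch of the triangulation looks like this. It assumes an object straight ahead, symmetric convergence of both eyes, and an interpupillary distance of about 63 mm, which is an illustrative figure. The function names are hypothetical.

```python
import math


def distance_from_vergence(ipd_m: float, vergence_deg: float) -> float:
    """Distance to the fixation point given the total angle between the two lines of sight."""
    if vergence_deg <= 0:
        return math.inf  # parallel eyes: the target is effectively at infinity
    half_angle = math.radians(vergence_deg) / 2
    return (ipd_m / 2) / math.tan(half_angle)


def vergence_for_distance(ipd_m: float, distance_m: float) -> float:
    """Inverse of the above: total vergence angle in degrees for a target at distance_m."""
    return math.degrees(2 * math.atan((ipd_m / 2) / distance_m))


IPD = 0.063  # assumed interpupillary distance in meters
for d in (0.5, 2.0, 6.0, 50.0):
    print(f"{d:5.1f} m -> {vergence_for_distance(IPD, d):.3f} degrees")
```

Under these assumptions the angle drops below one degree by a few meters out, so the signal is strongest at close range and gets very small for distant objects.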

The first rangefinder, mimicking this process mechanically, was invented in 1769. You’re essentially arguing for Lidar / sensor fusion.

Do you have any sources for this being a significant factor in human depth estimation? “Infinity” focus starts at 6 meters, yet we’re able to estimate much larger distances with great accuracy.

I looked up the history of the rangefinder and the work of Watt in the 1770s is kind of obscure. For one, he called it a “micrometer” [0] even though he also created something like what is called a micrometer today, only he called it an “end measuring machine.” Additional confusion comes from “telemeter” as an early term for a rangefinder. Only Watt was also there at the beginning of what we now call telemetry: “additions to his steam engines for monitoring from a (near) distance such as the mercury pressure gauge and the fly-ball governor.” [2]

Watt's micrometer, designed between 1770 and 1771, was what we would now call a 'rangefinder'. It was used for measuring distances, and was essential for his canal surveying work.

Adapted from a telescope, with adjustable cross-hairs in the eye-piece, it was particularly useful for measuring distances between hills or across water.

0. https://digital.nls.uk/scientists/biographies/james-watt/dis...

1. https://collection.sciencemuseumgroup.org.uk/objects/co59281...

2. https://en.wikipedia.org/wiki/Telemetry

You know cameras focus too.