by ladberg 1031 days ago
I used to work on it and have spent tons of time in the headset. The eye tracking is next-level and it's really the only platform that exists with eye tracking as a primary input method. I'm pretty confident it will greatly improve your friend's quality of life.

Because of that, I'm also sure that eye tracking will go mainstream in other areas once the Vision Pro is released and everyone else catches on to it as a great input method.


This is pretty much exactly why I vehemently disagree with Apple's decision to draw such a firm line in the sand preventing devs from accessing the eye/gaze data directly. I'm part of an academic spin-off start-up that specializes in analyzing gaze and movement data. Locking the gaze information outside of the app sandbox severely hampers the ability to quickly iterate design and UI patterns that could be game changing for accessibility. Hopefully they make accommodations moving forward for these circumstances.

The issue is doubly close to my heart because my father has ALS and is nearly at the point where eye-tracking will be his only means of communicating effectively with the world. While existing Tobii systems work well enough, typing with your eyes is still exhausting to do.

Ultimately I don't think a platform like the Vision Pro is suitable for ALS patients, especially in the later stages. They cannot support the weight of the headset, and/or fatigue will set in rapidly. Many (including my father) also require a ventilator, along with a mask that can seal effectively enough to support the positive pressure necessary to inflate their lungs. Unless the form factor for HMDs shrinks significantly, the headset will likely interfere with the respirator's efficacy.

I don’t know much about those medical conditions, but it doesn’t take that much imagination to understand that access to eye-gaze data would pretty much give developers mind-reading abilities against whoever is wearing the headset. As the platform matures there will probably be a whole discussion around how it works, who gets access to it and for what reason. I could imagine Apple putting their weight behind developing all sorts of wild disability features.
Developers control what goes in front of the user and where, so we'll still be able to tell plenty about a user's decision-making process given that and how their head and hands navigate the space. There are plenty of companies that specialize in this as their entire product offering: assessing fitness for duty, alertness, attention mapping, etc. There's plenty of published research on the matter as well.

The supposed security of blackboxing the eye data itself is illusory and functionally just for marketing.

I want the next generation of UIs to be so easy and natural to use that it feels like they're reading my mind.
The level of eye tracking performance needed for general population interactions is really only possible when you control the illumination, like in a VR headset. A Vision Pro might work for the friend in question. More generally, this requires the full VR display to make it work. See-through AR or just plain glasses will not be nearly as good, and I think that will cap the general acceptance.
Is it doing something fundamentally different from what everyone else is doing (infrared light source, do some flavour of pupil segmentation and pose estimation)?

Is the eye-tracking performance/accuracy step change on Apple's headset purely just a software/algo change? Or is it actually using a new principle/apparatus for eye-tracking?

I don't think there's anything revolutionary, just a lot of parts working very well in tandem:

- Multiple cameras per eye, and at a very short distance from your eye

- The screen is fixed relative to the cameras for all devices, so there's no worry about that half of the equation getting off calibration or differing for every customer

- OS is built around eye tracking, which means there won't be any actions that are unnaturally hard to perform with eye tracking

Doesn't it rely on external cameras to see the user's hands and use that as the "click" input? Seems like that negates usage for ALS cases.

Also, I'm not an ALS expert, but if the only muscular control is in the eyes, then lack of control in the head/neck probably breaks some assumptions about how the vision headset works (just a guess though).

It does not require using one's hands to click; it supports various input hardware (keyboard, mouse, switch, etc.). If someone has control of basically any muscle, they can use a switch input. The Vision Pro also has Dwell Control, which activates things when you keep your gaze on them long enough, but I don't know whether the headset can currently be operated using nothing but one's eyes.
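
For anyone curious how dwell activation generally works, here is a minimal sketch in Python. To be clear, this is not Apple's implementation or API. The `Target` and `DwellSelector` names, the rectangular hit test, and the 1-second threshold are all assumptions made up for illustration. The idea is just that a target fires once the gaze has stayed on it continuously for some minimum time.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Target:
    """A rectangular UI element (hypothetical; real systems use richer hit tests)."""
    name: str
    x: float
    y: float
    width: float
    height: float

    def contains(self, gx: float, gy: float) -> bool:
        return (self.x <= gx <= self.x + self.width
                and self.y <= gy <= self.y + self.height)


class DwellSelector:
    """Fires a target once the gaze has rested on it for dwell_seconds."""

    def __init__(self, targets: List[Target], dwell_seconds: float = 1.0):
        self.targets = targets
        self.dwell_seconds = dwell_seconds  # assumed threshold, not a real default
        self._current: Optional[Target] = None
        self._since = 0.0
        self._fired = False

    def update(self, gx: float, gy: float, timestamp: float) -> Optional[str]:
        """Feed one gaze sample; returns a target name when a dwell completes."""
        hit = next((t for t in self.targets if t.contains(gx, gy)), None)
        if hit is not self._current:
            # Gaze moved to a different target (or off all targets): restart the timer.
            self._current = hit
            self._since = timestamp
            self._fired = False
            return None
        if (hit is not None and not self._fired
                and timestamp - self._since >= self.dwell_seconds):
            self._fired = True  # fire once per continuous dwell
            return hit.name
        return None
```

Feeding it gaze samples with timestamps (for example `selector.update(10, 10, 0.0)` and so on) returns `None` until the gaze has stayed inside one target for the full dwell time, then returns that target's name once. Looking away and back restarts the timer, which is one simple way to avoid repeated accidental activations.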