I've been wanting something like that for a long time: eye tracking plus a simple keyboard shortcut to trigger a click, so I can select anything across the entire system instantly without taking my hands off the keyboard. So far, though, I haven't found a decent out-of-the-box solution without major drawbacks.
Eye-tracking is an underrated user input method in my opinion. I'm hoping eventually VR/AR will make it mainstream.
No, you blink all the time. There have been many, many of these things tried and they all suck quite frankly. It’s best to find some other way to confirm a click, or perform a scroll, etc.
Webcams don’t have enough resolution to be that precise, and you need really good models to track your head, estimate pose, etc. You also aren’t illuminating anything with a webcam, so dark eyes become difficult, glasses warp the image of the eye non-linearly, and there's a host of other problems. Webcam-based approaches usually give you very rough areas that move around like crazy (very noisy).
Essentially the best way to do it is to have something mounted to your face that’s purpose built. There are external eye trackers that are attached to monitors that can do pretty well, but are usually very expensive. The cheaper ones aimed at gamers are quite inaccurate.
Until your hands start hurting, like the person I mentioned. I was trying to discuss solving the general problem. Some people also suffer when using a keyboard.
It's one keypress to get the hints, then usually two to select the element, whereas tab navigation takes on average about 1/4 as many keypresses as there are elements on the page (if you do shift+tab whenever it's shorter), and this Manhattan-style keyboard navigation takes up to about 2 x sqrt(# elements), if I'm not mistaken.
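To make that comparison concrete, here is a rough Python sketch of the three keypress counts. The assumptions are mine, not from the thread: targets are uniformly random, tab navigation starts at the first element and wraps around, the grid is square and you start in a corner, and hints use a 26-letter alphabet. The function names are made up for illustration.

```python
import math

def hint_cost(n, alphabet=26):
    """One key to show hints, plus enough letters to label n elements uniquely."""
    letters = 1
    while alphabet ** letters < n:
        letters += 1
    return 1 + letters

def tab_cost_avg(n):
    """Average presses to reach a random element with Tab or Shift+Tab,
    whichever direction is shorter (focus order assumed to wrap around)."""
    return sum(min(i, n - i) for i in range(n)) / n

def grid_cost_worst(n):
    """Worst-case arrow presses on a square grid: corner to opposite corner."""
    side = math.isqrt(n - 1) + 1  # smallest side whose square holds n elements
    return 2 * (side - 1)

for n in (25, 100, 400):
    print(n, hint_cost(n), tab_cost_avg(n), grid_cost_worst(n))
```

Under those assumptions, a page with 100 elements comes out at 3 presses for hints, 25 on average for tabbing, and 18 worst case for the grid, which is in line with the n/4 and 2 x sqrt(n) estimates.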
Here's what I thought just reading the title: press a key, and the elements are spatially mapped onto all keyboard keys (except ESC). Then, hit the key in the general vicinity. If there are too many elements there and you are likely to miss, zoom into that specific area and repeat.
That's really equivalent to the Vimium method, but a bit more visually intuitive.
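A minimal sketch of that press-then-zoom idea, in Python. Everything here is an assumption for illustration: the 4 x 10 block of keys in `ROWS` stands in for "all keyboard keys", screen regions are `(left, top, width, height)` tuples, and `key_for` and `press` are hypothetical helpers, not part of any existing tool.

```python
# Hypothetical 4 x 10 key grid laid over the screen region (ESC left free to cancel).
ROWS = ["1234567890", "qwertyuiop", "asdfghjkl;", "zxcvbnm,./"]

def key_for(x, y, region):
    """Return the key whose grid cell contains point (x, y) inside region."""
    left, top, w, h = region
    col = min(int((x - left) / w * 10), 9)
    row = min(int((y - top) / h * 4), 3)
    return ROWS[row][col]

def press(key, elements, region):
    """Keep only the elements under `key` and return them with the zoomed-in cell."""
    left, top, w, h = region
    row = next(r for r, keys in enumerate(ROWS) if key in keys)
    col = ROWS[row].index(key)
    cell = (left + col * w / 10, top + row * h / 4, w / 10, h / 4)
    hits = {name: pos for name, pos in elements.items()
            if key_for(pos[0], pos[1], region) == key}
    return hits, cell

# Two buttons close together in the bottom-right corner, one link top-left.
screen = (0, 0, 1000, 400)
elements = {"ok": (950, 390), "cancel": (920, 390), "menu": (10, 10)}

hits, cell = press("/", elements, screen)  # both buttons share the "/" cell
hits, cell = press("n", hits, cell)        # after zooming, "n" covers only "ok"
print(list(hits))
```

Each press shrinks the region by a factor of 40, so even a crowded screen should resolve in two or three keys, which is why it ends up comparable to the hint approach.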
You might want to see this pre-alpha experiment for typing in keyboardless XR. For me, these days when I hear 'Spatial' I think of 3D systems with ray tracing controllers and hand tracking. You can also follow its dev on Twitter for more details.
https://atap.google.com/soli/
I always think of this blog where the person had to use their nose because their hands were in pain.
http://www.looknohands.me/
Discussion from 7 years ago:
https://news.ycombinator.com/item?id=8805053