by vanderZwan 1994 days ago
> Why hasn't those concepts seen more widespread use outside gaming?

I wrote my master's thesis on designing gesture interfaces (of the Xbox Kinect type) back in 2012. Here are my two cents:

First of all, gesture detection still isn't reliable enough for serious input. Missing a beat one in a hundred times is acceptable in gaming settings (and even then only in more "casual" environments like party games), but for serious input the input device needs to be practically 100% reliable. A keyboard press is. A mouse click is. Detecting whether your hand is an open palm or a fist? Not so much. Hence the peripherals we typically see for VR games, which help of course. At the same time they also somewhat defeat the purpose.
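To put a rough number on why "one in a hundred" is too unreliable for serious input, here is a small back-of-the-envelope sketch. The 99% figure comes from the sentence above; treating every input as independent, and the function names, are simplifying assumptions made for illustration only.

```python
def error_free_probability(per_input_reliability: float, n_inputs: int) -> float:
    """Chance that a sequence of n independent inputs is all detected correctly."""
    return per_input_reliability ** n_inputs


def expected_errors(per_input_reliability: float, n_inputs: int) -> float:
    """Average number of missed or misread inputs in a sequence of n inputs."""
    return (1.0 - per_input_reliability) * n_inputs


if __name__ == "__main__":
    # Assumed workload: 100 and 300 consecutive inputs (think of typing a short message).
    for n in (100, 300):
        p = error_free_probability(0.99, n)
        e = expected_errors(0.99, n)
        print(f"{n} inputs at 99% reliability: "
              f"{p:.1%} chance of no errors, about {e:.0f} errors expected")
```

Under these assumptions, a 99%-reliable detector gets through 100 consecutive inputs without a single error only about a third of the time, which is tolerable in a party game but not in something like text entry.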

A second major issue is a lack of haptic feedback. There is no such thing as touch-typing in the air.

Why this is such a big problem needs a bit of explaining: a practical way to think of our ability to manipulate our environment (literally, the "mani" in manipulate comes from the Latin manus, "hand") is to think of our arms and hands as a pair of kinematic chains[0][1]. Essentially this is a chain of "motors" going from coarse-grained to fine-grained precision: our shoulders, our elbows, our wrists, and finally the digits of our hands. The ingenuity of this chain is that it allows for extremely fine precision (the sub-millimeter precision of our fingertips) in a large spatial volume (the reach of our arms), and it does so by having each "link" in the chain perform a bit of "error-correction" for the lower resolution of the previous link.

What does this have to do with gesture interfaces? Well, in order for that kinematic chain to work, it needs a precise feedback system to perform said error-correction. We basically have three senses for this: our visual system (that is, seeing where we are putting our hands), our haptic sense (feeling which button we're pressing with our fingertips) and our "spatial sense". The problem with that last sense is that it is relative: I sense the sub-millimeter location of my fingers relative to my wrist. I sense the millimeter-precision location of my wrist relative to my elbow. I sense the centimeter-precise location of my elbow relative to my shoulder. So if I'm waving my hands in the air without looking, the effective "precision" they have is about as crude as the crudest link in the chain: my shoulder. Of course this spatial sense can be improved with training, but you know what we typically call people who are really good at that? Professional-level dancers. The ceiling of mastering this skill is pretty high, and there's a reason it's basically a profession all by itself (plus a ton of other things obviously, don't want to sell dancers short here).

Gesture input also will never be as easy on the motor skills as typing: not only does a keyboard provide the haptic feedback from the keys, the precision of my fingers is relative to the wrists that are resting on the desk, not to my shoulders.
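The kinematic-chain argument can be sketched as a toy calculation. The per-link numbers below are assumptions picked only to match the orders of magnitude mentioned above (centimeter, millimeter, sub-millimeter), and simply summing them as a worst case is a simplification, not a biomechanical model.

```python
# Assumed sensing precision of each link relative to the previous joint, in mm.
LINKS = [
    ("shoulder->elbow", 10.0),    # "centimeter-precise"
    ("elbow->wrist", 1.0),        # "millimeter-precision"
    ("wrist->fingertip", 0.5),    # "sub-millimeter"
]
JOINTS = ["shoulder", "elbow", "wrist"]


def fingertip_uncertainty_mm(anchor: str) -> float:
    """Worst-case fingertip uncertainty without visual or haptic feedback.

    `anchor` is the last joint whose position is pinned down externally,
    for example a wrist resting on a desk. Only the links after the
    anchor contribute to the uncertainty.
    """
    start = JOINTS.index(anchor)
    return sum(precision for _, precision in LINKS[start:])


if __name__ == "__main__":
    print("waving in the air:", fingertip_uncertainty_mm("shoulder"), "mm")
    print("wrist on the desk:", fingertip_uncertainty_mm("wrist"), "mm")
```

With these made-up numbers the free-floating hand is dominated by the shoulder-to-elbow link, while anchoring the wrist leaves only the finger link, which is the difference between gesturing in the air and typing.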

Games somewhat get around this by representing a visual avatar to give us feedback, but it's not perfect. On top of that, this feedback is limited by the resolution of the gesture detection, which is ludicrously low compared to the potential precision of our limbs. And if that wasn't enough, it also needs a really low latency to fool our brains and really "feel" like an extension to our senses.

So basically, the fidelity requirements are just brutally high.

And finally, there is only a limited set of use-cases. There are basically just two big ones: "touchless" interfaces (very niche) and pointing and manipulating in 3D space (less niche, with a clear advantage over keyboard or even mouse input, but again having brutally high fidelity requirements). Because of that, as cool as gesture interfaces are, the industry-wide drive to solve all the aforementioned issues just isn't quite as high as we'd like it to be.

[0] http://cogprints.org/625/1/jmb_87.html

[1] https://en.wikipedia.org/wiki/Kinematic_chain


Back in 2012 I did some building on ZSpace[0] and that setup still feels like the sweet spot: physical keyboard, physical pen (with virtual extension), physical glasses to detect head position for parallax, 3D environment to play and create in.

The _visual_ feedback from moving your head and rotating objects with the pen was extremely low-latency. Gesture detection is still nowhere near that level of fidelity, but with peripherals, perhaps it's not necessary.

[0] https://zspace.com/

Cool, that sounds a lot like Bill Buxton's experiments with two-handed input, but in VR!
What are your thoughts on peripherals that provide haptic feedback to help solve the feedback loop?

I am thinking gloves or other wearables that could provide force-feedback to help us correlate our physical actions to the virtual interface.

Are we making progress in those areas at all?

There were some people attaching things other than pistol grips to the Novint Falcon for use in VR in the DK2 era, but I never heard of the company resuming production.
> What are your thoughts on peripherals that provide haptic feedback to help solve the feedback loop?

Heh, sorry to disappoint but I've not done IxD design work/research of that kind since graduating, and not actively kept up with developments. Basically my well-informed thoughts are limited to "one of these days I'll save up the time and money budget to get a VR system, a gaming desktop powerful enough to run it, and the peripherals, to try those out and be overly critical of them, and I am fully aware that I'm lying to myself right now."

I'll indulge myself with some big-picture speculation though - but I want to emphasize that it's just that: speculation!

I'm sure that there is progress being made, but I suspect the major breakthroughs won't be made through VR, or at least not the gaming side of it. The reason for that is that game design is built on top of existing hardware, which in turn drives the need for more specialized hardware, so ultimately that will result in more specialized, niche tools.

Let me expand on that: we're talking about making "progress", but progress by which metric and for which purpose? What is "progress" in a VR game context? (To tie this into the peripherals you suggested: those are one possible solution, but to which problem exactly, and is that really a problem we're struggling with?)

Let's take graphics in games as a historic point of comparison. For decades, marketing has tended to focus on "realistic", "next-gen" graphics. But given that a game with abstract pixelated graphics can still be extremely engaging, and that a game with extremely realistic graphics can still be boring, we might be over-estimating the value of realism in games there. To make an analogy-within-an-analogy: it took Ancient Greeks just one century to go from copying Egyptian sculpture techniques to near-perfect human anatomy[0]. Where did they go from there? Exaggerated beauty ideals! They mastered realism and then decided to ignore the parts of reality that didn't interest them.

How does that insight translate to input devices? I can't say for sure of course, but given that mouse-clicks have never felt like an obstruction to immersion, I suspect that the most fun gesture-based games might lean a bit less on realism than we expect. For gaming input we want to feel in control of our gaming avatar. So low latency is still likely to be important, as is "perfect" input detection.

However, and this is my main point: we never stated that "perfect" input detection has to be realistic input detection! A lot of game input schemes are more about giving us the feeling of control and mastery of skill than "realism", whatever that means in this context.

So I wouldn't be surprised if gaming turns out to be a dead end for generic interface innovations as a result, since the interfaces of games will only become more specialized to the purpose of making them engaging. No disrespect to game UX: there is a lot that general interface designers can learn from studying game design, since at its core it is about building aesthetic interactions. But it's not quite focused on solving the same issues.

I would place my long-term bets on immersive computing environments like Dynamicland[1] coming up with truly novel, more generic interfaces that break out of this problem, due to it having more use-cases for generic 3D input. But we'll have to wait and see (or join the people at Dynamicland and try to actively make progress in that department).

[0] https://en.wikipedia.org/wiki/Ancient_Greek_sculpture

[1] https://dynamicland.org/