by lachlan_gray 1152 days ago
When we talk, our tongues tap patterns on the roof of the mouth and the back of the teeth, so I wonder if AI processing could infer what words you are shaping from these sensors. Maybe it's possible to input text by mouthing words silently, without even opening your mouth. Kind of like how it's possible to eavesdrop from just the sound of keyboard clicks:

https://github.com/ggerganov/kbd-audio

2 comments

Tongue contact might be sufficient: in linguistics, two of the axes of "pronunciation space" are "dental" (whether the tongue makes contact with the teeth) and "palatal" (whether the tongue makes contact with the palate).

There are a number of other dimensions, however, that are equally important in the creation of word-sounds (e.g., whether the lips are pursed, whether the vocal folds are vibrating, whether the teeth make contact with the lips, where the tongue is located in the space of the mouth [for vowels], etc.), and they would make determination from the dental/palatal axes alone pretty difficult, I think. But maybe with enough context you could get something predictive that is more than good enough, even if it never gets into deterministic territory.
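To make the "good enough with context" idea concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the contact classes are a crude simplification, letters stand in for phonemes, and the word counts are made up. It only shows that words which collapse to the same tongue-contact pattern can still be ranked by how likely they are, much like T9 predictive text.

```python
from collections import defaultdict

# Hypothetical, heavily simplified contact classes. Letters stand in for
# phonemes, and voicing and lip position are deliberately ignored, so
# sounds like "t" and "d" become indistinguishable.
CONTACT_CLASS = {
    "t": "A", "d": "A", "n": "A", "s": "A", "z": "A",  # tongue tip just behind the teeth
    "k": "V", "g": "V",                                # back of the tongue on the soft palate
    "p": "-", "b": "-", "m": "-", "f": "-", "v": "-",  # lips only, no tongue contact
}


def signature(word):
    """Collapse a word to its contact pattern; anything unmapped becomes '.'."""
    return "".join(CONTACT_CLASS.get(ch, ".") for ch in word)


def build_index(word_counts):
    """Group words by contact pattern, most frequent word first."""
    index = defaultdict(list)
    for word, count in word_counts.items():
        index[signature(word)].append((count, word))
    for candidates in index.values():
        candidates.sort(reverse=True)
    return index


def decode(sig, index):
    """Return every word matching a contact pattern, best guess first."""
    return [word for _, word in index.get(sig, [])]


# Made-up counts standing in for a real language model or frequency list.
WORD_COUNTS = {"tip": 50, "dip": 20, "nib": 5, "cat": 80}
INDEX = build_index(WORD_COUNTS)

print(signature("tip"), signature("dip"))  # A.- A.-
print(decode(signature("dip"), INDEX))     # ['tip', 'dip', 'nib']
```

The sensor alone cannot tell "tip" from "dip" here, so the ranking does all the work. A real system would presumably use surrounding words rather than raw counts, which is where the "enough context" part comes in.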

I think you're talking about subvocal recognition [1]. People are indeed using ML for it, but it looks like it's more complicated than it appears. Still, I think it's only a matter of time before it's available to the average consumer, which I can't wait for because I've wanted something like this for a long time. I do my best thinking when I'm hiking, and I'd love to be able to dictate my thoughts on the move without looking like I'm talking to myself out loud (even though I am, I guess) in public.

[1] https://en.wikipedia.org/wiki/Subvocal_recognition

Several years ago, I was on a long solo drive and thinking about how I would like to be able to communicate with my computer in a subvocal manner. I stuck my pinky finger in my ear canal and "listened" to the deformations of the canal as I spoke, and thought "with a deformation scanner and good machine learning, this could totally work". Later, I registered the domain silentbuds.com to trial the idea but never pursued it. Just did some Googling and see that there are a few new research papers on this approach.