
by TeMPOraL 601 days ago

I agree about the EEG part. I was curious how they managed to get that to work, and found [0], which seems to confirm my guess: they didn't - they went for EMG instead. Now EMG sounds very plausible, given that it's well understood, already applied to "controlling with thought" (prosthetics), and a person can learn to make their signal clearer and more intentional, so it's easier for the machine to understand.

As for 54 bits per second[1], that assumes a healthy person speaking, which is not relevant here. Communication systems for people unable to talk, write or sign because of ALS, paralysis, or similar conditions do not have to aim for 54 bits per second! A few bits per second is already great! The alternative is no communication, or something like half a bit per second, and only when you're paying very close attention.
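To put those rates side by side, here's a back-of-the-envelope sketch. The 100-bit message size and the "3 bits per second" stand-in for "a few" are numbers I made up for illustration; only the 54 and the half a bit per second come from the discussion above.

```python
def seconds_to_send(message_bits: float, bits_per_second: float) -> float:
    """Time needed to convey a message of a given information content."""
    if bits_per_second <= 0:
        raise ValueError("bits_per_second must be positive")
    return message_bits / bits_per_second


# Hypothetical message size, chosen only to make the comparison concrete.
MESSAGE_BITS = 100

# 54 and 0.5 are the figures quoted in the comment; 3.0 is an assumed
# stand-in for "a few bits per second".
RATES = {
    "healthy speech (54 bit/s)": 54.0,
    "assistive, a few bit/s (assumed 3)": 3.0,
    "fallback (0.5 bit/s)": 0.5,
}

for label, rate in RATES.items():
    print(f"{label}: {seconds_to_send(MESSAGE_BITS, rate):.1f} s")
```

Even under these rough assumptions, going from half a bit per second to a few bits per second cuts the wait by several times, which is the point: the comparison that matters is against the fallback, not against healthy speech.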

Here are some quotes from [0] about the most important aspects of the solution:

> “The LLM expands what you’re saying. And then I confirm before sending it back. So there’s an interaction with the LLM where I build what I want it to say, and then I get to approve the final message,” explained Pedro. (...) “The LLM that takes a basic prompt and expands it into a fully fledged answer, almost right away. I wouldn’t have time to type all of that in the natural way. So I’m using the LLM to do the heavy lifting on the response,” he added.

> He also pointed out that the wearer has absolute control of what they are outputting: “It’s not recording what I’m thinking. It’s recording what I want to say. So it’s like having a conversation.”

So no magic here. Seems like a direct combination of:

1. Using EMG as input to get specific words/phrases;

2. Using an LLM to expand those into full-blown sentences;

3. Using a TTS model to sound it out in a person's voice.
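A rough sketch of how those three stages could chain together, including the "confirm before sending" step from the quote. This is not how the actual product works. The gesture table, function names, and stand-in callables are all hypothetical; a real system would have a trained EMG classifier, an actual LLM call, and a voice-cloned TTS model in their place.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical gesture-to-keyword table. A real EMG decoder would be a
# trained classifier over muscle signals, not a lookup.
GESTURE_TO_WORD = {
    "brow_up": "yes",
    "jaw_clench": "water",
    "cheek_left": "please",
}


def decode_emg(gestures: Iterable[str]) -> List[str]:
    """Stage 1: map recognised gesture labels to keywords, dropping unknowns."""
    return [GESTURE_TO_WORD[g] for g in gestures if g in GESTURE_TO_WORD]


def run_pipeline(
    gestures: Iterable[str],
    expand: Callable[[List[str]], str],   # stage 2: stand-in for an LLM call
    confirm: Callable[[str], bool],       # user approves or rejects the draft
    speak: Callable[[str], None],         # stage 3: stand-in for a TTS model
) -> Optional[str]:
    """Run keywords -> draft sentence -> user confirmation -> speech."""
    keywords = decode_emg(gestures)
    if not keywords:
        return None          # nothing recognised, nothing to say
    draft = expand(keywords)
    if not confirm(draft):
        return None          # user rejected the draft, so nothing is spoken
    speak(draft)
    return draft


def fake_expand(keywords: List[str]) -> str:
    """Placeholder for the LLM: just joins the keywords into a sentence."""
    return " ".join(keywords).capitalize() + "."


spoken: List[str] = []
result = run_pipeline(
    ["jaw_clench", "noise", "cheek_left"],
    expand=fake_expand,
    confirm=lambda draft: True,
    speak=spoken.append,
)
print(result, spoken)
```

The stages are passed in as plain callables so that each one can be swapped independently, which is also why stages 2 and 3 don't care where the keywords came from.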

Feels like 2 and 3 could be applied to existing solutions across a wide range of illnesses and disabilities.

[0] - https://techcrunch.com/2023/08/18/communication-using-though...

[1] - Or is it 39? https://www.science.org/content/article/human-speech-may-hav....


swordsmith 601 days ago

Thanks for the reference, I suspected it would be EMG as well. In the video especially, you can see how the patient modulates his eyebrow, facial muscles, and mouth. Those vestigial muscle movements can be decoded to speech much more easily with the help of an LLM. Actually, if form factor were not a concern, this could be done even more easily with other sensors as well.
