by cubefox 1089 days ago
Current voice assistants aren't much more than hardcoded GOFAI software. The only relevant ML involved is in the speech-to-text model. Modern language models are on a completely different level. Problem is that they need absurd amounts of VRAM, so running them locally on a phone is out of the question.

> It isn’t at all clear what these interfaces can and cannot do, all you can do is keep trying different things until you get a result.

You would ask an LLM-based assistant what it can do, in the same way you can ask ChatGPT what it can do.

2 comments

> You would ask an LLM-based assistant what it can do

But this has the same problem that it's trying to solve in the first place: the LLM's behavior is unpredictable, and that includes its answers to questions like this. There's no guarantee that it won't hallucinate.

Maybe this can be ameliorated by giving it access to some hard-coded and highly vetted list of capabilities?
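A rough sketch of what that could look like, assuming the assistant answers "what can you do?" straight from the vetted list and only uses the model for everything else. The capability names, descriptions, and helper functions here are all made up for illustration; none of this is any real assistant's API.

```python
# Hypothetical, hand-vetted capability list (names and descriptions are illustrative).
VETTED_CAPABILITIES = {
    "set_timer": "Set a timer for a given duration.",
    "play_music": "Play a song, album, or playlist.",
    "send_message": "Send a text message to a contact.",
}


def describe_capabilities() -> str:
    """Answer "what can you do?" from the vetted list instead of the model."""
    lines = [f"- {desc}" for desc in VETTED_CAPABILITIES.values()]
    return "I can do the following:\n" + "\n".join(lines)


def build_system_prompt() -> str:
    """Ground the model by putting the vetted action names in its prompt."""
    names = ", ".join(sorted(VETTED_CAPABILITIES))
    return (
        "You are a voice assistant. You support only these actions: "
        f"{names}. If asked for anything else, say you cannot do it."
    )


def is_supported(action: str) -> bool:
    """Reject any action the model proposes that is not on the vetted list."""
    return action in VETTED_CAPABILITIES
```

The prompt alone doesn't guarantee anything, since the model can still ignore it. The `is_supported` check is the part that actually can't hallucinate, because it runs outside the model.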

ChatGPT-4 doesn't hallucinate a lot about its own capabilities. A bit of hallucination is acceptable for mundane use cases.

> Problem is that they need absurd amounts of VRAM, so running them locally on a phone is out of the question.

Weren't most voice assistants running on a server until recent years anyway?

Yeah. I'm pretty sure this will return. But it might require a subscription, as language models are expensive to run.