
by gtirloni 840 days ago

So your requirements are:

1. Reliable speech to text from multiple possibly identifiable speakers

2. Long-term knowledge storage and retrieval

Speech to text is a solved problem, AFAIK with a caveat: single speaker. You'd need to train a local AI to identify all these different voices reliably. No easy feat.

Assuming you have done that, you have to feed that data into a vector database so you can retrieve it when you're talking to the AI. You can't use it to train the AI because that would be too expensive. But then you hit another roadblock: either you have very good querying capabilities for that database, so you can retrieve what matters and feed it into the prompt, or your context window is huge. The latter is expensive.
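To make the retrieve-then-prompt idea concrete, here is a minimal sketch. It assumes a toy bag-of-words `embed` function standing in for a real embedding model, and a plain Python list standing in for the vector database; `retrieve` and `build_prompt` are hypothetical names, not any particular product's API.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model: word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, transcripts, k=2):
    """Return the k stored transcripts most similar to the query."""
    q = embed(query)
    ranked = sorted(transcripts, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]


def build_prompt(query, transcripts, k=2):
    """Put only the retrieved snippets into the prompt, not the whole history."""
    context = "\n".join(f"- {t}" for t in retrieve(query, transcripts, k))
    return f"Relevant past conversations:\n{context}\n\nQuestion: {query}"


# Hypothetical stored transcripts (assumption: one string per utterance).
transcripts = [
    "alice said the dentist appointment is on friday",
    "bob wants pizza for dinner",
    "the car needs an oil change next month",
]

print(build_prompt("when is the dentist appointment", transcripts, k=1))
```

The point is that the prompt only grows with `k`, not with how many hours of conversation are stored, which is why retrieval quality matters more than a huge context window.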

Some commercial LLM products already implement some form of learning based on previous chats, so it might be doable from a cost perspective.

I think you can't fit the necessary computing power into a wristband today. It needs to take care of speech to text (again, multiple speakers), upload all of that to some cloud, and do it for hours and hours non-stop.
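For a rough sense of the "hours and hours" part, here is a back-of-the-envelope estimate of raw audio volume. The numbers are assumptions, not measurements: uncompressed 16 kHz, 16-bit, mono PCM (a common input format for speech recognition) and a 16-hour waking day. A real device would likely compress the audio, so treat this as an upper bound.

```python
def audio_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Size of uncompressed PCM audio in bytes."""
    return sample_rate_hz * (bits_per_sample // 8) * channels * seconds


# Assumed recording format: 16 kHz, 16-bit, mono.
per_hour = audio_bytes(16_000, 16, 1, 3600)
per_day = audio_bytes(16_000, 16, 1, 16 * 3600)  # assumed 16 waking hours

print(f"{per_hour / 1e6:.1f} MB per hour")  # 115.2 MB per hour
print(f"{per_day / 1e9:.2f} GB per day")    # 1.84 GB per day
```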

Maybe it could just be a smart microphone that uploads a constant stream of audio to the cloud for processing? A privacy nightmare no one is willing to touch, most likely. Would you have to ask permission from everyone in the room before you enter with your microphone?

2 comments

Someone 840 days ago

> Speech to text is a solved problem, AFAIK with a caveat: single speaker. You'd need to train a local AI to identify all these different voices reliably. No easy feat.

The OP asks for “a device that passively listens to your conversations”, so even if single speaker is solved perfectly (I wouldn’t know, but have my suspicions, certainly for a device worn on the wrist, which means it can rotate, be covered with a sweater, etc), that isn’t enough.


slymersonn 839 days ago

does that mean that the only viable wearable is smart glasses? you can say the same thing about necklaces for example (they'll be covered).


slymersonn 839 days ago

thanks for the explanation! I'm thinking of creating this just as a project and for me
