Hacker News
by drakenot 68 days ago
Just an FYI: I think voice mode uses weaker models relative to the SOTA.
3 comments

The bigger problem for me is that the realtime voice modes lack tool use, so they can't look anything up or do anything. Model strength definitely also matters, but even dumb models can be helpful when they can look things up and try things out. And smart models that don't do those things kinda suck.
You can get around this by running a local STT model and sending the transcript as text input, but the UX is probably clunkier.
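Rough sketch of what I mean, in Python. Everything here is a placeholder: `voice_turn`, `fake_transcribe`, and `fake_text_model` are made-up names, and the two fakes stand in for whatever local STT model and tool-capable text model you actually use. It only shows the wiring (audio in, transcript out, transcript sent as text), not any real library's API.

```python
from typing import Callable


def voice_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    ask_text_model: Callable[[str], str],
) -> str:
    """Run one turn: local speech-to-text, then a text-mode model call."""
    text = transcribe(audio).strip()
    if not text:
        # Nothing was recognized, so skip the model call entirely.
        return ""
    return ask_text_model(text)


# Hypothetical stand-ins so the sketch runs on its own.
# Replace them with a real local STT model and a real chat client.
def fake_transcribe(audio: bytes) -> str:
    # Pretends the audio bytes are already the spoken words.
    return audio.decode("utf-8")


def fake_text_model(prompt: str) -> str:
    # Pretends to be a text model that has tool use available.
    return f"[tool-capable reply to: {prompt}]"


if __name__ == "__main__":
    print(voice_turn(b"what's the weather in Oslo?", fake_transcribe, fake_text_model))
```

The clunky part is everything this leaves out: deciding when you've stopped talking, and reading the reply back if you want audio out.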
Definitely, it seems like GPT-3.