Hacker News
by
water-drummer
108 days ago
The Gemini Live API and the Grok voice API can make tool calls, and they're speech-to-speech models.
1 comment
d4rkp4ttern
108 days ago
Right, turns out Claude and ChatGPT voice can also do web-search. So I guess behind the scenes there is more than a "pure" voice-voice model being used, i.e. there's probably a rudimentary agent loop with tools + tool-exec interposed.
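To make the guess concrete, here's a rough sketch of what such a loop might look like. Everything in it is hypothetical: `fake_voice_model`, `fake_web_search`, and `agent_loop` are stand-ins I made up, not any vendor's actual API, and text strings stand in for audio.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ModelTurn:
    """One model output: either speech for the user or a tool request."""
    speech: Optional[str] = None
    tool_name: Optional[str] = None
    tool_arg: Optional[str] = None


def fake_web_search(query: str) -> str:
    # Hypothetical tool: a real system would call a search backend here.
    return f"stub results for '{query}'"


TOOLS: Dict[str, Callable[[str], str]] = {"web_search": fake_web_search}


def fake_voice_model(history: List[str]) -> ModelTurn:
    # Hypothetical stand-in for the voice model (text replaces audio).
    # It asks for a search first, then answers once a tool result exists.
    if not any(entry.startswith("tool: ") for entry in history):
        query = history[-1].removeprefix("user: ")
        return ModelTurn(tool_name="web_search", tool_arg=query)
    result = history[-1].removeprefix("tool: ")
    return ModelTurn(speech=f"Here is what I found: {result}")


def agent_loop(
    user_utterance: str,
    model: Callable[[List[str]], ModelTurn] = fake_voice_model,
    tools: Dict[str, Callable[[str], str]] = TOOLS,
    max_steps: int = 4,
) -> str:
    """Call the model repeatedly, executing tool requests in between."""
    history = [f"user: {user_utterance}"]
    for _ in range(max_steps):
        turn = model(history)
        if turn.tool_name is None:
            # No tool requested: this is the final spoken reply.
            return turn.speech or ""
        # Tool execution is interposed between two model calls.
        result = tools[turn.tool_name](turn.tool_arg or "")
        history.append(f"tool: {result}")
    return "Sorry, I could not finish that request."


print(agent_loop("weather in Paris"))
```

The point is just that tool execution sits between two model calls, so the user hears a single reply even though the model ran more than once.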