by 0x1ceb00da
619 days ago
This suggests that the AI "brain" receives the user's input as a text prompt (the agent relays the speech prompt to GPT-4o) and generates audio as output (GPT-4o streams speech packets back to the agent). But when I asked Advanced Voice Mode, it said the exact opposite: that it receives input as audio and generates text as output.