Hacker News
by bensyverson 37 days ago
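There are now realtime “speech-to-speech” models [0]. I believe they skip the intermediate text step (speech-to-text, then text-to-speech) and handle audio directly in a single model, which streamlines the architecture.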

[0]: https://openai.com/index/introducing-gpt-realtime/