by awenix 761 days ago
Nice to see an open source implementation. I have been seeing many startups get into this space, like https://www.retellai.com/, https://fixie.ai/, etc. They always end up needing speech-to-speech models (the current approach seems to be speech-text-text-speech, with multiple agents handling 1 listening + 1 speaking). Excited to see how this plays with the recently announced gpt-4o.
3 comments

Adding to your list: https://vapi.ai -- really nice tools.

(I try to keep up with all the different layers/players in this space.)

We're (fixie.ai) working on our SLM (speech language model). We'll release something soon to play with :)
How do speech-to-speech models work? Do they just use that many more tokens to capture the nuances of spoken language?