by kabirgoel
404 days ago
This is great. Poking into the source, I find it interesting that the author implemented a custom turn detection strategy instead of using Silero VAD (which is standard in the voice agents space). I’m very curious why they did it this way and what benefits they observed.

For folks who are curious about the state of the voice agents space, Daily (the WebRTC company) has a great guide [1], as well as an open-source framework that lets you build AI voice chat similar to OP's, with lots of utilities [2].

Disclaimer: I work at Cartesia, which serves a lot of these voice agent use cases, and Daily is a friend.

[1]: https://voiceaiandvoiceagents.com
[2]: https://docs.pipecat.ai/getting-started/overview