I was manually calling my Twilio voice agent 100 times a day to verify every single micro change. Tired of that, I built Rehearse. I know there is a lot of YC money going into voice testing companies, but I wanted to build something open source and code first so Claude Code can spin up and manage test cases.

Example usage:
- call.listen() -> get audio or transcript of what the agent is saying
- call.say("I'd like to book a table for 2 at midnight") -> speak with the agent
- assertions on responses

It only supports Twilio (my use case) and ElevenLabs (transcription), with basic text and LLM based assertions for now. It makes real calls and is BYOK.

I have a bunch of ideas in mind (not implemented yet, not sure if useful):
1. simulations like accents, background noise, languages, network issues, interruptions, etc.
2. voice agent testing another voice agent
3. native audio based assertions
4. more connection options like Vapi, Retell, Websockets, etc.

GitHub: https://github.com/thenullterminator/rehearse
PyPI: https://pypi.org/project/rehearse/

Everything is a bit janky right now. Appreciate all your feedback!
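For readers who want a feel for the listen/say/assert loop described in the post, here is a rough sketch. Only the method names listen() and say() come from the post; FakeCall, its constructor, and assert_contains are hypothetical stand-ins so the snippet runs without placing a real call, and the actual rehearse API may well look different.

```python
from collections import deque


class FakeCall:
    """Hypothetical stand-in for a live call; NOT the rehearse API."""

    def __init__(self, agent_replies):
        self._replies = deque(agent_replies)  # scripted agent utterances
        self.said = []                        # everything the caller said

    def listen(self):
        # Return the next scripted agent utterance as a transcript string.
        return self._replies.popleft() if self._replies else ""

    def say(self, text):
        # Record the caller line; a real call would synthesize speech here.
        self.said.append(text)


def assert_contains(transcript, phrase):
    # Basic case-insensitive text assertion (assumed helper, not from the post).
    assert phrase.lower() in transcript.lower(), f"{phrase!r} not in {transcript!r}"


def run_booking_test(call):
    greeting = call.listen()
    assert_contains(greeting, "how can I help")
    call.say("I'd like to book a table for 2 at midnight")
    reply = call.listen()
    assert_contains(reply, "table for 2")
    return reply


call = FakeCall([
    "Thanks for calling, how can I help?",
    "Sure, a table for 2 at midnight. Can I get a name?",
])
final_reply = run_booking_test(call)
```

Swapping FakeCall for an object that places a real call would keep the test body unchanged, which is the appeal of a code-first interface.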
- Barge-in / interruption: user starts talking mid-agent-sentence, agent should stop + recover state.
- DTMF flows + mixed-mode ("press 1", then spoken intent). Also: false DTMF (ASR hears "one" as tone).
- Silence / dead air / voicemail: detect long silence, prompt once, then gracefully end; detect voicemail greeting.
- Transfers: warm vs cold transfer, verifying you actually bridged the call + preserving context.
- Telephony weirdness: jitter/packet loss, codec changes (PCMU vs OPUS), partial transcripts, delayed ASR.
- Guardrails: PII capture + confirmation, profanity de-escalation, "agent must not comply" tests.
One UX thought: record/replay (store the raw audio + timing) so regressions are deterministic and you can run “golden” call fixtures in CI without placing a real call every time.
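A minimal sketch of what such a fixture could look like, assuming turns are stored as JSON with a speaker, a start offset, and a transcript. The Turn fields, the helper names, and the 250 ms tolerance are all illustrative assumptions, not anything Rehearse provides today; a fuller version would also point at the raw audio file.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Turn:
    speaker: str    # "agent" or "caller"
    offset_ms: int  # start time relative to call start
    text: str       # transcript of the turn


def save_fixture(turns):
    # Serialize a recorded call to JSON so it can be checked into the repo.
    return json.dumps([asdict(t) for t in turns], indent=2)


def load_fixture(payload):
    # Rebuild Turn objects from a stored fixture.
    return [Turn(**item) for item in json.loads(payload)]


def compare_to_golden(golden, candidate, tolerance_ms=250):
    """Return a list of differences; an empty list means the call matches."""
    problems = []
    if len(golden) != len(candidate):
        problems.append(f"turn count {len(candidate)} != {len(golden)}")
    for i, (g, c) in enumerate(zip(golden, candidate)):
        if (g.speaker, g.text) != (c.speaker, c.text):
            problems.append(f"turn {i}: content differs")
        if abs(g.offset_ms - c.offset_ms) > tolerance_ms:
            problems.append(f"turn {i}: timing drift {c.offset_ms - g.offset_ms} ms")
    return problems


golden = [
    Turn("agent", 0, "Thanks for calling, how can I help?"),
    Turn("caller", 2100, "I'd like to book a table for 2 at midnight"),
    Turn("agent", 4300, "Sure, a table for 2 at midnight."),
]

# Round-trip through JSON, as a CI job would when loading a stored fixture.
replayed = load_fixture(save_fixture(golden))
clean = compare_to_golden(golden, replayed)

# Simulate a regression where the agent answers 900 ms later than recorded.
late = [golden[0], golden[1], Turn("agent", 5200, golden[2].text)]
drift = compare_to_golden(golden, late)
```

Exact text matching is the bluntest possible check; for LLM-backed agents a fuzzier comparison (or an LLM based assertion) per turn is probably more realistic.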
(We build production voice agents at eboo.ai; happy to share a small bundle of “gotcha” scenarios if useful.)