by michaellee8 101 days ago
Interesting. I built https://github.com/michaellee8/voice-agent-devkit-mcp for exactly this: it launches a Chromium instance with virtual devices powered by Pulsewire, then hooks it up to TTS and STT so that Playwright can finally have a mouth and ears. Any chance we can talk?

That's actually interesting. Is it up to the user to create the HTTP endpoints for /speak and /transcript?

One of our learnings has been to make it easy to plug into existing frameworks, for example LiveKit, Pipecat, etc.

Happy to talk if you can reach out to me on LinkedIn - https://www.linkedin.com/in/tarush-agarwal/

Just sent a connection invitation on LinkedIn. This was actually designed to allow e2e automation using playwright-mcp at a previous startup I worked at that builds voice-based job interview agents. The HTTP endpoints are provided by a daemon sitting in the background, listening to all input to the virtual mic, transcribing it, and storing it. The agent can hit /speak and /transcript through an MCP. We had built LiveKit Agents-specific solutions by injecting text responses, but felt that was not enough since we wanted to test the whole thing end to end, so I hacked together a way to do a virtual mic/speaker. It was designed to close the dev-test-debug loop so that Claude Code can develop on its own rather than relying on a human to test it.
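
To make the /speak and /transcript flow concrete, here is a rough sketch in Python. Only the two endpoint paths come from the description above; the JSON payload shapes (`{"text": ...}` and `{"transcript": ...}`), the HTTP methods, and the `FakeVoiceDaemon` stand-in are all assumptions for illustration, not the project's actual API. The fake daemon simply echoes spoken text back as the transcript, where the real one would go through TTS, the virtual mic, and STT.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Opener that ignores proxy environment variables, so localhost calls stay local.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class FakeVoiceDaemon(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the background daemon (not the real one)."""

    heard = []  # text "spoken" so far; shared state is fine for a sketch

    def _send_json(self, payload):
        body = json.dumps(payload).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        if self.path != "/speak":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        text = json.loads(self.rfile.read(length))["text"]  # assumed payload shape
        FakeVoiceDaemon.heard.append(text)
        self._send_json({"ok": True})

    def do_GET(self):
        if self.path != "/transcript":
            self.send_error(404)
            return
        self._send_json({"transcript": " ".join(FakeVoiceDaemon.heard)})

    def log_message(self, *args):
        pass  # keep the demo quiet


def speak(base_url, text):
    """POST text to /speak (assumed JSON body)."""
    req = urllib.request.Request(
        base_url + "/speak",
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with _opener.open(req) as resp:
        return json.load(resp)


def get_transcript(base_url):
    """GET the stored transcript from /transcript (assumed JSON response)."""
    with _opener.open(base_url + "/transcript") as resp:
        return json.load(resp)["transcript"]


def run_demo():
    FakeVoiceDaemon.heard.clear()
    server = HTTPServer(("127.0.0.1", 0), FakeVoiceDaemon)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base_url = f"http://127.0.0.1:{server.server_port}"
    try:
        speak(base_url, "Hello, I am here for the interview.")
        return get_transcript(base_url)
    finally:
        server.shutdown()
        server.server_close()
```

In an agent-driven test, `speak` and `get_transcript` would be the two calls exposed to the agent, with `base_url` pointing at the real daemon instead of the fake.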