
by zammitjames 482 days ago

Great question! There’s been a lot of movement in this space, but most existing solutions focus on simulation-based testing—generating synthetic test cases or scripted evaluations.

Roark takes a different approach: we replay real production calls against updated AI logic, preserving actual user inputs, tone, and timing. This helps teams catch failures that scripted tests miss—especially in high-stakes industries like healthcare, legal, and finance, where accuracy and compliance matter.

Beyond replays, we provide rich analytics, sentiment & vocal cue detection, and automated evaluations, all based on audio—not just transcripts. This lets teams track frustration, long pauses, and interruptions that often signal deeper issues.
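As a rough illustration of the timing signals mentioned above (long pauses and interruptions), here is a minimal sketch that works from speaker-turn timestamps. It is not Roark's implementation: the `Turn` type, the `find_pauses_and_interruptions` helper, and the 2-second threshold are all assumptions made for this example, and it presumes some upstream diarization step has already produced the turn boundaries.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str  # hypothetical labels, e.g. "user" or "agent"
    start: float  # seconds from start of call
    end: float    # seconds from start of call


def find_pauses_and_interruptions(turns, pause_threshold=2.0):
    """Return (pauses, interruptions) from a list of speaker turns.

    A pause is a silence of at least `pause_threshold` seconds between
    consecutive turns. An interruption is a turn by a different speaker
    that starts before the previous turn has ended.
    """
    pauses, interruptions = [], []
    ordered = sorted(turns, key=lambda t: t.start)
    for prev, cur in zip(ordered, ordered[1:]):
        gap = cur.start - prev.end
        if gap >= pause_threshold:
            pauses.append((prev.end, cur.start))
        elif gap < 0 and cur.speaker != prev.speaker:
            interruptions.append((cur.speaker, cur.start))
    return pauses, interruptions


# Made-up example data, not taken from a real call.
turns = [
    Turn("agent", 0.0, 4.0),
    Turn("user", 3.5, 6.0),    # starts while the agent is still speaking
    Turn("agent", 9.0, 12.0),  # follows 3 seconds of silence
]
pauses, interruptions = find_pauses_and_interruptions(turns)
```

A transcript alone would not carry these timestamps, which is presumably why audio-based analysis is being emphasized here.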

Would love to hear more about your assistant: how are you thinking about testing and iteration?

2 comments

stuartjohnson12 482 days ago

We are presently evaluating at least two of the options mentioned above, and the pricing for 1000 minutes at both comes out to well under 10% of the cost of your current listed rate. I know you've probably been told not to compete on price, but this is a space where I think it's hard to compete on quality yet. As for other analysis features, I think you're going to find yourself locked into a commoditised feature race in which you're currently 6 months behind.

What nobody is doing well at the moment is effective prompt versioning, comparison, and deployment via git or similar. This would be a killer feature for us but nobody is close to having shipped it from what I can see.
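To make the request concrete, a minimal sketch of the "versioning and comparison" half might look like the following. It is only an illustration using Python's standard library: `prompt_version_id` and `prompt_diff` are hypothetical helper names, the prompts are invented, and a real setup would more likely store prompts as files in a git repository and rely on git's own history and diff.

```python
import difflib
import hashlib


def prompt_version_id(text):
    """Short content hash, so identical prompts always get the same ID."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]


def prompt_diff(old, new, old_label="v1", new_label="v2"):
    """Unified diff between two prompt versions, similar to `git diff` output."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=old_label,
            tofile=new_label,
        )
    )


# Invented prompts for illustration only.
v1 = "You are a helpful agent.\nAlways confirm identity.\n"
v2 = "You are a helpful agent.\nAlways confirm identity before sharing account details.\n"

diff_text = prompt_diff(v1, v2)
```

Deployment (promoting a given version ID to production) is deliberately left out, since that depends entirely on the voice platform in use.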

michaelmior 481 days ago

> Roark takes a different approach: we replay real production calls against updated AI logic

How does that work? As soon as the AI responds with something different, the rest of the customer call is mismatched.
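The concern can be sketched in code. The thread does not say how Roark handles this, so the following is only an assumption-laden illustration: it compares the agent turns from the recorded call with those produced during a replay and reports the first turn where they differ materially. The `first_divergence` helper, the 0.8 similarity threshold, and the sample utterances are all made up for the example.

```python
from difflib import SequenceMatcher


def first_divergence(recorded_agent_turns, replayed_agent_turns, min_similarity=0.8):
    """Index of the first agent turn where the replay departs from the recording.

    Returns None if every compared pair is at least `min_similarity` alike.
    After this index, the recorded user turns were replies to a different
    agent utterance, so they may no longer fit the replayed conversation.
    """
    pairs = zip(recorded_agent_turns, replayed_agent_turns)
    for i, (old, new) in enumerate(pairs):
        similarity = SequenceMatcher(None, old.lower(), new.lower()).ratio()
        if similarity < min_similarity:
            return i
    return None


# Invented example turns.
recorded = ["What is your date of birth?", "Thanks, and your policy number?"]
replayed = ["What is your date of birth?", "Can you confirm your home address?"]

divergence_index = first_divergence(recorded, replayed)
```

Character-level similarity is a crude proxy; two differently worded responses can be equivalent in meaning, so a real system would likely need a semantic comparison instead.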