

by sumanyusharma 713 days ago

Absolutely agree that creating effective evals requires domain expertise. Right now we're co-building evals with customers while working out which aspects can be productized.

Regarding text-based evals: part of testing voice agents involves assessing their core reasoning logic. To do that, we bypass the voice layer and simulate conversations via text. So yes, the core simulation engine is reusable for both conversational text and voice interactions.
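
To make the "bypass the voice layer" idea concrete, here is a minimal sketch of a text-level simulation loop. It is only an illustration under assumptions, not the actual engine: `simulate_conversation`, `scripted_user`, `toy_agent`, and the check at the end are all hypothetical names, and the agent is a stand-in for whatever reasoning logic sits behind the speech-to-text and text-to-speech stages.

```python
from typing import Callable, List, Optional, Tuple

Turn = Tuple[str, str]  # (speaker, text); speaker is "user" or "agent"


def simulate_conversation(
    agent_reply: Callable[[List[Turn]], str],
    user_reply: Callable[[List[Turn]], Optional[str]],
    opening: str,
    max_turns: int = 4,
) -> List[Turn]:
    """Alternate agent and simulated-user turns over plain text (no audio)."""
    transcript: List[Turn] = [("user", opening)]
    for _ in range(max_turns):
        transcript.append(("agent", agent_reply(transcript)))
        next_line = user_reply(transcript)
        if next_line is None:  # the simulated user has nothing left to say
            break
        transcript.append(("user", next_line))
    return transcript


def scripted_user(lines: List[str]) -> Callable[[List[Turn]], Optional[str]]:
    """Hypothetical simulated user that plays back a fixed script."""
    remaining = iter(lines)

    def reply(transcript: List[Turn]) -> Optional[str]:
        return next(remaining, None)

    return reply


def toy_agent(transcript: List[Turn]) -> str:
    """Stand-in for the agent's reasoning logic (assumption, not a real agent)."""
    last = transcript[-1][1].lower()
    if "refund" in last:
        return "Sure, what is your order number?"
    if any(ch.isdigit() for ch in last):
        return "Thanks, I have started the refund."
    return "Could you tell me more?"


def agent_asked_for_order_number(transcript: List[Turn]) -> bool:
    """Example eval: did the agent ask for an order number at some point?"""
    return any(
        speaker == "agent" and "order number" in text.lower()
        for speaker, text in transcript
    )


transcript = simulate_conversation(
    toy_agent, scripted_user(["It is 12345."]), opening="I want a refund"
)
for speaker, text in transcript:
    print(f"{speaker}: {text}")
print("asked for order number:", agent_asked_for_order_number(transcript))
```

Because the loop only exchanges strings, the same harness could in principle sit behind a voice front end by transcribing audio into `opening` and the user lines; that wiring is left out here.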

We're also excited about shipping the ability to replay a simulated conversation inspired by a real user!