| HN Mirror


by verdverm 249 days ago

ADK has a few docs pages and some APIs for evaluating agentic systems.

tl;dr - it's challenging because different runs produce different output, and it's not obvious how you decide pass/fail (what people usually do is have another LLM/agent act as the judge)
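
To make both problems concrete, here is a rough sketch of one common workaround: run the agent several times on the same prompt, let a judge score each output, and pass/fail on the pass rate instead of on any single run. This is not ADK's API. `pass_rate`, `evaluate`, `make_toy_agent`, and `toy_judge` are hypothetical names for illustration, the toy agent is a seeded random stand-in for a real agent, and the keyword judge stands in for the LLM/agent call you would actually use.

```python
import random
from typing import Callable

# Hypothetical type aliases: an agent maps a prompt to an output string,
# a judge decides whether that output is acceptable for the prompt.
Agent = Callable[[str], str]
Judge = Callable[[str, str], bool]


def pass_rate(agent: Agent, judge: Judge, prompt: str, runs: int = 10) -> float:
    """Run the agent `runs` times and return the fraction the judge accepts."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    passed = sum(1 for _ in range(runs) if judge(prompt, agent(prompt)))
    return passed / runs


def evaluate(agent: Agent, judge: Judge, prompt: str,
             runs: int = 10, threshold: float = 0.8) -> bool:
    """Pass/fail on the aggregate pass rate, not on one run."""
    return pass_rate(agent, judge, prompt, runs) >= threshold


def make_toy_agent(seed: int) -> Agent:
    """Stand-in for a non-deterministic agent (assumption: a real one calls an LLM)."""
    rng = random.Random(seed)
    answers = ["The capital is Paris.", "Paris", "I think it's Lyon."]

    def agent(prompt: str) -> str:
        # Different calls can return different text, like a real agent.
        return rng.choice(answers)

    return agent


def toy_judge(prompt: str, output: str) -> bool:
    """Stand-in for an LLM judge: a plain keyword check."""
    return "paris" in output.lower()


if __name__ == "__main__":
    agent = make_toy_agent(seed=0)
    rate = pass_rate(agent, toy_judge, "What is the capital of France?", runs=20)
    print(f"pass rate: {rate:.2f}")
```

The threshold (0.8 here, picked arbitrarily) is where the judgment call moves to: you still have to decide how flaky is too flaky, and a real LLM judge adds its own run-to-run variance on top.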