| HN Mirror

	by ilusion 2 hours ago
	I'm very curious to see a benchmark for this. I've toyed with the idea myself but haven't put in the hard work to test these hypotheses about extracting learning signal from deep-agent traces.

1 comment

There are some benchmarks in the repo for AppWorld. Looks promising.