| HN Mirror



	by bisonbear 31 days ago
	I've been building a tool to do this: it builds a dataset from tasks in your repo, then A/B tests the agent with whatever change you're making, so you can measure the impact before actually shipping it. If you want to check it out: stet.sh
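
	For a rough idea of what that kind of A/B comparison could look like, here is a minimal sketch. It is not how stet.sh works internally (the comment doesn't say); `ab_compare`, `ABResult`, and the two runner callables are hypothetical names, and it assumes each task can be scored as a simple pass/fail. Every task is run against both the baseline and the changed agent, so you get per-task fixes and regressions rather than just two averages.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ABResult:
    baseline_pass_rate: float
    candidate_pass_rate: float
    fixed: int      # tasks that fail on baseline but pass on candidate
    regressed: int  # tasks that pass on baseline but fail on candidate


def ab_compare(
    tasks: Sequence[str],
    run_baseline: Callable[[str], bool],
    run_candidate: Callable[[str], bool],
) -> ABResult:
    """Run every task under both agent variants and compare pass/fail.

    The runners are assumed (hypothetically) to return True when the
    agent solves the task, e.g. when the repo's tests pass afterwards.
    """
    if not tasks:
        raise ValueError("need at least one task to compare")

    base = [bool(run_baseline(t)) for t in tasks]
    cand = [bool(run_candidate(t)) for t in tasks]
    n = len(tasks)

    # Paired view: the same task is judged under both variants.
    fixed = sum(1 for b, c in zip(base, cand) if not b and c)
    regressed = sum(1 for b, c in zip(base, cand) if b and not c)

    return ABResult(sum(base) / n, sum(cand) / n, fixed, regressed)


if __name__ == "__main__":
    # Toy stand-ins for real agent runs: sets of task ids each variant solves.
    baseline_solves = {"a", "b"}
    candidate_solves = {"a", "c", "d"}
    result = ab_compare(
        ["a", "b", "c", "d"],
        lambda t: t in baseline_solves,
        lambda t: t in candidate_solves,
    )
    print(result)
```

	With real agents the runs are noisy, so you'd likely want to repeat each task several times before trusting a difference.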