by panqueca 827 days ago

HumanEval Benchmark: 95.1 @ GPT-3.5

I wonder if it can be combined with projects like SWE-Agent to build powerful yet open-source coding agents.