by nojvek 311 days ago
Yeah, there was an old paper that blew math/physics benchmarks out of the water by letting the LLM write code and having a physics engine execute it. I don't have a link off the top of my head, but that seems to be the right direction.

LLM + general tool use seems to be quite effective.
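
To make the pattern concrete, here is a rough sketch of the "model writes code, a tool runs it" loop. It is not the paper's method: `fake_llm` is a made-up stand-in that returns a hard-coded snippet instead of calling a real model, plain `exec` stands in for the physics engine, and the free-fall question is only an illustration.

```python
import math


def fake_llm(question: str) -> str:
    """Hypothetical stand-in for a model call; returns Python source as text."""
    # A real model would generate this from the question. Hard-coded here.
    return (
        "g = 9.81   # gravitational acceleration, m/s^2\n"
        "h = 20.0   # drop height, m\n"
        "answer = math.sqrt(2 * h / g)\n"
    )


def run_generated_code(source: str) -> float:
    """Run the generated source and return whatever it stored in `answer`."""
    namespace = {"math": math}
    # Assumption: the code is trusted. exec is NOT a sandbox, so real
    # tool use would need an isolated process or container.
    exec(source, namespace)
    return namespace["answer"]


question = "How long does an object take to fall 20 m from rest?"
fall_time = run_generated_code(fake_llm(question))
print(f"{fall_time:.2f} s")
```

The point is the division of labor: the model only has to set up the problem, and the arithmetic is done by something that can't get it wrong.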