by furyofantares 320 days ago
I certainly expect a human to do better here, but if you want to show it, giving a one-line prompt to second-best LLMs and asking them to one-shot it isn't really the way to do it. Use Opus and o3, and give the task to an agent that can measure things and try more than once.

Great idea. Which agent to use?

I tried with Opus and o3, but I had to copy/paste the code and I wasn't sure that was the best way.

I tried 10 prompts and the simplest was the best (probably because the code itself is so simple).

Also, it wasn't done by a human but by my tool (the code in the repo is decompiled bytecode).
After reading another comment, I'm not sure my suggestion is any good. It may not test looking at code and improving it, and may instead test "writing optimized Mandelbrot in Java", which it has probably seen some great examples of.
This matches my experience with AI agents. Wiring up the correct feedback and paying attention to ensure they use it is important. Tests and linters are great, but there's usually much more that human devs look at for feedback, including perceived speed and efficiency.
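For what it's worth, here is a rough sketch of the kind of feedback harness I mean: check each attempt against a reference output first, then time the ones that pass, and hand the whole report back to the agent for its next try. Everything here is made up for illustration (`reference_mandelbrot`, `time_once`, `evaluate`, and the grid bounds are my own assumptions, not anything from the repo), and it's in Python just to keep it short, not because the original code is.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

Grid = List[List[int]]


def reference_mandelbrot(width: int, height: int, max_iter: int) -> Grid:
    """Plain escape-time Mandelbrot, used only as the correctness oracle."""
    grid = []
    for py in range(height):
        row = []
        for px in range(width):
            # Map the pixel onto an assumed viewport of [-2, 1] x [-1, 1].
            c = complex(-2.0 + 3.0 * px / width, -1.0 + 2.0 * py / height)
            z = 0j
            n = 0
            while n < max_iter and abs(z) <= 2.0:
                z = z * z + c
                n += 1
            row.append(n)
        grid.append(row)
    return grid


def time_once(fn: Callable[..., Grid], *args) -> float:
    """Wall-clock seconds for a single call."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def evaluate(
    candidates: Dict[str, Callable[..., Grid]],
    args: Tuple[int, int, int],
    measure: Callable[..., float] = time_once,
    attempts: int = 3,
) -> Tuple[Optional[str], Dict[str, Optional[float]]]:
    """Return (name of fastest correct candidate, per-candidate report).

    A report value of None means the candidate produced wrong output,
    so its speed is never considered.
    """
    expected = reference_mandelbrot(*args)
    report: Dict[str, Optional[float]] = {}
    for name, fn in candidates.items():
        if fn(*args) != expected:
            report[name] = None
            continue
        # Best of several runs, to damp timing noise a little.
        report[name] = min(measure(fn, *args) for _ in range(attempts))
    valid = {k: v for k, v in report.items() if v is not None}
    best = min(valid, key=valid.get) if valid else None
    return best, report
```

The point of returning the full report rather than just the winner is that the agent gets to see which attempts were rejected as incorrect and how the rest compare, instead of guessing from a single one-shot answer.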