HN Mirror



	by furyofantares 320 days ago
	I certainly expect a human to do better here, but if you wanna show it, giving a one-line prompt to second-best LLMs to one-shot it isn't really the way to do it. Use Opus and o3, and give it to an agent that can measure things and try more than once.

1 comments

top256 320 days ago

Great idea. Which agent to use?

I tried with Opus and o3, but I had to copy/paste the code and I wasn't sure that was the best way.

I tried 10 prompts and the simplest was the best (probably because the code is simplistic).


top256 320 days ago

Also, it wasn't done by a human but by my tool (the code in the repo is decompiled bytecode).


furyofantares 320 days ago

After reading another comment, I'm not sure my suggestion is any good. It may not test looking at code and improving it, and may instead test "writing optimized Mandelbrot in Java", which it has probably seen some great examples of.


gsoltis 320 days ago

This matches my experience with AI agents. Wiring up the correct feedback and paying attention to ensure they use it is important. Tests and linters are great, but there's usually much more that human devs look at for feedback, including perceived speed and efficiency.
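
For illustration only, here is a rough sketch of the kind of feedback loop being discussed: check each candidate against a reference for correctness, time the ones that pass, and keep the fastest. Nothing here comes from the repo in question. The function names (`escape_time_reference`, `escape_time_floats`, `escape_time_broken`, `measure`, `pick_best`) and the sample grid `POINTS` are all made up for the example, and a real agent setup would generate the candidates rather than hard-code them.

```python
import time


def escape_time_reference(cx, cy, max_iter=200):
    """Reference: escape-time count using Python complex numbers."""
    z, c = 0j, complex(cx, cy)
    for i in range(max_iter):
        if z.real * z.real + z.imag * z.imag > 4.0:
            return i
        z = z * z + c
    return max_iter


def escape_time_floats(cx, cy, max_iter=200):
    """Hypothetical candidate: same recurrence on plain floats."""
    x = y = 0.0
    for i in range(max_iter):
        if x * x + y * y > 4.0:
            return i
        x, y = x * x - y * y + cx, 2.0 * x * y + cy
    return max_iter


def escape_time_broken(cx, cy, max_iter=200):
    """Hypothetical candidate that is fast but wrong."""
    return 0


def measure(fn, points, repeats=3):
    """Best wall-clock time (seconds) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for cx, cy in points:
            fn(cx, cy)
        best = min(best, time.perf_counter() - start)
    return best


def pick_best(reference, candidates, points):
    """Return the fastest candidate whose output matches the reference,
    or None if every candidate fails the correctness check."""
    expected = [reference(cx, cy) for cx, cy in points]
    timed = []
    for fn in candidates:
        if [fn(cx, cy) for cx, cy in points] != expected:
            continue  # reject wrong answers before looking at speed
        timed.append((measure(fn, points), fn))
    if not timed:
        return None
    return min(timed, key=lambda pair: pair[0])[1]


# Assumed sample grid over part of the complex plane.
POINTS = [(x / 10.0, y / 10.0) for x in range(-20, 6) for y in range(-10, 11)]
```

Calling `pick_best(escape_time_reference, [escape_time_broken, escape_time_floats], POINTS)` would discard the broken candidate no matter how fast it runs. The point of the sketch is only that correctness and timing both have to be fed back, not that this is how any particular agent does it.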
