| HN Mirror



	by osti 55 days ago
	> propose, implement, measure, keep the wins

	That's pretty much what I did when I let Codex with gpt5.4xhigh improve my fairly complex CUDA kernel, which resulted in a 20x throughput improvement.
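
	For readers who want the quoted loop spelled out, here is a minimal sketch of "propose, implement, measure, keep the wins" as a greedy search. It is not the setup used above: the function `keep_the_wins`, the `min_gain` threshold, and the toy config and `fake_throughput` benchmark are all hypothetical names made up for illustration. In a real run, `measure` would build and time the actual kernel.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def keep_the_wins(
    baseline: T,
    proposals: Iterable[Callable[[T], T]],
    measure: Callable[[T], float],
    min_gain: float = 1.02,  # assumed noise margin: require at least a 2% gain
) -> Tuple[T, float, List[str]]:
    """Greedy loop: try each proposed change, keep it only if throughput improves."""
    best = baseline
    best_score = measure(best)
    kept: List[str] = []
    for change in proposals:                # propose
        candidate = change(best)            # implement
        score = measure(candidate)          # measure
        if score >= best_score * min_gain:  # keep the win, otherwise discard
            best, best_score = candidate, score
            kept.append(change.__name__)
    return best, best_score, kept


# Toy stand-ins (hypothetical): a "kernel" is just a config dict here.
def bigger_tile(cfg: dict) -> dict:
    return {**cfg, "tile": cfg["tile"] * 2}


def bad_idea(cfg: dict) -> dict:
    return {**cfg, "tile": 1}


def unroll(cfg: dict) -> dict:
    return {**cfg, "unroll": True}


def fake_throughput(cfg: dict) -> float:
    # Stand-in for a real benchmark; higher is better.
    return cfg["tile"] * (2.0 if cfg.get("unroll") else 1.0)


best, score, kept = keep_the_wins(
    {"tile": 4}, [bigger_tile, bad_idea, unroll], fake_throughput
)
```

	Each proposal is applied on top of the current best, so rejected experiments leave no trace in the result, much like abandoning an unsuccessful branch.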

1 comments

hackyhacky 55 days ago

Concretely, what interesting changes did it make to achieve such a significant improvement?

osti 55 days ago

A lot of it was beyond me, but these are the branch names for everything it tried, most of it unsuccessful, of course. About a 10x perf improvement came from architectural changes, and then another 2x from micro-optimizations.

https://pastebin.com/eac0SAYg
