Hacker News new | ask | show | jobs
by neonate 482 days ago
Looks like that's also at https://sakana.ai/ai-cuda-engineer/#limitations-and-bloopers
1 comments

Since i posted https://news.ycombinator.com/item?id=43124176, they have revised again to acknowledge that many of the other generated kernels are also broken:

> Furthermore, we find the system could also find other novel exploits in the benchmark’s tasks

“Novel exploit” is a pretty fancy and generous way of saying that some of the kernels wrote a constant value to the entire output because the evaluation code only tested one set of inputs that can pass if you replace the computation with a memset.