Hacker News
by gr71 42 days ago
Isn't the training data for these benchmark test cases already out there? How do LLMs perform at spec design for novel, complex systems?