Hacker News
by shubham13596 303 days ago
Yes, I think having an LLM generate OnlyConnect-style questions (with the right prompting) should solve the problem this benchmark seems to have: LLMs have most likely been trained on past years' OnlyConnect questions.