| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by Cynddl 108 days ago
	> “These are not isolated incidents. They are symptoms of a systemic problem: the benchmarks we rely on to measure AI capability are themselves vulnerable to the very capabilities they claim to measure.” As a researcher in the same field, hard to trust other researchers who put out webpages that appear to be entirely AI-generated. I appreciate it takes time to write a blog post after doing a paper, but sometimes I'd prefer just a link to the paper.