by iambateman
982 days ago
This really is a good article, and is seriously researched. But the conclusion in the headline - “AI hype is built on flawed test scores” - feels like a poor summary of the article. It _is_ correct to say that an LLM is not ready to be a medical doctor, even if it can pass the test. But I think a better conclusion is that test scores don’t help us understand LLM capabilities like we think they do. Using a human test for an LLM is like measuring a car’s “muscles” and calling it horsepower. They’re just different. But the AI hype is justified, even if we struggle to measure it.