Hacker News
by Lerc 85 days ago
It's reasonable to test their ability to do this, and it's worth working to make it better.

The issue is that people claim the performance is representative of a human's performance in the same situation. That gives an incorrect overall estimate of ability.