Hacker News
by hammock 542 days ago
In that vein, perhaps the delta between o3 @ 87.5% and Human @ 85% represents a deficit in the ability of text to communicate human reasoning.

In other words, it's possible humans can reason better than o3, but cannot articulate that reasoning as well through text - only in our heads, or through some alternative medium.

2 comments

It's possible humans reason better through text than not through text, so these models, having been trained on text, should be able to out-reason any person who's not currently sitting down to write.
I wonder how much of an effect the amount of time allowed to answer has on human performance.
Yeah, this is sort of meaningless without some idea of cost or consequences of a wrong answer. One of the nice things about working with a competent human is being able to tell them "all of our jobs are on the line" and knowing with certainty that they'll come to a good answer.