by imranq 1174 days ago
For all the hype, the benchmarks they report don't seem super compelling. Financial questions need to have 100% accurate responses (not the 60% or so they report); otherwise the answers are worse than useless and can cost billions of dollars in losses.

Isn't it enough to win 51% of the time?
It’s not trading anything, and even if it were, the above still wouldn’t be true. It’s just answering financial questions, so if it hallucinates and tells you Tesla isn’t a public company 40% of the time, then what’s the point? I’m using a very simple example here; I’m sure the model makes more intricate mistakes.