by tieTYT
34 days ago
> a human will DO worse then a 25% degradation

As I was reading this article, a similar thought occurred to me: "I wonder if that's better or worse than a human?" Unfortunately, there was no human baseline in this study. That said, there are studies that compare LLM and human performance. On long-running tasks, humans usually perform much better (around 5-7x better), so a human would probably do better than an LLM on this task. Humans lose to LLMs on narrow, well-specified text or symbolic reasoning tasks, where the model can exploit breadth, speed, and search. In those studies the LLM usually performed ~15% better than humans, though I saw some where the gap was as high as 80%. To my surprise, these studies were usually about "soft skills" like creativity and persuasion.

Show your edit by regurgitating this entire thread by hand on paper. Don't use any additional tools like find and replace.
Boom, there's your baseline. I can simulate the result in my head.
Guys, I'm basically saying the experiment doesn't accurately reflect the practical reality of how LLMs are actually used.