|
|
|
|
|
by machiaweliczny
501 days ago
|
|
This is very bearish for current AI. Seems like 99% reliability is still too small with compounding errors. But I wonder of this is inherently specific to longer context or if this just depends on how it’s trained. In theory longer context => more errors Although I think people are the same, too big problem and you are getting lost unless taking it in bites, so seems like OpenAI implementation is just bad because o3 hallucination benchmark shouldn’t lead to such poor performance |
|