by nl
157 days ago
One of the interesting things to me about this is that Codex 5.2 found the most complex of the exploits. This reflects my experience too. Opus 4.5 is my everyday driver - I like using it. But Codex 5.2 with Extra High thinking is just a bit more powerful. Also, despite what people say, I don't believe progress in LLM performance is slowing down at all. Instead, we are having more trouble generating tasks that are hard enough, and the frontier tasks the models are failing at or only just managing are so complex that most people outside the specialized field aren't interested enough to sit through the explanation.