by int_19h 583 days ago
They also tested OpenAI's o1, which always uses chain-of-thought (CoT). Yet it still performs worse.