| HN Mirror



	by binyu 46 days ago
	They all definitely still fall short of Opus 4.6, though. They are good but fail on extremely complex tasks, in contrast with a frontier model that will keep trying until it succeeds or exhausts the solution space.

2 comments

julianlam 46 days ago

Not by much, and moving goalposts makes for a bad comparison. Local open weight models are already more powerful than frontier models from only a year back.

If you believe what you read here, the gap is closing fast.


segmondy 45 days ago

Frontier models don't keep trying until they succeed. That's a harness problem, and best believe it: the best harnesses are private, not public.


binyu 45 days ago

It is much more a problem of context window size and model capability. Local models are not even remotely close to solving complex problems, even when used with the same harness.
