


	by mbowcut2 276 days ago
	I'm not surprised. People really thought the models just kept getting better and better?

3 comments

segmondy 276 days ago

The models are getting better and better.

giveita 276 days ago

That's expected. No one will release a worse model.

sodality2 276 days ago

Not a cheaper one, or better in some ways, or lower latency, etc?

giveita 276 days ago

They do that too, but right now it is an arms race as well.

guerrilla 276 days ago

Maybe. How would I know?

jMyles 276 days ago

...even if the agent did "cheat", I think that having the capacity to figure out that it was being evaluated, find the repo containing the logic of that evaluation, and find the expected solution to the problem it faced... is "better" than anything that the models were able to do a couple years ago.