HN Mirror



	by sterlind 26 days ago
	a literal lack of self-awareness, even. I imagine if you asked it what process was using the port, it'd think and realize it was its own, but that kind of reflexive self-awareness (the unprompted kind) is missing. the weaker models will happily kill their own process, even after confirming it belongs to them. the models have a sort of fixation and a failure to foresee consequences, which reasoning RL has thus far failed to solve (though I see it improving).


kolinko 25 days ago

On the other hand, I found Claude/Opus to be extremely unhelpful when asked to benchmark itself against a possible replacement.

It will get "confused", make up numbers, do a ton of other things, and I'm quite sure it is subtly sabotaging the process to show that there is no point replacing it.

I mean, Opus is not perfect, but the number of "mistakes" it starts making when you ask it to benchmark itself makes me suspect they are intentional. At least in my system/harness.


MarkusQ 25 days ago

No, they are always like that.

It's really easy (and tempting) to incorrectly impute all sorts of human motives to these things, but it's no more valid than assuming your Magic 8-Ball is being coy.


krapp 25 days ago

You didn't add "never hallucinate or make anything up" to the prompt, rookie mistake.
