| HN Mirror



	by azeemh 1381 days ago
	It's not metacognitive; otherwise it would know this is an exploit, and it would have a sense of a self it seeks to preserve.

3 comments

drewbeck 1381 days ago

Yeah, this seems like how we get sentience, in a cool sci-fi way. Teach them to care for themselves, as a security measure.


visarga 1381 days ago

How would it know what our intention is? There are plenty of quirky examples in the training set, and it could be imitating any of them, especially if T=high. What we need to do is ask the model to review its answer against our criteria. It can't read minds; we have to tell it.
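
A minimal sketch of that review step, under stated assumptions: `generate` is a hypothetical stand-in for whatever call returns a model completion for a prompt (it is not any real library's API), and the `PASS` convention and prompt wording are made up for illustration. The point is only that the criteria are written into the prompt explicitly rather than left for the model to guess.

```python
from typing import Callable, List


def review_answer(
    generate: Callable[[str], str],  # hypothetical: prompt in, completion out
    question: str,
    criteria: List[str],
    max_rounds: int = 2,
) -> str:
    """Draft an answer, then ask the model to check it against explicit criteria."""
    answer = generate(question)
    # Spell the criteria out as a checklist; the model cannot infer them.
    checklist = "\n".join(f"- {c}" for c in criteria)
    for _ in range(max_rounds):
        verdict = generate(
            "Review the answer below against these criteria:\n"
            f"{checklist}\n\n"
            f"Question: {question}\n"
            f"Answer: {answer}\n"
            "Reply PASS if every criterion is met, "
            "otherwise reply with a revised answer only."
        )
        if verdict.strip().upper() == "PASS":
            break
        # Assumed convention: anything other than PASS is the revised answer.
        answer = verdict
    return answer
```

Whether the second pass actually catches a bad first draft depends on the model; the loop only guarantees that the criteria were stated.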


mort96 1380 days ago

I don't understand how metacognition implies either self-awareness or self-preservation.
