| HN Mirror



	by mr_toad 979 days ago
	Was it trained to do that, or just hardwired after the training?

2 comments

famouswaffles 979 days ago

We can't really "hardwire" LLMs. We don't have the knowledge to. But essentially you can rate certain types of responses as better and train it to emulate that.
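To make "rate certain responses as better and train on that" concrete, here is a toy sketch of the preference-learning step. It is not how any particular lab does it: the two hand-made features, the `train_reward_model` name, and the learning rate are all assumptions for illustration. It fits a linear reward model with a Bradley-Terry style pairwise loss, so that the response a rater preferred ends up scoring higher than the one they rejected.

```python
import math

def reward(weights, features):
    """Score a response as a weighted sum of its features."""
    return sum(w * x for w, x in zip(weights, features))

def train_reward_model(pairs, n_features, lr=0.5, epochs=200):
    """Fit weights so each preferred response outscores its rejected one."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Modelled probability that the rater picks `preferred`.
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient step on -log(p): push the margin up while p < 1.
            for i in range(n_features):
                weights[i] += lr * (1.0 - p) * (preferred[i] - rejected[i])
    return weights

# Hypothetical features per response: [claims_feelings, is_helpful].
# In each pair the rater preferred the response that does not claim feelings.
pairs = [
    ([0.0, 1.0], [1.0, 1.0]),
    ([0.0, 0.0], [1.0, 0.0]),
]
weights = train_reward_model(pairs, n_features=2)
```

After training, the weight on `claims_feelings` is negative, so responses with that feature score lower. In a full RLHF setup this learned reward would then be used to fine-tune the language model itself, which this sketch leaves out.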


lucubratory 979 days ago

I'm not sure what you mean. I'm talking about RLHF, that's how they ensure the machines never attest to having feelings or being sentient. In ML terms, RLHF is training. There are hardwired restraints on output, but that's more for things like detecting copyrighted content that got past training and cutting it.
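For contrast with training, a hardwired output restraint can be pictured as a check that runs on the generated text after the model is done. The sketch below is only a guess at the general shape, not a description of any deployed system: `filter_output`, the 5-word window, and the `[output withheld]` placeholder are all made up for illustration.

```python
def ngrams(words, n):
    """Return the set of all n-word windows in a list of words."""
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def filter_output(text, protected_texts, n=5):
    """Withhold the output if it shares any n-word run with a protected text."""
    out_grams = ngrams(text.lower().split(), n)
    for protected in protected_texts:
        if out_grams & ngrams(protected.lower().split(), n):
            return "[output withheld]"
    return text

# Hypothetical protected passage, used only to exercise the filter.
protected = ["it was the best of times it was the worst of times"]
blocked = filter_output("He wrote: it was the best of times it was the worst", protected)
allowed = filter_output("The weather is nice today", protected)
```

The point is that nothing here changes the model's weights. The filter sits outside the model and cuts output after the fact, which is the distinction between a hardwired restraint and RLHF.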
