| HN Mirror



	by cwillu 500 days ago
	Was it toxicity as understood by the model, though, or just a cluster of concepts that you've chosen to label as toxic? I.e., is this something that could (and therefore will) be turned towards identifying toxic concepts as understood by the Chinese or US government, or towards identifying (say) pro-union concepts so they can be down-weighted in a released model, etc.?

1 comments

mayukhdeb 500 days ago

We localized "toxic" neurons by contrasting each neuron's activations on toxic vs. normal texts. It's a method inspired by old-school neuroscience.
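
For readers who want to see the shape of that contrast, here is a minimal sketch, not the authors' actual code. It assumes activations have already been extracted as one vector per text. The helper names (`contrast_scores`, `top_neurons`, `fake_activations`), the synthetic data, and the choice of a plain difference of means as the contrast statistic are all illustrative assumptions. Note that the result depends entirely on which texts were labelled "toxic".

```python
import random
from statistics import mean


def contrast_scores(toxic_acts, normal_acts):
    """Per-neuron mean activation on toxic texts minus mean on normal texts.

    Each argument is a list of activation vectors, one vector per text.
    A difference of means is an assumed statistic, chosen for simplicity.
    """
    n_neurons = len(toxic_acts[0])
    return [
        mean(row[j] for row in toxic_acts) - mean(row[j] for row in normal_acts)
        for j in range(n_neurons)
    ]


def top_neurons(scores, k):
    """Indices of the k neurons with the largest contrast score."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Synthetic stand-in for real model activations (hypothetical data):
# neuron PLANTED is made to fire more strongly on the "toxic" set.
rng = random.Random(0)
N_NEURONS = 8
PLANTED = 3


def fake_activations(n_texts, boost):
    rows = []
    for _ in range(n_texts):
        row = [rng.gauss(0.0, 1.0) for _ in range(N_NEURONS)]
        row[PLANTED] += boost
        rows.append(row)
    return rows


toxic = fake_activations(200, boost=5.0)
normal = fake_activations(200, boost=0.0)

scores = contrast_scores(toxic, normal)
selected = top_neurons(scores, k=1)
print(selected)  # index of the neuron with the largest toxic-vs-normal gap
```

Swapping in a different labelled set (any category of text against a baseline) would rank neurons for that category instead, which is the concern raised in the parent comment.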


immibis 500 days ago

Defining all politics as toxic is concerning, if it's not just a proof of concept. That's something dictatorships do so that people won't speak up.
