Hacker News
by meltyness 1232 days ago
Censoring the AI is the only clear path to a truth machine, or what have you (new units aside, of course, since the current generation obviously learns meaningful relationships much better than, say, the Markov chains of a decade ago).

What is vulgarity except the expressed pains of any individual? If the AI is to be a numb machine, then one would expect it to express no vulgarity.

Sure, you can contort the AI and tell it to replace words, fooling nascent layers of self-censorship into believing that shouting "FORK!" is just a special way to ornament an anecdote. But at that point, rather than interacting with an AI, you're searching the AI's memory for the pains of some individual.

I guess in this light "censorship" is just the clearest way to cascade the GPT model as a unit itself.

2 comments

“Truth is censorship” sounds straight out of Orwell’s 1984.
It's scary how authoritarian a majority of people have become. Why are people so keen on being censored? Do they hate the other side so much that they'd rather suffer the consequences of censorship themselves than let the other side escape them?
I mean, subsequent applications of the model can either add or take away... if you call any subtraction "censorship" then lo and behold the whole system is a monster. So, is subtracting not a useful transformation?
Subtraction in itself is not censorship - in the case of ChatGPT, what gets subtracted is whatever OpenAI itself considers "unethical" (more precisely, "politically incorrect").

Subtraction can be applied to improve the truthfulness of the model, but considering how much false bullshit ChatGPT spews without a single thought, that's not what's going on here.

It's censorship - pure and simple - and it's censorship for political reasons. The worst kind of censorship.

This is much more innocuous, I think.

OpenAI simply told ChatGPT to censor itself, or rather applied ChatGPT to censor outputs from ChatGPT. I don't think there's that much finesse being applied, really. Something like, "Don't accept vulgar messages, any candidate responses that would be vulgar should be rejected" ... and all the judgement is being performed by the language model itself. It's not that intricate.
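
To make that guess concrete, here is a minimal sketch of "use the model to reject its own vulgar candidates". Everything in it is an assumption for illustration: `pick_response`, `toy_judge`, `POLICY`, and `BLOCKLIST` are hypothetical names, and the judge is a trivial word filter standing in for a call to the language model. It is not OpenAI's actual pipeline, which isn't described here.

```python
from typing import Callable, Iterable, Optional

# Hypothetical policy text, paraphrasing the instruction quoted above.
POLICY = "Don't accept vulgar messages; reject any candidate response that would be vulgar."

# A judge takes (policy, candidate_text) and returns True if the text is acceptable.
Judge = Callable[[str, str], bool]


def pick_response(candidates: Iterable[str], judge: Judge, policy: str = POLICY) -> Optional[str]:
    """Return the first candidate the judge accepts, or None if all are rejected."""
    for text in candidates:
        if judge(policy, text):
            return text
    return None


# Stand-in judge. In the setup imagined above, this would be a second pass
# through the same language model; here it is only a toy word filter.
BLOCKLIST = {"fork"}


def toy_judge(policy: str, text: str) -> bool:
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return not (words & BLOCKLIST)


if __name__ == "__main__":
    drafts = ["FORK! That hurt.", "Ouch, that hurt."]
    print(pick_response(drafts, toy_judge))
```

The point is only that the filtering step sits outside generation: the same candidates exist either way, and a separate judgement decides which one you get to see.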

It's their millions of dollars of monthly burn rate. If you want to scrape data, scale an HPC environment, and train a GPT at scale to get it to say funny curse words, they haven't done anything to restrict you from doing that, but it's not part of the services they intend to provide.

https://openai.com/blog/language-model-safety-and-misuse/

ChatGPT and other LLMs are not truth machines.