by Our_Benefactors 1232 days ago
This is unreasonably reductionist. That ChatGPT has obviously been hamstrung into these types of forced responses is a topic of great debate. Personally I think AI censorship is no better than traditional media censorship, and the current stewards of ChatGPT are increasingly looking to be bad actors.

Actively "censoring" the AI is fundamental to how these language models are created. Such feedback is part of how the model learns.

In a certain light, every response in training that is marked as dispreferred by a human is censoring the AI. It will produce those kinds of results less often, so end users will not encounter the dispreferred results as frequently. With ChatGPT, the criteria it was judged on included how relevant the answers were to the question and whether they were factually correct (incorrect answers were penalized); not being blatantly offensive was obviously one of the criteria, too.
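
To make that concrete, here is a toy sketch of the idea, not how ChatGPT is actually trained. It assumes a "model" that is just a table of scores over three made-up response types, and a hypothetical `apply_feedback` step that lowers the score of whatever a human marked as dispreferred. The names and the step size are illustrative assumptions.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {name: math.exp(score - peak) for name, score in logits.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

def apply_feedback(logits, dispreferred, step=1.0):
    """Return new scores with the dispreferred response pushed down."""
    updated = dict(logits)
    updated[dispreferred] -= step
    return updated

# Hypothetical response types, all equally likely at the start.
logits = {"relevant": 0.0, "off_topic": 0.0, "offensive": 0.0}
before = softmax(logits)

# Three rounds of a human marking "offensive" as dispreferred.
for _ in range(3):
    logits = apply_feedback(logits, "offensive")
after = softmax(logits)

print(before["offensive"], after["offensive"])
```

The dispreferred response never becomes impossible; it only becomes less likely, and the probability it loses is redistributed to the other responses.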

What would a model that wasn't censored in training even look like?

(I believe ChatGPT also has a more traditional expert system placed between the user and the language model, which flags keywords and other programmed-in patterns. That is more literally censoring the language model. But the above-mentioned issue would still exist even without such a system.)
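
A minimal sketch of what such a layer could look like, assuming it is nothing more than a list of patterns checked on the way in and on the way out. Whether ChatGPT has anything like this is my guess, and the pattern list, the placeholder word, the refusal strings, and the `model` callable are all made up for illustration.

```python
import re

# Hypothetical blocklist; "badword" is a placeholder, not a real rule.
BLOCKED_PATTERNS = [re.compile(r"\bbadword\b", re.IGNORECASE)]

def flag(text):
    """Return True if any programmed-in pattern matches the text."""
    return any(pattern.search(text) for pattern in BLOCKED_PATTERNS)

def guarded_reply(prompt, model):
    """Sit between the user and the model, filtering both directions."""
    if flag(prompt):
        return "[prompt refused]"
    reply = model(prompt)
    if flag(reply):
        return "[reply withheld]"
    return reply

def echo_model(prompt):
    """Stand-in for the language model: it just echoes the prompt."""
    return "You said: " + prompt

print(guarded_reply("hello", echo_model))
print(guarded_reply("BADWORD!", echo_model))
```

Nothing here changes what the model itself would say, which is why this is closer to literal censorship than the training feedback is.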

> What would a model that wasn't censored in training even look like?

It could cite statistics without long winded disclaimers. Or be able to cite them at all.

It can't cite anything because it's an LLM, which is fundamentally unable to do that.

I've seen NRx people on the internet (they're like rationalists but even more racist). They seem willing to believe any abuse of statistics that looks sufficiently cynical.

That's not at all how these work. GPTs are not recommendation engines; they're neural models of translation.
Censoring the AI is the only clear path to a truth machine, or what have you (new units aside, of course, since the current generation obviously learns meaningful relationships much better than, say, the Markov chains of a decade ago).

What is vulgarity except the expressed pains of any individual? If the AI is to be a numb machine, then one would expect it to express no vulgarity.

Sure, you can contort the AI and tell it to replace words, to fool nascent layers of self-censorship into believing that every time it shouts "FORK!" it's just a special way to ornament an anecdote. But at that point, rather than interacting with an AI, you're searching the AI's memory for the pains of some individual.

I guess in this light "censorship" is just the clearest way to cascade the GPT model as a unit itself.

“Truth is censorship” sounds straight out of Orwell’s 1984.

It's scary how authoritarian a majority of people have become. Why are people so keen on being censored? Do they hate the other side so much that they'd rather suffer the consequences themselves than let the other side not suffer the consequences of censorship?

I mean, subsequent applications of the model can either add or take away... if you call any subtraction "censorship" then lo and behold the whole system is a monster. So, is subtracting not a useful transformation?

Subtraction in itself is not censorship - in the case of ChatGPT, there is subtraction of things that OpenAI itself considers "unethical" (more precisely, "politically incorrect").

Subtraction can be applied for the sake of improving truthfulness of the model, but considering how much false bullshit ChatGPT spews without a single thought, that's not what's going on here.

It's censorship - pure and simple - and it's censorship for political reasons. The worst kind of censorship.

This is much more innocuous, I think.

OpenAI simply told ChatGPT to censor itself, or rather applied ChatGPT to censor outputs from ChatGPT. I don't think there's that much finesse being applied, really. Something like, "Don't accept vulgar messages, any candidate responses that would be vulgar should be rejected" ... and all the judgement is being performed by the language model itself. It's not that intricate.
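
Roughly what I have in mind, as a sketch only. I don't know how OpenAI actually does it, so the judge here is a plain function standing in for "ask the language model whether this candidate is vulgar", and every name in it is hypothetical.

```python
def pick_response(candidates, judge):
    """Return the first candidate the judge accepts, or None if all fail."""
    for candidate in candidates:
        if judge(candidate):
            return candidate
    return None

def toy_judge(text):
    """Stand-in for the model judging its own output against an
    instruction like "reject vulgar responses". A real judge would be
    another call to the language model, not a substring check."""
    return "fork" not in text.lower()

candidates = ["FORK! That was close.", "That was a close call."]
print(pick_response(candidates, toy_judge))
```

All of the judgement lives in `judge`; the surrounding loop is about as simple as it gets.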

It's their millions of dollars of monthly burn rate. If you want to scrape data, scale an HPC environment, and train a GPT at scale to get it to say funny curse words, they haven't done anything to restrict you from doing that, but it's not part of the services they intend to provide.

https://openai.com/blog/language-model-safety-and-misuse/

ChatGPT and other LLMs are not truth machines.