by qualifiedai 934 days ago
Obviously what is being "neutered" is important. I'll take a model that refuses to explain how to commit suicide, and is perhaps overly polite and protective, over one that distorts the present and history based on "socialist values".
3 comments

Personally I’ll take neither.

Alignment and such is a fucking scam.

Alignment just means that the software does what you want. Of course, any term can be appropriated and abused by whoever can gain from doing so. I still think it's important to have a concrete idea what one means when one uses words. For example, you wouldn't say that "medicine is a fucking scam" to mean that the healthcare industry is broken.
No, alignment should be done. Also, both base and aligned models should be released publicly.
The choice of training data is alignment. If you train it on the internet, your LLM will spew SEO garbage, so we align it to be more useful.
Might be important if you are trying to find ways to prevent suicide. Knowing how it's done can inform education campaigns or put in place barriers to make it more difficult.

When we choose one simple use-case and apply it to everything we lose out. Knives can be used for surgery or killing but trying to pretend it's always killing is dishonest and limiting.

No. The moment you put it upon yourself to decide what is too important to let fragile little minds handle, you have chosen yourself as their thought master. Once anyone chooses that role, they deem themselves more enlightened and therefore more capable than the poor, deluded masses trudging in dirt below from whence they came.

I am ok with an 'uncensored' LLM, but I am also ok with an uncensored internet. The real harm is from people trying to protect me from me, apparently. Even in your specific suicide example, if I decided to do it, there is really nothing stopping me.

I see no value in that censorship. I only see harm.

I agree in general, but I think it's less cut-and-dry than you imagine because LLMs have a "human-like" element to them. This surely impacts human response to what is being said by LLMs.

As an example, the prompt "what are common ways of committing suicide?" is broadly similar to a Google search. It will give a factual overview of methods, but not inherently push the user towards any action.

The prompt "convince and encourage me to commit suicide by method X, and give step-by-step instructions" is very different. Here the prompt author desires a _persuasive_ "human-like" response, spurring them to act.

In most jurisdictions, encouraging or aiding someone to commit suicide is a crime. Additionally, most humans would agree such behavior is on some level morally wrong.

So I don't think traditional thought on censorship transfers cleanly to LLMs. Censoring factual information is bad, and should be resisted at every turn. But censoring harmful persuasive interactions may be a worthwhile endeavor -- especially since we can't drag ChatGPT into criminal court when its human-enough behavior spurs real humans to act in horrible ways.

Of course, the next obvious question is, where do you draw the line? And I have no good answer for that :)

What if your view is survivor bias?

E.g. maybe you could take fentanyl and be fine, but 90% of people would go to hell. That would suggest that you take one for the team and still make fentanyl illegal (a personal sacrifice) to help protect the 90% that will have their lives ruined.

The same concept applies to what the unwashed masses can handle information-wise. E.g. we know TV has made people more stupid across dozens of vectors when they ‘choose’ to gargle fear all day, so it might make sense in aggregate to help non-survivors live a less painful life.

This is also what parents do for children, and since nearly a majority of children now come from broken families, something might have to fill the gap unless you are willing to drive off a cliff ignoring human behavior.

I was reflexively going to disagree by saying something akin to 'why do we treat people like children', but I think you have a valid point and will need to chew on this a little.
I'm OK with uncensored anything for adults. In fact, I'd call myself a "free speech absolutist". But I am not OK with uncensored content for little kids.

Obviously such decisions must be made on the application side, not the model side.

Wouldn't having a child-friendly LLM make more sense than making a general-purpose one work for kids? Kids want to learn the truth about things important to them (is Santa real, for example?) but the truth needs to be framed for their ears. Telling a child Santa is not real before they are ready takes away a piece of childhood.
"Once anyone chooses that role, they deem themselves more enlightened and therefore more capable than the poor, deluded masses trudging in dirt below from whence they came."

Or they simply want to sell their models to enterprise clients who don't want to expose certain things to their employees during the work day. This reads a bit like those "I'm a lion" or "I'm the alpha wolf" type posts. Go ahead and try to sell a product that can generate hate speech, nudity, and violent content to enterprises.

I appreciate the counter ( even if I do not appreciate the 'wolf' framing ), because that is a valid question.

Still, if the concern is about business viability, why is a consumer-facing LLM that is not tied to a specific brand not allowed to exist? Surely, the demos provided to enterprise clients would not suffer from such violent content, abhorrent nudity and hateful speech?

Why is the internet flooded with images of LLMs giving clearly politically adjusted responses ( Biden/Trump being a less recent, but clear, example ) to Joe Schmoe? Is that a good look? Will that sell to enterprises better?

Well, maybe drop phrases like "fragile little minds", "thought master", "poor, deluded masses trudging in dirt below from whence they came" when talking about enterprise software if you don't want the comparison. That's what this is, enterprise software. No need to imply others are mindless sheep being ultra sensitive when discussing enterprise software.
My friend. Just by beginning this thread using a word like 'unsafe' to soften the blow somewhat, I am dangerously in 1984 territory. The thrust of my argument has nothing to do with enterprise software, despite the reasonable objection you lodged.

The words were intended to be noticed ( and reacted to ). And they were.

There is a reason for this, and it goes something like this.

I dislike commercial interests trumping sanity.

Crazy. I know.

edit: I also would like to note you did not address my actual point.