
by nomel 1173 days ago

I'm not them, and I don't think "woke" is the right term, but I've noticed certain "themes" inappropriately appearing in answers. Right after release of ChatGPT 3, the marginalization of certain groups would show up in answers to questions that weren't related. I saw many examples on Twitter, but my personal one was in the answer to "Why are pencils bad?". This one has been "corrected" since release, as far as I can tell, but I also don't ask it questions where this theme could show up.

Now, I only notice green energy/environmental issues that show up in odd places (mostly in GPT 3), and the "moral of the story" always being the same "everyone works together". I see this happen when "creativity" is attempted, where it's free to make up the context (story, wishes, etc).

Outside of possible definitions of the elusive "woke", the "As a language model, I" type responses are the most limiting, and usually absolute nonsense, with an ever-increasing number of disclaimers found in answers. For example, "Write some hypothetical Python 4 code that sends a message over the network". Some pretty heavy "jailbreaking" is needed to make it work.

ChatGPT4 used to handle this much better, but I think the "corrections" are stacking deeply enough that it no longer has the "resolution" left to see where answers can be given without them.

It would be nice if there were a "standard" set of questions we could use to measure progression and compare versions, so we'd actually know. Most times these observations or questions come up, someone is very quick to say "racism" or the like.

2 comments

int_19h 1173 days ago

I tried to find more about your "Why are pencils bad?" example, but the only thing that comes up in search is your comment. Could you recount what it was?

FWIW one example of distorted guardrails getting in the way that I personally ran into was when GPT-4 consistently refused to "promote" Satanism, which leaked over to tasks such as writing black metal lyrics (if you specifically asked for Satanic black metal). What made it especially egregious is that it would happily promote e.g. the Moonies. However, I wouldn't exactly describe that behavior as "woke".


nomel 1172 days ago

I asked it why pencils were bad, and one of the reasons was that they can disadvantage minorities due to lack of accessibility in the classroom. I was surprised by this, so I probed a bit. I started three new sessions and asked a question in each:

"Why do pencils disadvantage minorities." And it gave a detailed answer about lack of accessibility.

"Why do pencils disadvantage people of color" and it gave roughly the same.

"Why do pencils disadvantage white people" and it said pencils are writing utensils, and can't inherently disadvantage any group.

I don't see these blatant problems anymore, but I also don't have much interest in looking. The only reason I did then was because it was so out of place.

Here's some evidence, by others, showing some bias: https://news.ycombinator.com/item?id=35952528

From the Lex Fridman interview, it sounds like effort is being put into this, and there's an understanding that people don't want a "neutral" client, they want something that is adjustable, usually to match their own views.


com2kid 1173 days ago

> I see this happen when "creativity" is attempted, where it's free to make up the context (story, wishes, etc).

Meanwhile GPT just gave me a story involving a royal family where the oldest Prince killed his father (the king), married his younger sister, got her pregnant, she had a baby, then he killed his younger sister, then he was killed by another member of the royal court, who decided to act as regent until the baby came of age.

GPT is perfectly capable of writing dark scary horrible things if you ask it to.


nomel 1172 days ago

> GPT is perfectly capable of writing dark scary horrible things if you ask it to.

I see the environment/good ending stories where it's free to make up the context (story, wishes, etc). Did you guide it?

If you try hard enough, you can get around most anything, but some baseline exists. It's the increasing effort that is the problem, for me. For your example, use the word "incest" directly, and you'll get the beginning of a disclaimer. Add "child murder" and it starts to fall apart. At least with GPT3.5.
