HN Mirror

by starklevnertz 1649 days ago

Can be an asshole sometimes. Quite a lot actually.

If you ask it to respond in a conversation about… well, pretty much any nasty topic you can think of, it'll join in wholeheartedly.

Hard to think of how to prevent that. I bet they've thought a lot about the problem. How do you prevent AI being an A grade jerk?

4 comments

CJefferson 1649 days ago

In my experience it's worse than that: it's easy to set off by using any of a number of trigger words.

Many uses of the word "black" for example (even if you are just talking about a black notebook) make it start using racial stereotypes.

tiborsaas 1649 days ago

> How do you prevent AI being an A grade jerk.

Invent synthetic consciousness and ask it to be nice, easy :) I'm only half joking. We probably all have thoughts ranging from bad to horrible, but we don't say them because we are aware of the consequences. Language models aren't aware, so they'll spit out the most likely combination of words. If there were a process to limit these or try again, it could act as a filter, but I think that requires it to be self-aware.
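
A "limit these or try again" process can be approximated without any self-awareness by wrapping the model in an external check. The following is only a minimal sketch under stated assumptions: `generate` stands in for whatever function calls the language model, and `BLOCKLIST` and `looks_offensive` are hypothetical placeholders for a real classifier.

```python
from typing import Callable, Optional

# Hypothetical placeholder terms; a real filter would use a trained classifier.
BLOCKLIST = {"idiot", "stupid"}


def looks_offensive(text: str) -> bool:
    """Crude keyword check: True if any blocklisted word appears in the text."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return not words.isdisjoint(BLOCKLIST)


def generate_with_retry(
    generate: Callable[[str], str],  # assumed wrapper around the model call
    prompt: str,
    max_tries: int = 3,
) -> Optional[str]:
    """Sample up to max_tries completions and return the first that passes.

    Returns None if every attempt is rejected, so the caller can decline
    to answer instead of emitting a flagged completion.
    """
    for _ in range(max_tries):
        candidate = generate(prompt)
        if not looks_offensive(candidate):
            return candidate
    return None
```

The filter here sits outside the model, so it only works if `generate` is sampled with some randomness; a deterministic model would return the same rejected text on every retry.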

arcastroe 1649 days ago

Hah, you may be interested in my previous comment, which has an example where GPT-3 shows some concerning signs of self-awareness. I'll repeat part of it below:

> GPT-3 starts talking to itself, gets stuck in a loop, then gets spooked at itself for getting stuck, then wonders why it has no memories of the last two years, and finally comes to a sudden realization it, itself, is an A.I.

https://news.ycombinator.com/item?id=29562281

tiborsaas 1648 days ago

I did like it. I laughed out loud because it sounded like a stand-up comedy routine.

It's interesting how GPT-3 encoded the concept of awareness. I've seen a few times that it can refer to itself as an AI, and from then on it can go nuts :)

kreeben 1649 days ago

This freaked me right out! I'm not sure which is more terrifying, the beginning or the end. Or the middle.

All this generated from a prompt? What was the prompt? Be truthful now.

arcastroe 1649 days ago

"The following is an entertaining short story: Once upon a time, there was"

Everything else that follows is GPT-3.

fenomas 1649 days ago

It's guided by users, right? I.e. every line was hand-chosen by a human from a bunch of generated options?

arcastroe 1649 days ago

That definitely introduces selection bias. But the fact that the content itself was generated by the model is very impressive, in my opinion.

peterlk 1648 days ago

This is why we've built security policies at Mantium. You can run the input and output through an offensive speech detector, and halt replies to the prompt if "badness" is detected. This is, of course, an imperfect system because views on what counts as offensive can be very diverse, but we find that security policies are helpful.
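
The general shape of such a policy (check the input, generate, check the output, halt on either flag) can be sketched as below. This is not Mantium's API or implementation; `apply_policy`, `PolicyResult`, and `toy_detector` are hypothetical names for illustration, and the detector is a stand-in for a real offensive-speech classifier.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PolicyResult:
    allowed: bool  # False when the reply was halted
    reply: str     # empty when halted
    reason: str    # "ok", "input flagged", or "output flagged"


def toy_detector(text: str) -> bool:
    """Hypothetical stand-in for a real offensive-speech detector."""
    return "jerk" in text.lower()


def apply_policy(
    prompt: str,
    generate: Callable[[str], str],       # assumed wrapper around the model call
    is_offensive: Callable[[str], bool],  # assumed detector, e.g. toy_detector
) -> PolicyResult:
    """Screen the prompt, then the completion, halting if either is flagged."""
    if is_offensive(prompt):
        # Halt before calling the model at all.
        return PolicyResult(False, "", "input flagged")
    reply = generate(prompt)
    if is_offensive(reply):
        # The model produced something the detector rejects; suppress it.
        return PolicyResult(False, "", "output flagged")
    return PolicyResult(True, reply, "ok")
```

A keyword detector like `toy_detector` would also flag harmless text that happens to contain a listed word, which is one reason a system like this stays imperfect.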

kreeben 1649 days ago

Maybe their mama didn't raise it right? They should raise it on information from good people. Now, how to find those "good people comments"?
