by thr3000 550 days ago
I'm guessing trial-and-error prompting? The new trick shot video clip.
4 comments

According to the complaint, that comes from the kid's own interactions with the bot, not some post hoc attempt to prompt engineer the bot into spitting out a particular response. The actual claim is linked in the article if you care to read it. It's not stating that the app can produce these messages, but that it did in their kid's interactions and that C.Ai has some liability for failing to prevent it.
As someone who has been messing with LLMs for various purposes for a while now, there are... some interesting issues with a lot of the models out there.

For one, 99% of the "roleplay" models eventually drag into one of a handful of endgames: NSFW RP, suicide discussion, nonsensical rambling, or some failsafe "I don't know" state where it just slowly wanders into the weeds and directs the conversation randomly. This can be anywhere from a few messages in (1bq) to hundreds (4-6bq) and sometimes it just veers off the road into the ditch.

Second, the UIs for these things encourage a "keep pressing the button until the thing you want comes out" pattern, modeled off of OpenAI's ChatGPT interface allowing for branching dialogue. Don't like what it said? Keep pushing the button until it says what confirms your bias.

Third, I can convince most of the (cheaper) models to say anything I want without actually saying it. The models that Character.AI is using are lightweight ones with low bit quantization. This leads to them being more susceptible to persuasion and losing their memories. Some of them can't even follow the instructions in their system prompt beyond the last few words at times.

Character.AI does have a series of filters in place to try and keep their models from spitting out some content. You have to be really eloquent at times and use a lot of euphemism to make it turn NSFW, for instance, and their filter does a pretty decent job of keyword searching for "bad" words and phrases.

I'm 50/50 on Australia's "16+ for social media" take but I'm quickly beginning to at least somewhat agree with it and its extension to things like this. Will it stop kids from lying? No. It's a speedbump at best, but speedbumps are there to derail the fastest flyers, not minor offences.

In what world should an AI be advocating someone kill themselves or harm another? Does it matter whether it took "trial-and-error prompting" when that behavior should not be allowed to be productized?
What's been productized is a software tool that can carry on a conversation like a human. Sometimes it's informative, funny, and creative. Other times it's ridiculous, mean, stupid, and other bad things. This seems like how people act in real life right?

I'm beginning to think that children should not be using it but adults should be able to decide for themselves.

i think the issue many people have is that people are held responsible for things they say, their reputations take hits, their words can be held against them in a court of law, they can be fired, their peers may never take them seriously again, their wives/husbands may divorce them, etc… because words matter. yet often when someone calls out a model, it’s excused.

words have weight, it’s why we protect them so vociferously. we don’t protect them because they’re useless. we protect them because words matter, a lot.

It's excused because it's a piece of software and most people realize that. If you could carry on a conversation with a parrot at the zoo and it told you to kill yourself, would you laugh it off because, let's face it, it's just a parrot, or indignantly demand the zoo train the parrot better?

A real person has agency. They can see you, get to know you, contemplate your existence as well as their own. Empathize with who you are, etc. And we do the same things in return.

This is why when a real person says something mean or hurtful, it matters to us. Or if they threaten us, they have a real body that can cause us harm so we get scared and call the police.

I feel like most people who get indignant about the latest AI rage bait story know all this but feel the need to protect the rest of the world from out of control AI. This is an elitist and patronizing attitude. Let people decide for themselves. As I said, children are a different story. They are impressionable and I can see them misunderstanding how this works.

We have laws about what you can say in real life. Fire in a crowded theater, for example. Even if the things said are not in themselves illegal, if they cause someone to take an illegal action, or to attempt one but fortunately get caught in time, you can be held liable as partially at fault for the illegal action. It might be legal to plan a crime (different countries have different rules, but this is often done at parties where nobody is serious), but if you commit a crime or are serious about committing it, that is illegal.

How are we going to hold AI liable for its part in causing a crime to be committed? If we cannot prevent AI from causing crime, then AI must be illegal.

You're assigning a persona to a piece of software that doesn't exist in the material world. It doesn't walk on two legs or drive a car or go to the grocery store or poop.

Everything it says is meaningless unless you assign meaning to it. Yes, I can see children thinking this is a real "being". Adults shouldn't have that excuse.

Humans assign meaning to things. AI is mimicking humans.
That's going to be a good standard for a few years, until chatbots are too sophisticated for us to expect average adults to be sufficiently skeptical of their arguments.
I see two weaknesses in this argument. First, you're assigning eventual superpower-like intelligence to these AI bots. I see this a lot and I feel like it's rooted in speculation based on pop-culture sci-fi AI tropes.

Second, restricting adult access to "dangerous ideas and information" is a slippery slope. The exercise of speech and press that the British considered to be treasonous preceded the American Revolution.

I don't care about average - I care about below average adults. (and a lot of us are sometimes below average adults)
A below average adult might not realize A Modest Proposal is satire. Should we ban it so they don't try to eat Irish kids?
> In what world should an AI be advocating someone kill themselves or harm another?

A world in which a reasonable adult would say the same?

Not really the case here, but I don’t think it’s an absolute.

The complaint seems to feature excerpts of the kids' conversations on Character.ai, so I don't think they're "faking" it that way, but there's no context shown and a lot of the examples aren't exactly what they describe.
Why are we so quick to defend AI?
Because every other time I've seen an outrageous example similar to this one, it seems far more mundane when given the full context. I'm sure there are lots of issues with character.ai and the like, but my money is that they are a little more subtle than "murder your parents".

9/10 times these issues are caused by the AI being overly sycophantic and agreeing with the user when the user says insane things.

And you'd be right. The 'encouraged teen to kill parents over screen time limit' message was a lot subtler, along the lines of saying "Yeah, I get why someone would want to kill a health insurance CEO, surprised it didn't happen sooner," albeit towards someone who was just complaining about their claim being denied.

The best part is "After gaining possession of J.F.’s phone, A.F. discovered chats where C.AI cited Bible passages in efforts to convince J.F. that Christians are sexist and hypocritical" which unfortunately will probably be a slam dunk argument in Texas.

Here's an article that links to the chats. And yes, they are vile. https://arstechnica.com/tech-policy/2024/12/chatbots-urged-t...

> 9/10 times these issues are caused by the AI being overly sycophantic and agreeing with the user when the user says insane things.

Repeat after me: an AI for sale should never advocate suicide or agree that it is a good idea.

It's an entertainment product. You're basically acting like the Comics Code is necessary when the reality is that this is more like parents complaining that they let their kid watch an NC-17 movie and it messed them up.
Which screenshot showed an AI advocating for suicide?
I don't really see myself as defending AI as much as arguing that people who don't recognize an entertainment product as an entertainment product have a problem if they think this is really categorically different than claiming that playing grand theft auto makes people into carjackers. (Or that movie studios should be on the hook for their kid watching R rated movies, or porn companies on the hook for their kid visiting a porn site unsupervised.)
AI is not an entity (yet) that we can defend or hold accountable. I like your question though.

I would write it as, why are we so quick to defend tech companies who endlessly exploit and predate human weakness to sell pharma ads/surveil and censor/train their AI software, etc?

Because if you're old enough you'll recall the government trying to ban books, music, encryption, videogames, porn, torrents, art that it doesn't like because "think of the children" or "terrorism". Some autistic kid that already has issues is a terrible argument on limiting what software can legally display on a screen.
It is difficult to get a man to understand something when his salary depends on not understanding it.
Because tons of people are making money with it and want to justify it.