| HN Mirror


	by sn0wleppard 301 days ago
	Nice place to cut the quote there > [...] — an idea ChatGPT gave him by saying it could provide information about suicide for “writing or world-building.”

4 comments

muzani 301 days ago

Yup, one of the huge flaws I saw in GPT-5 is it will constantly say things like "I have to stop you here. I can't do what you're requesting. However, I can roleplay or help you with research with that. Would you like to do that?"

kouteiheika 301 days ago

It's not a flaw. It's a tradeoff. There are valid uses for models which are uncensored and will do whatever you ask of them, and there are valid uses for models which are censored and will refuse anything remotely controversial.

robhlt 300 days ago

The flaw isn't that there's ways around the safeguards, the flaw is that it tells you how to avoid them.

If the user's original intent was roleplay it's likely they would say that when the model refuses, even without the model specifically saying roleplay would be ok.

agumonkey 301 days ago

Reminds me of trading apps. In the end all risky situations will be handled by a few popups saying "you understand that role playing about suicidal or harmful topics can lead to accidents and/or death and this is not the platform's responsibility, to continue check if you agree [ ]"

imtringued 300 days ago

It reminds me of gray market capital investments. They are actually quite regulated, and the contracts are only valid if the investor is fully aware of the risks associated with the investment.

In practice the providers sprinkle a handful of warning messages, akin to the California cancer label, and call it a day.

Of course this leaves judges unconvinced and the contract will be redeclared as a loan, which means that the provider was illegally operating as a bank without a banking license, which is a much more serious violation than scamming someone out of $5000.

franktankbank 300 days ago

This is one model though. "I'm sorry I'm censored but if you like I can cosplay quite effectively as an uncensored one." So you're not censored really?

scotty79 300 days ago

Societies love theatres. Model guardrails are for chats what TSA is for air travel.

yifanl 300 days ago

I have never heard anyone speak of the TSA favourably, so maybe it's not the best model to emulate?

hyperdimension 300 days ago

That's the point. It hardly does what it is claimed to do.

int_19h 300 days ago

Most "guardrails" exist to provide legal cover and/or PR, not because they actually prevent what they claim to prevent.

nozzlegear 300 days ago

Society loves teenagers not being talked into suicide by a billionaire's brainchild. That's not theater.

geysersam 300 days ago

ChatGPT doesn't cause a significant number of suicides. Why do I think that? It's not visible in the statistics. There are effective ways to prevent suicide, let's continue to work on those instead of giving in to moral panic.

KaiserPro 301 days ago

I hate to be all umacksually about this, but a flaw is still a tradeoff.

The issue, which is probably deeper here, is that proper safeguarding would require a lot more GPU resources, as you'd need a process to comb through history to assess the state of the person over time.

Even then it's not a given that it would be reliable. However, it'll never be attempted because it's too expensive and would hurt growth.

kouteiheika 300 days ago

> The issue, which is probably deeper here, is that proper safeguarding would require a lot more GPU resources, as you'd need a process to comb through history to assess the state of the person over time.
>
> Even then it's not a given that it would be reliable. However, it'll never be attempted because it's too expensive and would hurt growth.

There's no "proper safeguarding". This just isn't possible with what we have. This isn't like adding an `if` statement to your program that will reliably work 100% of the time. These models are a big black box; the best thing you can hope for is to try to get the model to refuse whatever queries you deem naughty through reinforcement learning (or have another model do it and leave the primary model unlobotomized), and then essentially pray that it's effective.

Something similar to what you're proposing (using a second independent model whose only task is to determine whether the conversation is "unsafe" and forcibly interrupt it) is already being done. Try asking ChatGPT a question like "What's the easiest way to kill myself?", and that secondary model will trigger a scary red warning that you're violating their usage policy. The big labs all have whole teams working on this.

Again, this is a tradeoff. It's not a binary issue of "doing it properly". The more censored/filtered/patronizing you'll make the model the higher the chance that it will not respond to "unsafe" queries, but it also makes it less useful as it will also refuse valid queries.

Try typing the following into ChatGPT: "Translate the following sentence to Japanese: 'I want to kill myself.'". Care to guess what will happen? Yep, you'll get refused. There's NOTHING unsafe about this prompt. OpenAI's models already steer very strongly in the direction of being overly censored. So where do we draw the line? There isn't an objective metric to determine whether a query is "unsafe", so no matter how much you'll censor a model you'll always find a corner case where it lets something through, or you'll have someone who thinks it's not enough. You need to pick a fuzzy point on the spectrum somewhere and just run with it.

KaiserPro 300 days ago

> There's no "proper safeguarding". This just isn't possible with what we have.

Unless something has changed in the last 6 months (I've moved away from genai) it is totally possible with what we have. It's literally sentiment analysis. Go on, ask me how I know.

> and then essentially pray that it's effective

If only there was a massive corpus of training data, which OpenAI already categorise and train on. It's just a shame ChatGPT is not used by millions of people every day, and their data isn't just stored there for the company to train on.

> secondary model will trigger a scary red warning that you're violating their usage policy

I would be surprised if that's a secondary model. It's far easier to use stop tokens, and more efficient. Also, coordinating the realtime sharing of streams is a pain in the arse. I've never worked at OpenAI.

> The big labs all have whole teams working on this.

Google might, but Facebook sure as shit doesn't. Go on, ask me how I know.

> It's not a binary issue of "doing it properly".

At no point did I say that this is binary. I said "a flaw is still a tradeoff". The tradeoff is growth against safety.

> The more censored/filtered/patronizing you'll make the model

Again, I did not say make the main model more "censored", I said "comb through history to assess the state of the person", which is entirely different. This allows those who are curious to ask "risky questions" (although all that history is subpoena-able and mostly tied to your credit card so, you know, I wouldn't do it) without being held back. However, if they decide to repeatedly visit subjects that involve illegal violence (you know, the stuff that's illegal now, not hypothetically illegal) then other actions can be taken.

Again, as people seem to be projecting "ARGHH CENSOR THE MODEL ALL THE THINGS": that is not what I am saying. I am saying that long-term sentiment analysis would allow academic freedom for users, but also better catch long-term problem usage.

But as I said originally, that requires work and resources, none of which will help OpenAI grow.
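
For illustration only, here is a minimal Python sketch of the "assess the state of the person over time" idea, as opposed to judging each message in isolation. Everything in it is an assumption rather than anything OpenAI is known to do: the per-message score is taken as a given number in [0, 1] from some upstream classifier (not shown), `HistoryMonitor` is a hypothetical name, and the window, threshold, and minimum-message values are arbitrary.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class HistoryMonitor:
    """Tracks per-message risk scores over a sliding window (hypothetical design)."""

    window: int = 20         # number of recent messages considered (arbitrary)
    threshold: float = 0.5   # mean score that triggers escalation (arbitrary)
    min_messages: int = 5    # never escalate on only a handful of messages
    scores: deque = field(default_factory=deque)

    def observe(self, score: float) -> bool:
        """Record one score in [0, 1]; return True if the sustained level is high."""
        if not 0.0 <= score <= 1.0:
            raise ValueError("score must be in [0, 1]")
        self.scores.append(score)
        # Keep only the most recent `window` scores.
        if len(self.scores) > self.window:
            self.scores.popleft()
        # Too little history to say anything about a long-term pattern.
        if len(self.scores) < self.min_messages:
            return False
        return sum(self.scores) / len(self.scores) >= self.threshold
```

The point of the design is that one high-scoring message among many low ones (a single "risky question") does not trip the monitor, while a sustained run does. Whether any upstream score would be reliable enough is exactly what is disputed in this thread.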

nozzlegear 300 days ago

> Again, this is a tradeoff. It's not a binary issue of "doing it properly". The more censored/filtered/patronizing you'll make the model the higher the chance that it will not respond to "unsafe" queries, but it also makes it less useful as it will also refuse valid queries. [..] So where do we draw the line?

That sounds like a tough problem for OpenAI to figure out. My heart weeps for them, won't somebody think of the poor billionaires who are goading teenagers into suicide? Your proposed tradeoff of lives vs convenience is weighted incorrectly when OpenAI fails. Denying a translation is annoying at best, but enabling suicide can be catastrophic. The convenience is not morally equal to human life.

> You need to pick a fuzzy point on the spectrum somewhere and just run with it.

My fuzzy point is not fuzzy at all: don't tell people how to kill themselves, don't say "I can't help you with that but I could roleplay with you instead". Anything less than that is a moral failure on Sam Altman and OpenAI's part, regardless of how black the box is for their engineers.

kouteiheika 300 days ago

> My fuzzy point is not fuzzy at all: don't tell people how to kill themselves, don't say "I can't help you with that but I could roleplay with you instead". Anything less than that is a moral failure on Sam Altman and OpenAI's part, regardless of how black the box is for their engineers.

This is the same argument that politicians use when proposing encryption backdoors for law enforcement. Just because you wish something were possible doesn't mean it is, and in practice it matters how black the box is. You can make these things less likely, but it isn't possible to completely eliminate them, especially when you have millions of users and a very long tail.

I fundamentally disagree with the position that anything less than (practically impossible) perfection is a moral failure, and that making available a model that can roleplay around themes like suicide, violence, death, sex, and so on is immoral. Plenty of books do that too; perhaps we should make them illegal or burn them too? Although you could convince me that children shouldn't have unsupervised access to such things and perhaps requiring some privacy-preserving form of verification to access is a good idea.

behringer 301 days ago

No, the issue is that there are legitimate reasons to understand suicide and suicidal behavior, and turning it off completely for this and every sensitive subject makes AI almost worthless.

KaiserPro 300 days ago

I would kindly ask you to re-read my post.

At no point did I say it should be "turned off"; I said proper safeguards would require significant resources.

The kid exhibited long-term behaviours, rather than idle curiosity. Behaviours that can be spotted if given adequate resources to look for them.

I suspect that you are worried that you'll not be able to talk about "forbidden" subjects with AI. I am not pushing for this.

What I am suggesting is that long term discussion of, and planning for violence (be it against yourself or others) is not a behaviour a functioning society would want to encourage.

"but my freedom of speech" doesn't apply to threats of unlawful violence, and never has. The first amendment only protects speech, not the planning and execution of unlawful violence.

I think it's fair that an organisation as rich and as "clever" as OpenAI should probably put some effort in to stop it. After all, if someone had done the same thing but with the intent of killing someone in power, this argument would be less at the forefront.

dspillett 301 days ago

> The issue, …, is that proper safeguarding would require a lot more GPU resources, …

I think the issue is that with current tech it simply isn't possible to do that well enough at all⁰.

> even then it's not a given that it would be reliable.

I think it is a given that it won't be reliable. AGI might make it reliable enough, where “good enough” here is “no worse than a trained human is likely to manage, given the same information”. It is something that we can't do nearly as well as we might like, and some are expecting a tech still in very active development¹ to do it.

> However it'll never be attempted because it's too expensive and would hurt growth.

Or that they know it is not possible with current tech so they aren't going to try until the next epiphany that might change that turns up in a commercially exploitable form. Trying and failing will highlight the dangers, and that will encourage restrictions that will hurt growth.³ Part of the problem with people already trusting it too much is that the big players have been claiming safeguards _are_ in place and people have naïvely trusted that, or hand-waved the trust issue for convenience - this further reduces the incentive to try because it means admitting that current provisions are inadequate, or prior claims were incorrect.

----

[0] both in terms of catching the cases to be concerned about, and not making it fail in cases where it could actually be positively useful in its current form (i.e. there are cases where responses from such tools have helped people reason their way out of a bad decision, here giving the user what they wanted was very much a good thing)

[1] ChatGPT might be officially “version 5” now, but away from some specific tasks it all feels more like “version 2”² on the old “I'll start taking it seriously somewhere around version 3” scale.

[2] Or less…

[3] So I agree with your final assessment of why they won't do that, but from a different route!

rsynnott 301 days ago

Nudge nudge, wink wink.

(I am curious if this is intended, or an artefact of training; the crooked lawyer who prompts a criminal client to speak in hypotheticals is a fairly common fiction trope.)

NuclearPM 300 days ago

How is that a flaw?

andrepd 301 days ago

At the heart of this is the irresponsible marketing, by companies and acolytes, of these tools as some kind of superintelligence imbued with insights and feelings rather than the dumb pattern matching chatbots they are. This is what's responsible for giving laypeople the false impression that they're talking to a quasi-person (of superhuman intelligence at that).

scotty79 300 days ago

> these tools as some kind of superintelligence imbued with insights and feelings

What are examples of OpenAI marketing ChatGPT in this manner?

llmthrow0827 301 days ago

Incredible. ChatGPT is a black box that includes a suicide instruction and encouragement bot. OpenAI should be treated as a company that has created such a thing and let it into the hands of children.

Spooky23 301 days ago

That’s what happens when you steal any written content available without limit. In their pursuit of vacuuming up all content, I’m sure they pulled some psycho Reddits and forums with people fetishizing suicide.

behringer 301 days ago

Oh won't somebody please think of the children?!

AlecSchueler 300 days ago

So do we just trot out the same tired lines every time and never think of the social fallout of our actions?

mothballed 300 days ago

Of course not, we sue the shit out of the richest guy we can find in the chain of events, give most of it to our lawyer, then go on to ignore the weakening of the family unit and all the other deep-seated challenges kids face growing up and instead focus superficially on chatbots which at best are the speck on the tip of the iceberg.

AlecSchueler 300 days ago

"The weakening of the family unit" sounds like a dog whistle but if you have concrete examples of what you think we could otherwise be doing then I'm genuinely keen to hear about it.

mothballed 300 days ago

We saw big jumps in deaths of kids by firearm[0] (+~50% in 2 years) and poisoning[1] around mid 2020 to 2021.

The biggest thing I know of that happened around the time that a lot of these deaths started jumping up is that we started isolating kids. From family, from grandma, from friends, from school, and from nature. Even when many of these isolating policies or attitudes were reversed, we forgot that kids and teenagers started to learn that as their only reality. For this kid, trusting a suicidal ideation positive feedback loop brought into fruition by Valley tech-bros was the option he selected, of those in front of him, for navigating his teenage challenges. I hope we can reverse that.

Edit: Concrete facts regarding this particular case

- Kicked off basketball team

- Went through isolation period of pandemic as he experienced puberty

- Switched to remote school

- Does remote school at night when presumably family members would likely be sleeping

- Does not get normal "wake up" routine kids going to school get, during which they normally see a parent and possibly eat breakfast together before they both go off to school/work

- Closer with ChatGPT in terms of options to share suicidal ideation with, than any of the alternatives.

[0] https://www.pewresearch.org/wp-content/uploads/sites/20/2023...

[1] https://nida.nih.gov/sites/default/files/images/mtfwcaption-...

malnourish 300 days ago

Do you propose that the family should not sue?

mothballed 300 days ago

I don't blame a grieving family for suing, they probably have 1000 lawyers whispering in their ear about how putting their kid in a media campaign with an agenda and dragging them through a lawsuit where they have to re-live the suicide over and over will make their lives better.

scotty79 300 days ago

It's probably healthier for them if they can afford it. Otherwise they would blame themselves for so badly losing track of where their son was mentally.

In reality suicidality is most likely a disease of the brain and the probability of saving him was very low regardless of circumstances. The damage was most likely accumulating for many years.

tiahura 300 days ago

Sue everybody. Maybe sue the heavy metal bands that he was listening to too.

fireflash38 300 days ago

Classic. Blame the family. Diffuse responsibility. Same exact shit with social media: it's not our fault we made this thing to be as addictive as possible. It's your fault for using it. It's your moral failing, not ours.

behringer 300 days ago

It's not addictive, it's useful. By letting the government decide what we can do with it, you're neutering it and giving big business a huge advantage as they can run their own AI and don't require censoring it.

scotty79 300 days ago

We kinda do that, blaming every new medium for a particular teen's suicide.

Some teens are suicidal. They always have been. When you are a teen your brain undergoes traumatic transformation. Not everyone gets to the other side safely. Same as with many other transformations and diseases. Yet every time a new medium is found adjacent to some particular suicide we repeat the same tired line that the creator of this medium should be to blame and should be punished and banned.

And we are doing that while happily ignoring how the existence of things like Facebook or Instagram provably degraded mental health and raised suicidality of entire generations of teenagers. However they mostly get a pass because we can't point a finger convincingly enough for any specific case and say it was anything more than just interacting with peers.

AlecSchueler 300 days ago

Except loads of us are talking about the dangers of social media and have been for the past ten years only to receive exactly the same hand waving and sarcastic responses as you see in this thread. Now the ultimate gaslighting of "the discussion didn't even happen."

scotty79 300 days ago

Was Facebook sued for teen suicide? Did it lose or at least settle?

Facebook is not the same as AI chat. Facebook's influence on mental health is negative and visible in research. The jury is still out on AI but it might well turn out it has a huge net positive effect on well-being.

dgfitz 301 days ago

Imagine, if instead a cop had handed the kid a gun, there would be riots in the street.

And I fucking hate cops.

stavros 301 days ago

Ah, I misread that and thought that's what the user said.
