| HN Mirror

	by url00 214 days ago
	I don't want a more conversational GPT. I want the _exact_ opposite. I want a tool with the upper limit of "conversation" being something like LCARS from Star Trek. This is quite disappointing as a current ChatGPT subscriber.

18 comments

tekacs 214 days ago

That's what the personality selector is for: you can just pick 'Efficient' (formerly Robot) and it does a good job of answering tersely?

https://share.cleanshot.com/9kBDGs7Q

link

pants2 214 days ago

FWIW I didn't like the Robot / Efficient mode because it would give very short answers without much explanation or background. "Nerdy" seems to be the best, except with GPT-5 instant it's extremely cringy like "I'm putting my nerd hat on - since you're a software engineer I'll make sure to give you the geeky details about making rice."

"Low" thinking is typically the sweet spot for me - way smarter than instant with barely a delay.

link

gnat 214 days ago

I hate its acknowledgement of its personality prompt. Try having a series of back and forth and each response is like “got it, keeping it short and professional. Yes, there are only seven deadly sins.” You get more prompt performance than answer.

link

sheepscreek 214 days ago

I like the term prompt performance; I am definitely going to use it:

> prompt performance (n.)

> the behaviour of a language model in which it conspicuously showcases or exaggerates how well it is following a given instruction or persona, drawing attention to its own effort rather than simply producing the requested output.

link

jjcob 213 days ago

Might be a result of using LLMs to evaluate the output of other LLMs.

LLMs probably get higher scores if they explicitly state that they are following instructions...

link

resfirestar 213 days ago

It's like writing an essay for a standardized test, as opposed to one for a college course or for a general audience. When taking a test, you only care about the evaluation of a single grader hurrying to get through a pile of essays, so you should usually attempt to structure your essay to match the format of the scoring rubric. Doing this on an essay for a general audience would make it boring, and doing it in your college course might annoy your professor. Hopefully instruction-following evaluations don't look too much like test grading, but this kind of behavior would make some sense if they do.

link

siva7 213 days ago

That's the equivalent of a performative male, so better call it performative model behaviour.

link

cma 213 days ago

Pay people $1 an hour and ask them to choose A or B, which is shorter and more professional:

A) Keeping it short and professional. Yes, there are only seven deadly sins

B) Yes, there are only seven deadly sins

Also have all the workers know they are being evaluated against each other and if they diverge from the majority choice their reliability score may go down and they may get fired. You end up with some evaluations answered as a Keynesian beauty contest / Family Feud "survey says"-style guess instead of their true evaluation.
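
A toy sketch of that incentive, in Python. The rater pool, the believed majority answer, and the set of conforming raters are all made-up assumptions for illustration, not data from any real labeling job.

```python
from collections import Counter

def reported_votes(true_prefs, believed_majority, conformists):
    """Return what each rater reports.

    Raters whose index is in `conformists` report the answer they believe
    the majority will pick; everyone else reports their true preference.
    """
    return [
        believed_majority if i in conformists else pref
        for i, pref in enumerate(true_prefs)
    ]

# Hypothetical pool: 7 of 10 raters truly think B is shorter and more professional.
true_prefs = ["B"] * 7 + ["A"] * 3
# Assume the first 5 raters fear diverging and guess that "A" is the crowd answer.
reported = reported_votes(true_prefs, believed_majority="A", conformists={0, 1, 2, 3, 4})

true_tally = Counter(true_prefs)
reported_tally = Counter(reported)
print(true_tally, reported_tally)
```

Under these assumed numbers the recorded tally favors A even though most raters privately prefer B, which is the kind of distortion described above.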

link

totallymike 213 days ago

I can’t tell if you’re being satirical or not…

link

cma 213 days ago

https://time.com/6247678

link

jdelman 214 days ago

This is even worse on voice mode. It's unusable for me now.

link

op00to 213 days ago

I use Efficient or robot or whatever. It gives me a bit of sass from time to time when I subconsciously nudge it into taking a “stand” on something, but otherwise it’s very usable compared to the obsequious base behavior.

link

kivle 214 days ago

If only that worked for conversation mode as well. At least for me, and especially when it answers me in Norwegian, it will start off with all sorts of platitudes and whole sentences repeating exactly what I just asked. "Oh, so you want to do x, huh? Here is answer for x". It's very annoying. I just want a robot to answer my question, thanks.

link

withinboredom 213 days ago

At least it gives you an answer. It usually just restates the problem for me and then ends with “so let’s work through it together!” Like, wtf.

link

qwertytyyuu 213 days ago

repeating what is being asked is fine I think, sometimes it thinks you want something different to what you actually want. what is annoying is "that's an incredibly insightful question that delves into a fundamental..." type responses at the start.

link

layer8 214 days ago

At least for the Thinking model it's often still a bit long-winded.

link

bogtog 214 days ago

Unfortunately, I also don't want other people to interact with a sycophantic robot friend, yet my picker only applies to my conversation

link

Leynos 214 days ago

Hey, you leave my sycophantic robot friend alone.

link

coolestguy 214 days ago

Sorry that you can't control other people's lives & wants

link

DonaldPShimoda 214 days ago

This is like arguing that we shouldn't try to regulate drugs because some people might "want" the heroin that ruins their lives.

The existing "personalities" of LLMs are dangerous, full stop. They are trained to generate text with an air of authority and to tend to agree with anything you tell them. It is irresponsible to allow this to continue while not at least deliberately improving education around their use. This is why we're seeing people "falling in love" with LLMs, or seeking mental health assistance from LLMs that they are unqualified to render, or plotting attacks on other people that LLMs are not sufficiently prepared to detect and thwart, and so on. I think it's a terrible position to take to argue that we should allow this behavior (and training) to continue unrestrained because some people might "want" it.

link

simonw 214 days ago

What's your proposed solution here? Are you calling for legislation that controls the personality of LLMs made available to the public?

link

bogtog 214 days ago

There aren't many major labs, and they each claim to want AI to benefit humanity. They cannot entirely control how others use their APIs, but I would like their mainline chatbots to not be overly sycophantic and generally to not try and foster human-AI friendships. I can't imagine any realistic legislation, but it would be nice if the few labs just did this of their own accord (or were at least shamed more for not doing so)

link

DonaldPShimoda 214 days ago

At the very least, I think there is a need for oversight of how companies building LLMs market and train their models. It's not enough to cross our fingers that they'll add "safeguards" to try to detect certain phrases/topics and hope that that's enough to prevent misuse/danger — there's not sufficient financial incentive for them to do that of their own accord beyond the absolute bare minimum to give the appearance of caring, and that's simply not good enough.

link

andy99 214 days ago

Pretty sure most of the current problems we see re drug use are a direct result of the nanny state trying to tell people how to live their lives. Forcing your views on people doesn’t work and has lots of negative consequences.

link

daveguy 214 days ago

Okay, I'm intrigued. How in the fuck could the "nanny state" cause people to abuse heroin? Is there a reason other than "just cause it's my ideology"?

link

The_Rob 214 days ago

Comparing LLM responses to heroine is insane.

link

DonaldPShimoda 214 days ago

I'm not saying they're equivalent; I'm saying that they're both dangerous, and I think taking the position that we shouldn't take any steps to prevent the danger because some people may end up thinking they "want" it is unreasonable.

link

thedrexster 214 days ago

heroin is the drug, heroine is the damsel :)

link

yunohn 214 days ago

You’re absolutely right!

The number of heroine addicts is significantly lower than the number of ChatGPT users.

link

lynx97 213 days ago

I am with you. Insane comparisons are the first signs of an activist at work.

link

boredhedgehog 213 days ago

Disincentivizing something undesirable will not necessarily lead to better results, because it wrongly assumes that you can foresee all consequences of an action or inaction.

Someone who now falls in love with an LLM might instead fall for some seductress who hurts him more. Someone who now receives bad mental health assistance might receive none whatsoever.

link

umanwizard 213 days ago

Your argument suggests that we shouldn’t ever make laws or policy of any kind, which is clearly wrong.

link

DonaldPShimoda 213 days ago

I disagree with your premise entirely and, frankly, I think it's ridiculous. I don't think you need to foresee all possible consequences to take action against what is likely, especially when you have evidence of active harm ready at hand. I also think you're failing to take into account the nature of LLMs as agents of harm: so far it has been very difficult for people to legally hold LLMs accountable for anything, even when those LLMs have encouraged suicidal ideation or physical harm of others, among other obviously bad things.

I believe there is a moral burden on the companies training these models to not deliberately train them to be sycophantic and to speak in an authoritative voice, and I think it would be reasonable to attempt to establish some regulations in that regard in an effort to protect those most prone to predation of this style. And I think we need to clarify the manner in which people can hold LLM-operating companies responsible for things their LLMs say — and, preferably, we should err on the side of more accountability rather than less.

---

Also, I think in the case of "Someone who now receives bad mental health assistance might receive none whatsoever", any psychiatrist (any doctor, really) will point out that this is an incredibly flawed argument. It is often the case that bad mental health assistance is, in fact, worse than none. It's that whole "first, do no harm" thing, you know?

link

samdoesnothing 214 days ago

Who are you to determine what other people want? Who made you god?

link

DonaldPShimoda 214 days ago

...nobody? I didn't determine any such thing. What I was saying was that LLMs are dangerous and we should treat them as such, even if that means not giving them some functionality that some people "want". This has nothing to do with playing god and everything to do with building a positive society where we look out for people who may be unable or unwilling to do so themselves.

And, to be clear, I'm not saying we necessarily need to outlaw or ban these technologies, in the same way I don't advocate for criminalization of drugs. But I think companies managing these technologies have an onus to take steps to properly educate people about how LLMs work, and I think they also have a responsibility not to deliberately train their models to be sycophantic in nature. Regulations should go on the manufacturers and distributors of the dangers, not on the people consuming them.

link

pmarreck 214 days ago

here’s something I noticed: If you yell at them (all caps, cursing them out, etc.), they perform worse, similar to a human. So if you believe that some degree of “personable answering” might contribute to better correctness, since some degree of disagreeable interaction seems to produce less correctness, then you might have to accept some personality.

link

throwaway-0001 213 days ago

Interesting: Codex just did the work once I swore. I wasted 3-4 prompts being nice, and the angry style made it do it.

link

subscribed 213 days ago

Actually DeepSeek performs better for me in terms of prompt adherence.

link

EGreg 214 days ago

ChatGPT 5.2: allow others to control everything about your conversations. Crowd favorite!

link

alooPotato 214 days ago

so good.

link

umanwizard 213 days ago

You’re getting downvoted but I agree with the sentiment. The fact that people want a conversational robot friend is, I think, extremely harmful and scary for humanity.

Giving people what makes them feel good in the short term is not actually necessarily a good thing. See also: cigarettes, alcohol, gambling, etc.

link

angrydev 214 days ago

Exactly. Stop fooling people into thinking there’s a human typing on the other side of the screen. LLMs should be incredibly useful productivity tools, not emotional support.

link

halifaxbeard 214 days ago

How would you propose we address the therapist shortage then?

link

nikkwong 214 days ago

Who ever claimed there was a therapist shortage?

link

joquarky 214 days ago

The process of providing personal therapy doesn't scale well.

And I don't know if you've noticed, but the world is pretty fucked up right now.

link

dash2 213 days ago

... because it doesn't have enough therapists?

link

typpilol 213 days ago

People are so naive if they think most people can solve their problem with a one hour session a week.

link

Galacta7 214 days ago

https://www.statnews.com/2024/01/18/mental-health-therapist-...

link

bloqs 213 days ago

i think most western governments and societies at large

link

treyd 213 days ago

It's a demand side problem. Improve society so that people feel less of a need for therapists.

link

NoGravitas 213 days ago

Oh, so you think we should improve society somewhat, eh? But you yourself live in society. Gotcha!

link

abeppu 214 days ago

I think therapists in training, or people providing crisis intervention support, can train/practice using LLMs acting as patients going through various kinds of issues. But people who need help should probably talk to real people.

link

neilwilson 213 days ago

Remember that a therapist is really a friend you are paying for.

Then make more friends.

link

alterom 213 days ago

>Remember that a therapist is really a friend you are paying for.

That's an awful, and awfully wrong definition that's also harmful.

It's also disrespectful and demeaning to both the professionals and people seeking help. You don't need to get a degree in friendship to be someone's friend. And having friends doesn't replace a therapist.

Please avoid saying things like that.

link

ahmeneeroe-v2 214 days ago

outlaw therapy

link

NullCascade 213 days ago

I don't know why you're being downvoted. Denmark's health system is pretty good except adult mental health. SOTA LLMs are definitely approaching a stage where they could help.

link

93po 214 days ago

something something bootstraps

link

93po 214 days ago

Food should only be for sustenance, not emotional support. We should only sell brown rice and beans, no more Oreos.

link

spaqin 213 days ago

Oreos won't affirm your belief that suicide is the correct answer to your life problems, though.

link

DocTomoe 213 days ago

That is mostly a dogmatic question, rooted in (western) culture, though. And even we have started to - begrudgingly - accept that there are cases where suicide is the correct answer to your life problems (usually as of now restricted to severe, terminal illness).

link

nikkwong 214 days ago

The point the OP is making is that LLMs are not reliably able to provide safe and effective emotional support as has been outlined by recent cases. We're in uncharted territory and before LLMs become emotional companions for people, we should better understand what the risks and tradeoffs are.

link

karianna 214 days ago

I wonder if statistically (hand waving here, I’m so not an expert in this field) the SOTA models do as much or as little harm as their human counterparts in terms of providing safe and effective emotional support. Totally agree we should better understand the risks and trade offs but I wouldn’t be super surprised if they are statistically no worse than us meat bags at this kind of stuff.

link

jsrozner 214 days ago

One difference is that if it were found that a psychiatrist or other professional had encouraged a patient's delusions or suicidal tendencies, then that person would likely lose his/her license and potentially face criminal penalties.

We know that humans should be able to consider the consequences of their actions and thus we hold them accountable (generally).

I'd be surprised if comparisons in the self-driving space have not been made: if waymo is better than the average driver, but still gets into an accident, who should be held accountable?

Though we also know that with big corporations, even clear negligence that leads to mass casualties does not often result in criminal penalties (e.g., Boeing).

link

amosjyng 214 days ago

> that person would likely lose his/her license and potentially face criminal penalties.

What if it were an unlicensed human encouraging someone else's delusions? I would think that's the real basis of comparison, because these LLMs are clearly not licensed therapists, and we can see from the real world how entire flat earth communities have formed from reinforcing each others' delusions.

Automation makes things easier and more efficient, and that includes making it easier and more efficient for people to dig their own rabbit holes. I don't see why LLM providers are to blame for someone's lack of epistemological hygiene.

Also, there are a lot of people who are lonely and for whatever reasons cannot get their social or emotional needs met in this modern age. Paying for an expensive psychiatrist isn't going to give them the friendship sensations they're craving. If AI is better at meeting human needs than actual humans are, why let perfect be the enemy of good?

> if waymo is better than the average driver, but still gets into an accident, who should be held accountable?

Waymo of course -- but Waymo also shouldn't be financially punished any harder than humans would be for equivalent honest mistakes. If Waymo truly is much safer than the average driver (which it certainly appears to be), then the amortized costs of its at-fault payouts should be way lower than the auto insurance costs of hiring out an equivalent number of human Uber drivers.

link

layer8 214 days ago

They also are not reliably able to provide safe and effective productivity support.

link

glitchc 214 days ago

Maybe there is a human typing on the other side, at least for some parts or all of certain responses. It's not been proven otherwise..

link

cowpig 214 days ago

I think they get way more "engagement" from people who use it as their friend, and the end goal of subverting social media and creating the most powerful (read: profitable) influence engine on earth makes a lot of sense if you are a soulless ghoul.

link

sofixa 214 days ago

It would be pretty dystopian when we get to the point where ChatGPT pushes (unannounced) advertisements to those people (the ones forming a parasocial relationship with it). Imagine someone complaining they're depressed and ChatGPT proposing doing XYZ activity which is actually a disguised ad.

Other than such scenarios, that "engagement" would be just useless and actually costing them more money than it makes

link

cowpig 214 days ago

Do you have reason to believe they are not doing this already?

link

water9 214 days ago

No, otherwise Sam Altman wouldn’t have had an outburst about revenue. They know that they have this amazing system, but they haven’t quite figured out how to monetize it yet.

link

Hammershaft 213 days ago

Yes, I've heard no reports of poorly fitting branded recommendations from AI models. The PR risk would be huge for labs, and the propensity to leak would be high given the selection effects that pull people to these roles.

link

ssl-3 213 days ago

I've not heard of it, either.

But I suspect that we're no more than one buyout away from that kind of thing.

The labs do appear to avoid paid advertising today. But actions today should not be taken to mean that the next owner(s) won't behave in a completely soulless manner in their effort to maximize profit at every possible expense.

On a long-enough timeline, it seems inevitable to me that advertising with LLM bots will become a real issue.

(I mean: I remember having an Internet experience that was basically devoid of advertising. It changed, and it will never change back.)

link

sofixa 214 days ago

Not really, but with the amounts of money they're bleeding it's bound to get worse if they are already doing it.

link

easygenes 213 days ago

I use the "Nerdy" tone along with the Custom Instructions below to good effect:

"Please do not try to be personal, cute, kitschy, or flattering. Don't use catchphrases. Stick to facts, logic, reasoning. Don't assume understanding of shorthand or acronyms. Assume I am an expert in topics unless I state otherwise."

link

sbuttgereit 214 days ago

This. When I go to an LLM, I'm not looking for a friend, I'm looking for a tool.

Keeping faux relationships out of the interaction never lets me slip into the mistaken attitude that I'm dealing with a colleague rather than a machine.

link

Y_Y 214 days ago

I don't know about you, but half my friends are tools.

link

nathan_compton 214 days ago

You can just tell the AI to not be warm and it will remember. My ChatGPT used the phrase "turn it up to eleven" and I told it never to speak in that manner ever again and it's been very robotic ever since.

link

pgsandstrom 214 days ago

I added the custom instruction "Please go straight to the point, be less chatty". Now it begins every answer with: "Straight to the point, no fluff:" or something similar. It seems to be perfectly unable to simply write out the answer without some form of small talk first.

link

joquarky 214 days ago

Aren't these still essentially completion models under the hood?

If so, my understanding for these preambles is that they need a seed to complete their answer.

link

danmaz74 213 days ago

But the seed is the user input.

link

IntrepidPig 213 days ago

Maybe until the model outputs some affirming preamble, it’s still somewhat probable that it might disagree with the user’s request? So the agreement fluff is kind of like it making the decision to heed the request. Especially if we consider the tokens as the medium by which the model “thinks”. Not to anthropomorphize the damn things too much.

Also I wonder if it could be a side effect of all the supposed alignment efforts that go into training. If you train in a bunch of negative reinforcement samples where the model says something like “sorry I can’t do that” maybe it pushes the model to say things like “sure I’ll do that” in positive cases too?

Disclaimer that I am just yapping

link

Auracle 213 days ago

I had a similar instruction and in voice mode I had it trying to make a story for a game that my daughter and I were playing where it would occasionally say “3,2,1 go!” or perhaps throw us off and say “3,2,1, snow!” or other rhymes.

Long story short it took me a while to figure out why I had to keep telling it to keep going and the story was so straightforward.

link

nathan_compton 214 days ago

This is very funny.

link

op00to 213 days ago

Since switching to robot mode I haven’t seen it say “no fluff”. Good god I hate it when it says no fluff.

link

andai 214 days ago

I system-prompted all my LLMs "Don't use cliches or stereotypical language." and they like me a lot less now.

link

water9 214 days ago

They really like to blow sunshine up your ass don’t they? I have to do the same type of stuff. It’s like I have to assure it that I’m a big boy and I can handle mature content like programming in C

link

moi2388 214 days ago

Same. If I tell it to choose A or B, I want it to output either “A” or “B”.

I don’t want an essay of 10 pages about how this is exactly the right question to ask

link

LeifCarrotson 214 days ago

10 pages about the question means that the subsequent answer is more likely to be correct. That's why they repeat themselves.

link

3836293648 213 days ago

But that goes in the chain of thought, not the response

link

binary132 214 days ago

citation needed

link

porridgeraisin 214 days ago

First of all, consider asking "why's that?" if you don't know what is a fairly basic fact, no need to go all reddit-pretentious "citation needed" as if we are deeply and knowledgeably discussing some niche detail and came across a sudden surprising fact.

Anyways, a nice way to understand it is that the LLM needs to "compute" the answer to the question A or B. Some questions need more compute to answer (think complexity theory). The only way an LLM can do "more compute" is by outputting more tokens. This is because each token takes a fixed amount of compute to generate - the network is static. So, if you encourage it to output more and more tokens, you're giving it the opportunity to solve harder problems. Apart from humans encouraging this via RLHF, it was also found (in the DeepSeekMath paper) that RL+GRPO on math problems automatically encourages this (increases sequence length).

From a marketing perspective, this is anthropomorphized as reasoning.

From a UX perspective, they can hide this behind thinking... ellipses. I think GPT-5 on ChatGPT does this.
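
A back-of-the-envelope sketch of the fixed-compute-per-token point, in Python. It assumes the common rule of thumb of roughly 2 FLOPs per parameter per generated token for a dense model's forward pass and ignores attention over the context. The 7-billion-parameter size and the token counts are arbitrary examples, not claims about any specific model.

```python
def generation_flops(n_params: int, n_tokens: int) -> int:
    """Approximate forward-pass FLOPs to generate n_tokens tokens.

    Assumes ~2 FLOPs per parameter per token (a rule of thumb for dense
    models), so cost per token is fixed and total compute grows linearly
    with the number of tokens emitted.
    """
    return 2 * n_params * n_tokens

N_PARAMS = 7_000_000_000  # hypothetical model size

one_token = generation_flops(N_PARAMS, 1)      # answering with just "A" or "B"
long_answer = generation_flops(N_PARAMS, 500)  # 500 tokens of working first

print(long_answer // one_token)  # 500
```

Under this simplification, the only knob for spending more compute on a question is the number of tokens generated, whether they are shown in the response or hidden as "thinking".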

link

Y_Y 214 days ago

A citation would be a link to an authoritative source. Just because some unknown person claims it's obvious that's not sufficient for some of us.

link

KalMann 214 days ago

Expecting every little fact to have an "authoritative source" is just annoying faux intellectualism. You can ask someone why they believe something and listen to their reasoning, decide for yourself if you find it convincing, without invoking such a pretentious phrase. There are conclusions you can think to and reach without an "official citation".

link

astrange 214 days ago

LLMs have essentially no capability for internal thought. They can't produce the right answer without doing that.

Of course, you can use thinking mode and then it'll just hide that part from you.

link

moi2388 213 days ago

No, even in thinking mode it will sycophant and write huge essays as output.

It can work without, I just have to prompt it five times increasingly aggressively and it’ll output the correct answer without the fluff just fine.

link

qwertytyyuu 213 days ago

They already do hide a lot from you when thinking, this person wants them to hide more instead of doing their 'thinking' 'out loud' in the response.

link

totetsu 213 days ago

Zachary Stein makes the case that conferring social statuses to Artificial Intelligences is an x-risk. https://cic.uts.edu.au/events/collective-intelligence-edu-20...

link

ph4rsikal 214 days ago

Your comment reminded me of this article because of the Star Trek comparison. Chatting is inefficient, isn't it?

[1] https://jdsemrau.substack.com/p/how-should-agentic-user-expe...

link

LaFolle 213 days ago

Exactly, and it doesn't help with agentic use cases that tend to solve a problem in one shot. For example, there is zero need for a model to be conversational when it is trying to triage a support question into preset categories.
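
A minimal sketch of why chattiness hurts in that triage case, in Python. The category names and the `parse_category` helper are hypothetical, made up for illustration; this is not any vendor's API. The idea is that downstream code can only act on a reply that is exactly one of the preset labels.

```python
from typing import Optional

CATEGORIES = ("billing", "bug", "account", "other")  # hypothetical preset labels

def parse_category(raw: str, categories=CATEGORIES) -> Optional[str]:
    """Accept the model's reply only if it is exactly one preset label.

    Surrounding whitespace and letter case are ignored; anything else,
    including a label wrapped in conversational text, is rejected.
    """
    label = raw.strip().casefold()
    return label if label in categories else None

terse = parse_category(" Billing\n")
chatty = parse_category("Sure! This looks like a billing question.")
print(terse, chatty)
```

Here the terse reply parses to a usable label while the conversational one is rejected, so a pipeline built this way would need a retry or looser parsing for every chatty answer.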

link

Tiberium 214 days ago

Are you aware that you can achieve that by going into Personalization in Settings and choosing one of the presets or just describing how you want the model to answer in natural language?

link

gcau 214 days ago

Yea, I don't want something trying to emulate emotions. I don't want it to even speak a single word, I just want code, unless I explicitly ask it to speak on something, and even in that scenario I want raw bullet points, with concise useful information and no fluff. I don't want to have a conversation with it.

However, being more humanlike, even if it results in an inferior tool, is the top priority because appearances matter more than actual function.

link

cmrdporcupine 214 days ago

To be fair, of all the LLM coding agents, I find Codex+GPT5 to be closest to this.

It doesn't really offer any commentary or personality. It's concise and doesn't engage in praise or "You're absolutely right". It's a little pedantic though.

I keep meaning to re-point Codex at DeepSeek V3.2 to see if it's a product of the prompting only, or a product of the model as well.

link

Tiberium 214 days ago

It is absolutely a product of the model, GPT-5 behaves like this over API even without any extra prompts.

link

cmrdporcupine 214 days ago

I prefer its personality (or lack of it) over Sonnet. And it tends to produce less... sloppy code. But it's far slower, and Codex + it suffers from context degradation very badly. If you run a session too long, even with compaction, it starts to really lose the plot.

link

SergeAx 208 days ago

Just put it in your system prompt?

link

egorfine 214 days ago

Enable "Robot" personality. I hate all the other modes.

link

kranke155 213 days ago

Gemini is very direct.

link

jasonsb 214 days ago

Engagement Metrics 2.0 are here. Getting your answer in one shot is not cool anymore. You need to waste as much time as possible on OpenAI's platform. Enshittification is now more important than AGI.

link

spaceman_2020 214 days ago

This is the AI equivalent of every recipe blog filled with 1000 words of backstory before the actual recipe just to please the SEO Gods

The new boss, same as the old boss

link

glouwbug 214 days ago

Things really felt great 2023-2024

link

mmcnl 214 days ago

Exactly. The GPT 5 answer is _way_ better than the GPT 5.1 answer in the example. Less AI slop, more information density please.

link

vunderba 214 days ago

And utterly unsurprising given their announcement last month that they were looking at exploring erotica as a possible revenue stream.

[1] https://www.bbc.com/news/articles/cpd2qv58yl5o

link

subscribed 213 days ago

Everyone else provides these services anyway, and many places offer using ChatGPT or Claude models despite current limits (because they work with "jailbreaking" prompts), so they likely decided to stop pretending and just let that stuff in.

What's the problem tbh?

link