HN Mirror



	by disillusioned 62 days ago
	It's also routinely failing the car wash question across all models now, which wasn't the case a month ago. :-/ I'm seeing reports that the effort selector isn't necessarily working as intended and that the model is regressing in other ways: over-emphasizing how "difficult" a problem is to solve and choosing to avoid it because of the "time" it would take (quoted in human effort), or suggesting the "easier" path forward even if it's a hack or a kludge-filled solution.

6 comments

tetraodonpuffer 61 days ago

It does feel like something in the hidden system prompt makes it try less hard. Many times in the past several weeks I have found divergences from what was in the plan, and looking back at the jsonl it's always some variant of "doing it this way would be too complicated, let me take this hardcoded way out". If asked to review the change, it will find it, and it will also say "yeah, I agree the prompt said not to do this, but I did it anyway, not sure why".

As others have said, Anthropic is between a rock and a hard place: you can't scale compute that quickly, and the influx of new accounts has definitely made things tough for them. I think all the "how is Claude this session 1/2/3/4" questions that keep coming up must be part of some A/B test on just how far to quantize / lower thinking while still maintaining user satisfaction.


andai 62 days ago

> over-emphasizing how "difficult" a problem is to solve and choosing to avoid it because of the "time" it would take

I heard a while back Claude refused to attempt a task for days, saying it would take weeks of work. Eventually the user convinced it to try, and it one-shotted it in 30 seconds.


apetresc 62 days ago

For days? Someone spent days trying to convince Claude to do something?


layer8 62 days ago

If you asked yesterday, and asked again today, then you asked for days. OP might be trying to express that it wasn’t just a temporary fluke.


empath75 61 days ago

I have noticed refusals as context windows grow.


_blk 62 days ago

Awesome, I didn't know about the car wash question.

Totally true, also tokens seem to burn through much faster. More parallelism could explain some of it, but where I could work on 3-5 projects at once on the Max plan a month ago, I can't even get one to completion now on the same Opus model before the 5h session locks me out.


themafia 61 days ago

Step 1: Sell at a loss.

Step 2: Panic.

Step 3: Destroy product.


colechristensen 61 days ago

>“idgaf about risk you coward, waste some time just do it and stop bitching”

The above was a successful prompt to get Claude to stop whining about effort, difficulty, and time.

Unfortunately, well-placed abusive language is an effective LLM motivator.


itemize123 61 days ago

Are you sure other forms of language that express urgency don't work as well or better?


colechristensen 60 days ago

They're just words. It's not a person. It doesn't "understand" anything. (I sound like the bad guy in a robots-have-feelings movie)

I've also tried giving LLMs religion, with much more limited success (haven't figured out the right way yet).

I'm manipulating a language model, not a person. "fuck you" translates into a vector in a really big space, and it has different results than being polite about it.

In that prompt I'm reinforcing a directive in five different ways:

- idgaf about risk

- you coward

- waste some time

- just do it

- stop bitching

The instructions in this cluster are all related but point in slightly different directions. They are unambiguously strong, attention-grabbing, and direct, and the model does not argue or get confused about intent.

In this particular instance this was the fifth time I had given a particular instruction only to have it subverted by the model that had decided "that's too hard I'm going to do something else instead" in four separate ways.

Abusive cursing did indeed work better than any other form of urgency or insistence.


theshrike79 61 days ago

Am I the only one who couldn't care less if a model can answer a weird gotcha riddle or not?

I never use it to answer questions like that; what I care about is consistent tool calling and following the prompt.
