Hacker News
by richardjennings 80 days ago
I was not aware the default effort had changed to medium until the quality of output nosedived. This cost me perhaps a day of work to rectify. I now ensure effort is set to max and have not had a terrible session since. Please may I have an "always try as hard as you can" mode?
5 comments

I feel like the maximum effort mode kind of wraps around and starts becoming "desperate" to the point of being lazy or acting like a monkey's paw, similar to how lower effort modes or a poor prompt behave.
I’m going in circles. Let me take a step back and try something completely different. The answer is a clean refactor.

Wait, the simplest fix is the same hack I tried 45 minutes ago but in a different context. Let me just try that.

Wait,

Wait, the linter re-ordered the file. Let me restore it to the previous state.

whisper: There is no linter.

Those test failures are pre-existing. We're all done!
Wait, I should check if they pre-exist on master.

    < 1,000 prompts for compound cd && git commands that can't be safely auto-accepted >
I think over-thinking is only solved by thinking more, not less. This is only viable once some intelligence threshold is reached, which I think Anthropic has borderline achieved.

  > I think over-thinking is only solved by thinking more, not less.
Despite "thinking" tokens being determined by the preceding tokens, they still are taken from some probability distribution, just a complex one. This means that at each token selection step there is a probability P_e of an error, of selecting a wrong token.

These errors compound exponentially: assuming the steps are independent, the probability of not selecting a wrong token for N steps is (1-P_e)^N, so the probability of at least one wrong token is 1-(1-P_e)^N.

The shorter "thinking" is, the less is the probability of it going astray.
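
To put rough numbers on that, here is a minimal sketch. It assumes every token selection is independent and has the same error probability P_e, which is a simplification rather than a claim about how any real model behaves. The value of P_e and the function names are made up for illustration.

```python
def p_no_error(p_e: float, n: int) -> float:
    """Probability that n independent token selections all avoid an error."""
    return (1.0 - p_e) ** n


def p_at_least_one_error(p_e: float, n: int) -> float:
    """Probability that at least one of n independent selections is wrong."""
    return 1.0 - p_no_error(p_e, n)


if __name__ == "__main__":
    p_e = 1e-4  # hypothetical per-token error probability, not a measured value
    for n in (100, 1_000, 10_000, 100_000):
        print(
            f"N={n:>7}: "
            f"P(no error)={p_no_error(p_e, n):.4f}  "
            f"P(>=1 error)={p_at_least_one_error(p_e, n):.4f}"
        )
```

With these assumed numbers, the chance of an error-free run falls to about 0.37 at N = 10,000, since (1-P_e)^N is close to exp(-N*P_e) when P_e is small. The sketch treats every error as unrecoverable, which is the strongest form of the argument.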

> The shorter "thinking" is, the less is the probability of it going astray

As long as the error introduced by more steps is less than the compounding error of sub-optimal token sampling, I would expect a better result.

I think your choice of "wrong" is extreme, suggesting such a token can catastrophically spoil the result. The modern reality is more that the model is able to recover.

This might be just my impression, but I feel like most people are using CC for fixing their React frontends, and they prefer the decreased latency and fewer tokens spent as opposed to performing well on extremely difficult problems?

That said, there's still an issue of regression to the mean. What the average person likes, as determined by metrics, is something nobody actually likes, because the average is a mathematical construct and might not describe any particular individual accurately.

That's /effort max!
You cannot control the effort setting sub-agents use and you also cannot use /effort max as a default (outside of using an alias).
export CLAUDE_CODE_EFFORT_LEVEL=max
Thank you!

Worth mentioning that setting this via effortLevel in .claude/settings.json does not work. https://github.com/anthropics/claude-code/issues/35904

Does that apply to subagents?
agree.
bad citizen