Hacker News
by dreis_sw 3 days ago
Seems like this model delivers on what has already been scaling quite nicely, which is the length and complexity of the requested tasks, but isn't such a big improvement on what hasn't been scaling so far - common sense, discernment, good judgement.

> common sense, discernment, good judgement

I feel like the whole point of all the experimentation with AI right now is determining whether any of these things actually matter to the end result, over various timeframes.

It's well known that companies with an abundance of raw technical skill but poor judgement tend to fail. On the technical side, technical debt accumulates; on the business side, the wrong choices get made. I think it's valid to generalize this to AI.
They matter.
Because?
Sometimes you want the machine to be an advisor and sanity-check your suggestions.

Sometimes you just want it to do the boilerplate you have in mind without trying to reason everything from first principles.

I told you to check fields "foo" and "bar" for values "baz" and "quux". You don't need to go diving through the entire source tree to discover where and how this is set.

I guess maybe it's helpful for the vibe-coding audience: if it tries to over-process everything, there's a better chance it will work in a single shot. But I'm taking the Crazy Taxi approach: you get points if you drop me off within 20 metres of where I wanted to go, and I can correct it if I specified the wrong response message in the original approach.

Because poor judgement leads to poor decisions.
Poor decisions are about context, direction and volition.

All things LLMs will never have. Sure, AI might one day, but these systems are really good at solving complex problems with fantastical solutions while every force is just one hallucination away.

simonw should spend more time trying to figure out the sources of the information it used; that would be a wild ride. Use the AI for all I care, we're all standing on the shoulders of giants, but sourcing the giant as some mystical thing.