| HN Mirror


by Terretta 9 days ago

You think someone is, or even should be, special-casing things like estimates? What else deserves that level of intervention so they look less dumb?

Logistics for getting to the car wash next door?

In the meantime, alas, no: we can see from the actual prompts, sent directly or through sub-agents, and from the actual replies, that estimates remain LLM-generated.

Though, this discussion here could change that, because indeed there is a lot of special casing and context stuffing going on, one of the oldest being today's date for example.

• • •

I did read the Claude Code leak, and use pi, etc. So I disagree with your premise rather strongly. Today's "systems" remain, roughly, piles of markdown and context engineering wrapped in UI affordances, and behave very similarly today to how they did in 2024 for those already engineering context and delegating.

2 comments

ghshephard 9 days ago

I do a lot of code bisecting with Claude Code - and it spends hours running experiments - looking at experiment results, making guesses as to what to try next for an experiment - until it eventually comes around to a working code pattern. I mean - maybe this is as much a reflection on me as anything else - but its pattern of logic isn't that much different from what I would do. It knows, in general, what tools and APIs it can call - it tries something - observes the result, and then comes back and tries different experiments based on success/failure - mostly efficiently bisecting to a solution.
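For readers unfamiliar with the term, the try/observe/narrow loop described above can be sketched roughly as follows. This is only an illustration, not what Claude Code actually runs: the `first_bad` function, the `is_good` callback, and the revision names are hypothetical, and the sketch assumes the candidates are ordered, the first one is known good, the last one is known bad, and nothing goes back to good once it has gone bad.

```python
from typing import Callable, Sequence


def first_bad(candidates: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first candidate that fails, using a binary search.

    Assumes candidates[0] is good, candidates[-1] is bad, and every
    candidate after the first bad one is also bad.
    """
    lo, hi = 0, len(candidates) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # "Run the experiment" on the midpoint and observe the result.
        if is_good(candidates[mid]):
            lo = mid  # failure was introduced after mid
        else:
            hi = mid  # failure was introduced at or before mid
    return candidates[hi]


# Hypothetical usage: eight revisions, where r5 and later are broken.
revisions = [f"r{i}" for i in range(8)]
culprit = first_bad(revisions, lambda rev: int(rev[1:]) < 5)
```

Each experiment halves the remaining range, which is why the loop tends to converge in a handful of runs even when every run is slow.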

I'm still lower down on the capability scale - as I'm still manually directing agents to do these wiggins loops - obviously the next step up is to direct the code-loops which control the agents. I just haven't got my tooling nailed in place to the point where I find that's more productive.

I actually might agree with you that this is mostly just "next token prediction" - if I can concede that's really all I do as well.


Terretta 9 days ago

> I actually might agree with you that this is mostly just "next token prediction" - if I can concede that's really all I do as well.

Yep. Pretty sure I've got an LLM inside too.

The other replies complaining that my thinking is so 2023 -- on the contrary, what's evolved is my own apprehension of how LLM-like most "responses" from humans prove as well.

To be sure, there are other mechanisms at play as well, significant differentiation in our... Volume of training material? Quantizations/compression? Model architecture? Just-ahead-of-time forward branching with back propagation? Double loop adaptive learning? You know, harnessing the LLM. :-) Dare we call it executive function?

LLM mode becomes particularly apparent when conversing with Alzheimer's patients in the stage where short term memories do not form but they retain access to long term memory up to, say, 5 years ago or so. Fifty years of who they are, and one can trigger nearly identical responses with nearly identical prompts.

But that same person may be able to debate 1950s politics while being unable to complete making a sandwich.

If they didn't know of new shortcuts for a task, they would almost certainly not "estimate" but "intuit", or "instinctively" respond (apply heuristics), largely based on their "priors", aka training material.

If you sit with them and chat a while, you'll even get the kind of looping you get from Qwen trying to think when context is too full.

And if we believe this at all, then ... we should stop scrolling TikTok. Time to read a book. Have an experience. Fine tune. :-)


8note 9 days ago

Rather than special-casing, build real data from chat logs on how long things took, in both calendar time and chat time.
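A minimal sketch of what that measurement could look like, assuming each chat log reduces to a list of message timestamps for one task. The `task_durations` function and the 30-minute idle threshold are hypothetical choices for illustration, not anything an existing tool is known to do: "calendar time" is taken as last message minus first, and "chat time" as the sum of the gaps between consecutive messages that are short enough to count as active work.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def task_durations(
    timestamps: Iterable[datetime],
    idle_gap: timedelta = timedelta(minutes=30),  # assumed cutoff for "away"
) -> Tuple[timedelta, timedelta]:
    """Return (calendar_time, chat_time) for one task's message timestamps."""
    ts = sorted(timestamps)
    if not ts:
        return timedelta(), timedelta()
    calendar_time = ts[-1] - ts[0]
    # Count only gaps short enough to be considered active conversation.
    chat_time = sum(
        (b - a for a, b in zip(ts, ts[1:]) if b - a <= idle_gap),
        timedelta(),
    )
    return calendar_time, chat_time


# Hypothetical log: three messages one morning, two more the next morning.
log = [
    datetime(2025, 1, 1, 9, 0),
    datetime(2025, 1, 1, 9, 10),
    datetime(2025, 1, 1, 9, 20),
    datetime(2025, 1, 2, 9, 0),
    datetime(2025, 1, 2, 9, 5),
]
calendar_time, chat_time = task_durations(log)
```

In this made-up log the task spans just over a day on the calendar but only 25 minutes of active chat, which is the kind of gap such data would need to capture.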
