by samrus 3 hours ago
Yeah, but the investments aren't aimed at churning out SaaS apps. The aim is to automate large swathes of intellectual labour, of which only SWE has been cracked so far. There is a question mark over whether the others will crack. If they don't, these investments will collapse from speculation down to reality. That possibility is what is being discussed here.

As to whether that will happen, I think the risk is real, because Claude Code isn't made by the generalized capabilities of the tech but by good old non-generalizable heuristics and rule-based engines. I don't think that will scale to other fields at the factor these investments assume. It's the bitter lesson again: it scales with deliberate and specific design, not data, so it won't scale.

We learnt this with IBM Watson. Deep Blue achieved chess supremacy, but the last mile wasn't data driven, it was heuristic driven, and so Watson, its successor, couldn't scale/generalize.

My prediction is that this speculation on LLMs with harnesses will collapse, since they won't scale. We'll have another winter where the researchers will be left alone long enough to come up with the next breakthrough (probably game-theory-based, data-driven agency), which might then create what this hype cycle is speculating on.

3 comments

There’s an argument to be made that SWE hasn’t been cracked either. The latest models are great at coding medium-sized applications, but figuring out the requirements and consolidating domain knowledge is something still lacking.
Yeah. I think LLMs won't be able to do that on their own or with heuristics. We'll need to bake game theory into the base model for that.
We will find out how much work is given to people just so that there's a person/company associated with a technical decision. I personally think this might be quite high.
We already know this; the term "bullshit jobs" exists for this reason.
Exactly. I build automation tools for my company which have improved productivity quite a lot and put precisely zero people out of a job. Partly we find other things for them to spend time on, and partly it just turns out that we like to have humans doing jobs.

It is cliche at this point that HN is the place you go to hear software developers reduce all of the world's problems into simple algorithmic arguments which for some reason never actually solve anything. Not shocked that we are similarly incapable of understanding that algorithmically replacing a software developer isn't easy just because we think we know what the job is.

You mentioned a very good point about scalability. We're seeing a lot of productivity gains, but only from SWEs, who are but a very small segment of the global economy. All other economic use cases require thorough last-mile development and iteration that is not too different from what current automation tools require.
A friend who is a psychologist was telling me he thinks in another year or two insurance companies will insist people see an AI therapist first before being willing to pay for a real person.
Tracks. This is that same speculation that AI will be good enough. Wonder what a crash in that use case will look like? Increased suicide rates? More instances of psychosis? It might not even be directly measurable or easily traceable to AI therapists. Would suck.
LOL, the same AI that has landed companies in court defending themselves against wrongful death lawsuits for helping someone convince themselves suicide is the right answer, and even encouraging them? That AI? I am unclear that any insurance company is going to want a piece of that action anytime soon.

What you've just told me is that psychologists, just like SWEs, are prone to thinking they know how business works but in fact know fuck all.

All those automation tools will eventually be one-shotted initially and then monitored by LLMs, though. There probably won't be a "last mile" per se; just constant tweaking and optimization throughout, within a feedback loop.
What you're describing is iterating in the last mile. You're assuming that the AI will iterate in the last mile with the same efficacy that it iterates before that. I think that will fail. That's the bitter lesson: ad hoc solutions to the last mile (rather than generalized solutions that scale with training data) asymptotically stall and so don't scale.

Meaning Claude Code won't be able to make a "Claude video editing" or "Claude accounting" with the current tech. Human experts will need to encode their knowledge into it for the last mile, and that won't scale the way these speculations expect.

We aren't seeing productivity gains in software either. What we are seeing is a lot of people who claim to be more productive, but in fact are building piles of tech debt that will fall over before long. But hey, they're building that tech debt really fast!
> but in fact are building piles of tech debt that will fall over before long

This is speculation as well. It's well founded, but speculation nonetheless. You're speculating that things will stay the way they have been till now.

I do see your point, but what makes me consider the other side is that I've been building an app that reaches ~10k LOC, purely with Opus, no code review at all, and it hasn't hit any tech debt issues that I haven't easily been able to address. Setting up good context management meant that Claude could just figure things out itself.

And for reference, this is an app that manages an Ethernet camera, runs vehicle detection on the stream, and surfaces the detections on an iPad for operators to inspect and annotate for cellphone usage, so not trivial. It needed good architecting and design from my end, but it was honestly scarily easy. So idk what the threshold for a tech debt crash is, but it wasn't there.