Hacker News
by datsci_est_2015 42 days ago
The word “incorporate” is doing some very heavy lifting in your assertion. These LLMs already have access to the whole corpus of architectural knowledge and software best practices, and yet they’re unable to reliably implement those best practices. Why not? Why do they often make completely unintuitive decisions, even when repeatedly prompted to ask clarifying questions?
2 comments

To be clear, by that and "cultural corpus" I meant their skill with natural languages. It is well known, for instance, that early LLMs were curiously better at composing sentences in English than at doing basic math.

Regarding such formal reasoning, we have already seen marked improvement in the last year or two alone. The question is how this weighs on your prediction regarding their capabilities in the next two, five, ten, etc. years.

What are the properties of LLMs that have convinced you that there remains emergent complexity (e.g. the “ability” to formally reason) that we have not yet seen?
There may be gains to be had in such emergence but that is not where I see the gains in the next five years. Those gains will be made by connecting LLMs more robustly with formal reasoning, which computers are already very good at. Continued iteration on connecting these right/left brain faculties could then lead to further emergence down the line.

The present notions of harnesses, structured output, or looping the LLM into some external state or sandbox, be it debugger output or embedding into a runtime, already show promising early results along these lines. I see no reason to believe these gains will not continue over the next five years.
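
As a rough illustration of the harness idea above, here is a minimal sketch of a generate, validate, retry loop. Everything in it is an assumption made for illustration: `fake_model` is a hypothetical stand-in for an LLM call (no real LLM API is used), and the "formal" side is just a JSON check for an integer `answer` field.

```python
import json
from typing import Callable


def validate(raw: str) -> tuple[bool, str]:
    """Formal check: output must be a JSON object with an integer 'answer' field."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict) or not isinstance(data.get("answer"), int):
        return False, "missing integer field 'answer'"
    return True, ""


def run_harness(model: Callable[[str], str], prompt: str, max_attempts: int = 3) -> dict:
    """Call the model, check its output, and feed any error back into the next attempt."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        raw = model(prompt + feedback)
        ok, error = validate(raw)
        if ok:
            return {"attempts": attempt, "result": json.loads(raw)}
        # The validator's message becomes part of the next prompt.
        feedback = f"\nPrevious output was rejected: {error}. Try again."
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: only emits valid JSON after seeing feedback."""
    if "rejected" in prompt:
        return '{"answer": 4}'
    return "the answer is four"


if __name__ == "__main__":
    print(run_harness(fake_model, "What is 2 + 2? Reply as JSON."))
```

The point of the sketch is only the shape of the loop: the deterministic checker decides what counts as acceptable, and the model is retried with the checker's error message until it passes or the attempt budget runs out.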

If you have some theories to the contrary, I am all ears.

Extraordinary claims require extraordinary evidence, not the opposite. There’s no current evidence to suggest limitless progress, or even superlinear progress with regard to compute and energy. My guess would be sublinear or even logarithmic progress vs. linear growth in compute and energy, as that’s how most physical systems behave.
No one said unlimited progress. Let's not revert to straw man claims.

If you think the potential of LLMs is overblown, feel free to short the market. I don't pretend to know the future. But if I may, I don't think you are framing the debate in the correct terms. Evidence is an important facet of human affairs. So is risk. Best of luck with your predictions.

Markets can remain irrational longer than anyone can stay solvent (especially when wealth is as concentrated as it is currently: one doofus can keep an entire industry afloat).

“Unlimited progress” is not a statement on the rate of progress, it’s a statement on the limits of progress. It’s a much weaker claim than you’re framing it as. Your claim very much is that we have not yet reached the limits of LLMs’ potential. My claim, conversely, is that we’re already reaching diminishing returns, which are being masked by a massive influx of compute and energy. My short: LLMs are not the path to AGI.

I really don't like this framing - it's hard to short a market at the best of times, let alone when governments have a vested interest in tech being too big to fail to compete in the global economic arms race - see Intel's stock in the past few months.

I agree with you both - undoubtedly there are still massive gains to be made with the frontier models we have today through tooling and iteration. Yet I do not believe there's sufficient evidence to claim we are rolling towards AG/SI on an exponential curve without some additional breakthroughs, given the jagged edges and the data used to train models being fundamentally linear.

> Why do they often make completely unintuitive decisions

Most likely because you haven't constrained their behavior in your prompt. You're making the assumption that they "understand" that using best practices is what you want. You have to tell them that, and tell them which practices they should use.

They already consistently fail to follow very simple and concrete instructions like “Please do not ever mock this object, always properly construct it in your tests”, so I’m not sure how they’re going to adhere to more vague and conceptual architectural paradigms. This is a problem with generative AI in general - image generation has similar limitations.
Senior developers know what behavior to constrain.

If incorrect LLM output is a prompt issue then demand for experienced developers will remain, and demand may actually increase as time passes.