by logicprog 110 days ago
I think it's pretty clear that he hasn't kept up with the state of the art.

He has no idea what coding agents are capable of or how useful they are; he doesn't pay attention to any of the contributions to math or science that these models are making; he continually insists that because agents aren't ready to face customers in uncontrolled environments, they're completely useless even for employees and workers; he just last year posted an article complaining that LLMs don't use web search to find information (he asked for information about a friend), when almost all of them do now, even in their default interfaces; he still thinks hallucinations are a problem with any weight in things like mathematics and programming, where it's very easy to verify the kinds of things hallucinations would cause a problem with; I think he still adheres to the stochastic parrot mindset even though that's not even the most relevant part of their training anymore.

Most importantly, although he seems to have made a single Substack post making this argument, it doesn't seem to have really percolated through the rest of his thinking: the cutting edge of LLMs right now, agents, is actually exactly the kind of neurosymbolic system he wants, where neural networks provide an interface with the outside world and a creativity and problem-solving engine that supplies the fuzzy pattern matching and adaptability that is needed, while symbolic code-based systems ensure that guardrails are met, requirements are met, accurate information is provided, and so on. I think his objection might be that the problem-solving and reasoning engine at the core is still an LLM. But the thing is that you need the kind of pattern matching, flexibility, and adaptability that you get from an LLM to drive things if the end result is to be anything other than an expert system with a slightly better natural language interface pasted on. And I think it's pretty clear at this point that expert systems are dead. They haven't done anything remotely as interesting or useful as what we're seeing LLMs do.

I think, like another commenter says, that his whole shtick is pointing out obviously true basic features of LLMs, like that they hallucinate or don't perfectly adhere to prompt guardrails, or that there's too much hype in the industry right now and a lot of the companies suck in a vaguely standard big tech Silicon Valley way, and then extrapolating to some broader point, which is that everyone should have listened to him and done what he said when he wrote that book back in the 90s (iirc).

2 comments

I think his claim basically boils down to "if you're expecting AI, LLMs don't cut it". And I think he's basically right on that count. There's a lot of tooling and harnessing being put in place to course-correct them on the job, and from the other angle standards are simply being lowered to accommodate them. So they can be made to be useful, but they're still not what you would want from an actual AI. Marcus wants to augment them with symbolic AI. I don't know how feasible that is, but he's not fundamentally against AI, he's just against the notion that LLMs are AI. Which, given how they've been marketed and how the public is encouraged to think about them, is a worthwhile point to make.
> "if you're expecting AI, LLMs don't cut it". And I think he's basically right on that count.

This is one of those comments whose truth value depends entirely on a constantly shifting definition of “AI”.

The ability of modern models to functionally understand, answer questions, and make recommendations about software codebases is superhuman at this point, relative to most human software developers. What is that, if not artificial intelligence?

Perhaps you’re thinking of something more like AGI, but even there the terminology is loaded and ambiguous. The models are general enough to answer questions well on a vast range of subjects, and they exhibit understanding (again, functionally speaking this is true - whether someone wants to call them stochastic parrots is beside the point.) The appellation of “intelligence” applies just as well as in the coding case, it’s artificial, and it’s general.

> a worthwhile point to make.

I disagree. Without clear, justified definitions, it’s an incoherent, poorly specified point that seems to be driven by a desire to maintain a specific conclusion regardless of the evidence.

I used to be a Gary Marcus fan, but I guess what confuses me is...

I'm not really sure at that point what 'actual' AI means?

It seems like the definition of actual AI is something like perfect AI — it has to be fully observable, interpretable, reason perfectly, have perfect factual recall, continual learning, infinite context windows, perfect instruction following, and so on. I feel like at that point, maybe nothing could ever be 'actual' AI?

We typically use AI to mean some kind of algorithm or program that lets computers do intellectual work that was previously considered to be the exclusive domain of humans, especially if it involves problem solving or pattern matching or reasoning. Just look at Donald Knuth's recent posts about what Claude was able to do — seems like AI to me?

Yeah, it is imperfect AI, but it's still AI. And it's not clear to me that the imperfections that LLMs have mean that they can't be extremely useful and revolutionary as a form of AI. Yes, they make weird mistakes a lot, and they don't think at all like humans do. But I am of the opinion that there are a lot of forms of intelligence, and human intelligence is just one of them. And every kind of intelligence comes with its own different gamut of continual errors that it will tend to make, blind spots and biases. The fact that LLMs have issues that are different from the ones human intelligence has, and also different from the ones computers have, doesn't disqualify them from being intelligent to me.

I also find the framing of agentic harnesses as being bolted onto LLMs in order to "make them useful", with the agentic harness plus the LLM not counting as an AI system itself, very odd. It's pretty clear to me at least that "the AI", if you want to talk about it, is the neurosymbolic cybernetic feedback system that combines the harness and the LLM.

The LLM is only the sort of fuzzy pattern matching logic and creativity core; the harness provides verification feedback loops, the ability to interact with and explore the outside world, the ability to bring in programming language interpreters and so on in order to do more rigid symbolic logic, observability, systems for storing and recalling memory for continual learning, and so on, and I think a lot of these, especially feedback loops, resolve a lot of issues that LLMs seem to inherently face, such as hallucinations.
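To make the feedback-loop point concrete, here's a minimal sketch in Python of what I mean. Everything in it is made up for illustration: `fake_model` is a stand-in for an LLM call (it deliberately gets the answer wrong the first time), `verify` is a toy symbolic checker that just runs test cases, and `run_with_feedback` is a hypothetical name, not any real harness's API. Real harnesses are far more involved; this only shows the shape of the loop.

```python
from typing import Callable, Optional, Tuple


def fake_model(task: str, feedback: Optional[str]) -> str:
    """Hypothetical stand-in for an LLM: returns source code for `add`.

    It "hallucinates" a wrong implementation on the first try and only
    produces the right one after it has been shown verifier feedback.
    """
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def verify(source: str) -> Optional[str]:
    """Symbolic side: run the candidate against fixed test cases.

    Returns None if everything passes, otherwise an error message that
    can be fed back to the model.
    """
    namespace: dict = {}
    try:
        exec(source, namespace)  # toy example only; never exec untrusted code
        for (a, b), expected in [((1, 2), 3), ((-1, 1), 0)]:
            got = namespace["add"](a, b)
            if got != expected:
                return f"add({a}, {b}) returned {got}, expected {expected}"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def run_with_feedback(
    propose: Callable[[str, Optional[str]], str],
    check: Callable[[str], Optional[str]],
    task: str,
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Propose, check, and retry with the checker's error as feedback."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        candidate = propose(task, feedback)
        feedback = check(candidate)
        if feedback is None:
            return candidate, round_no  # verified output
    return None, max_rounds  # give up rather than return unverified output


result, rounds = run_with_feedback(fake_model, verify, "write add(a, b)")
```

The design choice that matters is the last line of `run_with_feedback`: if nothing passes the checker, the loop returns nothing instead of an unverified guess, which is how a wrong first answer gets caught before it reaches the user.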

Moreover, LLMs are now substantially trained with writing code, using tools, interacting with the world, and existing in harnesses in mind. At this point, I would have to guess that more than half of their training is actually devoted to rewarding them for correctly using all of these symbolic tools and solving problems in a simulated world, rather than just predicting the next token.

I also think that LLMs, as a sort of core engine of an agentic harness, are allowing computers to do things we'd never really dreamed they could do before, that symbolic systems by themselves never really achieved, and as I said before, if you're looking for neurosymbolic AI — as Marcus says he is — then this is basically how it's going to have to look unless you want to fall down the expert system rabbit hole again.

> he doesn't pay attention to any of the contributions to math or science that these models are making

Ok but why report PR pieces as evidence for LLMs being useful?

These are tools that can possibly provide output that is eventually correct. It is the human behind the wheel doing the actual work.

Give the tool to a lesser expert and you will get more garbage with fewer lucky shots.

For the elite, it is a balancing act: the tool pays off whenever the cost of making the LLM do the work is less than the cost of doing it yourself. If that holds more than 90% of the time, the tool is useful.

No, that's new. This was the previous "accelerating science" piece.

https://openai.com/index/accelerating-science-gpt-5/?hl=en-G...