by throwaway132448 100 days ago
I found the article confusing. Its premise seems to be that alternatives to deep learning “work” and faded out only because of other factors, yet it keeps referencing scenarios in which they demonstrably failed to “work”, such as:

> In 2012, Alex Krizhevsky submitted a deep convolutional neural network to the ImageNet Large Scale Visual Recognition Challenge. It won by 9.8 percentage points over the nearest competitor.

Maybe there’s another definition of “works” that’s implicit and I’m not getting, but I’m struggling to picture a definition relevant to the history-of-deep-learning narrative they are trying to explain.

3 comments

It seems to be an indirect attempt to promote their GitHub project. They had Claude make them an “agent” using Bayesian modeling and Thompson sampling and now they are convinced they have heralded a new era of AI.
It reads to me like Claude wrote the article too.
I think the worst thing about the golden age of symbolic AI was that there was never a systematic approach to reasoning about uncertainty.

The MYCIN system [1] was rather good at medical diagnostics and, like other systems of the time, had an ad hoc procedure for dealing with uncertainty, which is essential in medical diagnosis.

The problem is that it is not enough to say "predicate A has an 80% chance of being true"; rather, if you have predicates A and B you have to consider the probability of all four of (A B, (not A) B, A (not B), (not A) (not B)), and if there are N predicates you have to consider joint probabilities over 2^N possible situations, and that's a lot.
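To make the 2^N point concrete, here is a minimal Python sketch. The function names are made up for illustration, and the "independent" and "correlated" tables are toy numbers, not anything taken from MYCIN. It shows two joint distributions that both give A and B an 80% marginal yet disagree on P(A and B), which is why the marginals alone are not enough.

```python
from itertools import product

def joint_table_size(n):
    """Number of truth assignments over n boolean predicates (2^n)."""
    return 2 ** n

def joint_from_independence(marginals):
    """Build the full joint table ASSUMING the predicates are independent.

    Returns a dict mapping each tuple of truth values to its probability.
    Independence is an assumption of this sketch, not a general fact.
    """
    joint = {}
    for assignment in product([True, False], repeat=len(marginals)):
        p = 1.0
        for value, m in zip(assignment, marginals):
            p *= m if value else (1.0 - m)
        joint[assignment] = p
    return joint

def marginal(joint, index):
    """P(predicate at position `index` is true), summed out of the joint."""
    return sum(p for assignment, p in joint.items() if assignment[index])

# Same marginals, P(A) = P(B) = 0.8, but different joints.
independent = joint_from_independence([0.8, 0.8])
# Toy case where B is true exactly when A is true.
correlated = {
    (True, True): 0.8,
    (True, False): 0.0,
    (False, True): 0.0,
    (False, False): 0.2,
}

if __name__ == "__main__":
    print("P(A and B), independent:", independent[(True, True)])
    print("P(A and B), correlated: ", correlated[(True, True)])
    for n in (2, 10, 20, 40):
        print(n, "predicates ->", joint_table_size(n), "joint entries")
```

With 40 predicates the table already has over a trillion entries, so a general-purpose reasoner cannot simply store it.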

For any particular situation the values are correlated and you don't really need to consider all those contingencies but a general-purpose reasoning system with logic has to be able to handle the worst case. It seems that deep learning systems take shortcuts that work much of the time but may well hit the wall on how accurate they can be because of that.

[1] https://en.wikipedia.org/wiki/Mycin

Symbolic AI à la Mycin and other expert systems didn't do anything that a modern database query engine can't do with far greater performance. The bottleneck is coming up with the set of rules that the system is to follow.
Early production rules engines really sucked; a lot of the time they didn't have any kind of indexes and did full scans constantly. Good RETE engines with indexes didn't go mainstream until the 1980s, but by then the industry was already losing interest. In a lot of ways

https://en.wikipedia.org/wiki/Drools

is pretty good, as is the Jena rules engine, but none of these have ways of dealing with uncertainty, which are necessary if you're going to be working with language and have to decide which of 10,000 possible parses is right for a sentence. People used to talk as if 10,000 rules was a lot, but handling 2 million well-organized rules with Drools is no problem at all today.
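For anyone who hasn't felt the full-scan versus index difference mentioned above, here is a toy Python sketch. It is not how RETE, Drools, or Jena are implemented; `match_full_scan` and `IndexedFacts` are hypothetical names, and the facts are invented. It only shows a rule condition being matched by scanning every fact versus looking it up in a prebuilt hash index.

```python
from collections import defaultdict

def match_full_scan(facts, attribute, value):
    """Check every fact against the condition: O(number of facts) per match."""
    return [f for f in facts if attribute in f and f[attribute] == value]

class IndexedFacts:
    """Hash index on one attribute, built once, so each match is a dict lookup."""

    def __init__(self, facts, attribute):
        self.attribute = attribute
        self._index = defaultdict(list)
        for fact in facts:
            if attribute in fact:
                self._index[fact[attribute]].append(fact)

    def match(self, value):
        # .get avoids creating empty buckets for values that were never seen.
        return list(self._index.get(value, []))

if __name__ == "__main__":
    # Invented working memory: one fact per (patient, symptom) observation.
    facts = [
        {"patient": i, "symptom": "fever" if i % 100 == 0 else "cough"}
        for i in range(100_000)
    ]
    indexed = IndexedFacts(facts, "symptom")
    # Both strategies return the same facts; only the work per match differs.
    assert match_full_scan(facts, "symptom", "fever") == indexed.match("fever")
    print(len(indexed.match("fever")), "facts matched")
```

The scan repeats the whole pass for every rule condition on every cycle, while the index pays the cost once when facts are added, which is roughly the trade that indexed engines make.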

I think the problems of knowledge base construction are overstated and that a lack of tools is the problem. Or rather, the Cyc experience shows that rules are not enough: after Lenat died it got out that Cyc didn't just have a big pile of facts and rules and a general reasoning procedure, but also a large database of algorithms to solve specific problems. That is, in principle you can solve anything with an SMT solver, but if you actually try it you'll find you can code up a special-purpose algorithm to do common tasks before the SMT solver really gets warmed up.

Part of the production rules puzzle is that there never was a COBOL of business rules; rather, you got different systems that gave different answers to various tricky problems, like how to control the order of execution when it matters, how to represent negation, etc.

I think what they're saying is the methods used today are faster but have a lower ceiling, and that that's why they quickly took over but can only go so far.
That would be a hypothesis, not a fact.

I'm not closed to it. You can check my comment history for frequent references to next-generation AIs that aren't architected like LLMs. But they're going to have to produce an AI of some sort that is better than the current ones, not hypothesize that it may be possible. We've got about 50 years of hypotheses about how wonderful such techniques may be and, by the new standards of 2026, precious few demonstrations of it.

Quoting from the article:

"Within five years, deep learning had consumed machine learning almost entirely. Not because the methods it displaced had stopped working, but because the money, the talent, and the prestige had moved elsewhere."

That one jumped right out at me because there's a sleight of hand there. A more correct quote would be "Not because the methods it displaced had stopped working as well as they ever have, ..." Without that phrase, the implication that other techniques were doing just as well as our transformer-based LLMs is slipped in there, but it's manifestly false when brought up to conscious examination. Of course they haven't, unless they're in the form of some probably-beyond-top-secret AI in some government lab somewhere. Decades have been poured into them and they have not produced high-quality AIs.

Anyone who wants to produce that next-gen leap had probably better have some clear eyes about what the competition is.

> That would be a hypothesis, not a fact.

I agree.