by thom 781 days ago
Up until about GPT 2, EURISKO was arguably the most interesting achievement in AI. Back in the day on the SL4 and singularitarian mailing lists, it was spoken of in reverent tones, and I’m sure I remember a much younger Eliezer Yudkowsky cautioning that Doug Lenat should have perceived a non-zero chance of hard takeoff at the moment of its birth. I suspect its achievements were slightly overblown and heavily guided by a human hand, but it’s still fascinating and definitely worthy of study. Genetic programming hasn’t yielded many interesting results since, and the unreasonable effectiveness of differentiable programming and backpropagation has sucked up much of the oxygen in the room. But not everything is differentiable, the combination of the two still seems worth investigating, and EURISKO goes to show the power of heuristic approaches to some problems.
6 comments

> the combination of the two still seems worth investigating

This.

Back in the late 1980's and early 90's the debate-du-jour was between deliberative and reactive control systems for robots. I got my Ph.D. for simply saying that the entire debate was based on the false premise that it had to be one or the other, that each approach had its strengths and weaknesses, and that if you just put the two together the whole would be greater than the sum of its parts. (Well, it was a little more than that. I had to actually show that it worked, which was more work than simply advancing the hypothesis, but in retrospect it seems kinda obvious, doesn't it?)

If I were still in the game today, combining generative-AI and old-school symbolic reasoning (which has also advanced a lot in 30 years) would be the first thing I would focus my attention (!) on.

People have advanced that argument a lot, and it's often worked for a short while; then the statistical models get better.

Chess was a game for humans.

It was very briefly a game for humans and machines (Kasparov had a go at getting "Advanced Chess" off the ground as a competitive sport), but soon enough having a human in the team made the program worse.

But at least the evaluation functions were designed by humans, right? That lasted a remarkably long time; first Stockfish became the strongest engine in the world by using distributed hyperparameter search to tweak its piece-square tables, then AlphaZero came along and used a policy network + MCTS instead of alpha-beta search, then (with an assist from the Shogi community) Stockfish struck back with a completely learned evaluation function via NNUE.
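For readers who have not met the alpha-beta search mentioned above, here is a minimal sketch on a hand-made game tree. The tree, the leaf scores, and the `visited` bookkeeping are illustrative assumptions, not anything taken from Stockfish or AlphaZero; real engines add move ordering, transposition tables, and an evaluation function at the leaves.

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax with alpha-beta pruning over a nested-list game tree.

    A leaf is a number (a score from the maximizing player's point of view);
    an inner node is a list of child nodes. `visited` records every leaf
    that actually gets evaluated, so the pruning is observable.
    """
    if isinstance(node, (int, float)):
        visited.append(node)
        return node

    best = -math.inf if maximizing else math.inf
    for child in node:
        value = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:
            # The opponent already has a better option elsewhere,
            # so the remaining children cannot change the result.
            break
    return best

# Hypothetical two-ply tree: the root maximizes, its children minimize.
tree = [[3, 5], [2, 9], [0, 1]]
visited = []
root_value = alphabeta(tree, -math.inf, math.inf, True, visited)
print(root_value, visited)
```

On this toy tree the search returns 3 and evaluates only the leaves 3, 5, 2 and 0; the leaves 9 and 1 are cut off because their subtrees are already known to be worse than the first option.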

So the last frontier of human expertise in chess is search heuristics, and that's going to fall too: https://arxiv.org/abs/2402.04494.

The common theme with all of this is that the things we used before are, fundamentally, hacks to get around _not having enough compute_, but they make the system worse once you no longer have to make those tradeoffs around inductive biases. Empirical evidence suggests that raw scaling has a long way to run yet.

I find myself not wanting to agree with you, but deep down I think you're right.

AI greatly reminds me of the Library of Babel thought experiment. If we can imagine a library with every book that can possibly be written in any language, would it contain all human knowledge lost in a sea of noise? Is there merit or value in creating a system that sifts through such a library to attune hidden truths, or are we dooming ourselves to finding meaning in nothingness?

In a certain sense, there's immense value to developing concepts and ideas through intuition and thought. In another sense, a rose by any other name smells just as sweet; if an AI creates a perpetual motion device before a human does, that's not nothing. I don't expect AI to speed past human capability like some people do, but it's certainly displaced a lot of traditional computer-vision and text generation applications.

> If we can imagine a library with every book that can possibly be written in any language, would it contain all human knowledge lost in a sea of noise? Is there merit or value in creating a system that sifts through such a library to attune hidden truths, or are we dooming ourselves to finding meaning in nothingness?

The work that system would have to do to find those "hidden truths" is equivalent to re-deriving those truths from scratch.

Similar argument: an image is just a number; if you take e.g. an 800x600 24bpp picture, that's a number 1 440 000 bytes long; you could hypothetically start from 0 and generate every 1 440 000-byte number, thus generating every possible 800x600 24-bit image. In that set, you'd find every historical event, photographed at every moment from every angle, and even photos of every fragment of every book from the Library of Babel. But good luck finding anything particular in there.
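The arithmetic behind this argument can be checked in a few lines. This is only a sketch: the helper names `image_to_index` and `index_to_image` and the tiny 2x1 "image" are made up for illustration, and the image is treated as a raw byte string with no file format.

```python
import math

WIDTH, HEIGHT, BITS_PER_PIXEL = 800, 600, 24

num_bytes = WIDTH * HEIGHT * BITS_PER_PIXEL // 8  # raw size of one image
num_bits = num_bytes * 8
# Decimal digits needed to write down the number of possible images, 2 ** num_bits.
decimal_digits = math.floor(num_bits * math.log10(2)) + 1

def image_to_index(pixels: bytes) -> int:
    """Position of an image in the enumeration 0, 1, 2, ... of all byte strings."""
    return int.from_bytes(pixels, "big")

def index_to_image(index: int, size: int) -> bytes:
    """Inverse mapping: recover the image from its position."""
    return index.to_bytes(size, "big")

# Hypothetical 2x1 image at 24bpp: one red pixel, one green pixel.
tiny = bytes([255, 0, 0, 0, 255, 0])
index = image_to_index(tiny)

print(num_bytes)           # bytes per 800x600 image
print(decimal_digits)      # digits in the count of possible images
print(index.bit_length())  # the "address" is as long as the image itself
```

The point of the round trip is that an image's position in the enumeration carries exactly as much information as the image, so knowing where to look is the same as already having the picture.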

Similar argument 2: any movie or song is contained somewhere within the decimal expansion of the number Pi (assuming Pi is normal, which is widely believed but unproven). But again, it's worthless unless you know how to find such works, which basically requires you to have them in the first place.

> then the statistical models get better

Maybe. The statistical models are definitely better at natural language processing now, but they still fail on analytical tasks.

Of course, human brains are statistical models, so there's an existence proof that a sufficiently large statistical model is, well, sufficient. But that doesn't mean that you couldn't do better with an intelligently designed co-processor. Even humans do better with a pocket calculator, or even a sheet of paper, than they do with their unaided brains.

If human brains are statistical models, why are human brains so bad at statistics?

Edit: btw, same for probabilistic inference, same for logical inference, and same for any other thing anyone's tried as the one true path to AI since the 1950's. Humans have consistently proven bad at everything computers are good at, and that tells us nothing about why humans are good at anything (if, indeed, we are). Let's not assume too much about brains until we find the blueprint, eh?

> why are human brains so bad at statistics?

That depends on what you mean by being "bad at statistics." What brains do on a conscious level is very different than what they do at a neurobiological level. Brains are "bad at statistics" on the conscious level, but at the level of neurobiology that's all they do.

As an analogy, consider a professional tennis or baseball player. At the neurobiological level those people are extremely good at finding solutions to kinematic equations, but that doesn't mean that they would ace a physics test.

That is a very big assumption -that brains have conscious and subconscious levels that are good and bad at different things- that needs to be itself proved, before it can be used to support any other line of inquiry.

I'm not well versed in the relevant literature at all but my understanding is that research in the area points to the completely opposite direction: that humans e.g. playing baseball do not find solutions to kinematic equations, but instead use simple heuristics that exploit our senses and body configuration, like placing their hands in front of their eyes so that they line up with the ball etc.

This makes a lot more sense, not only for humans playing tennis, but for animals surviving in the wild, finding sustenance and shelter, and mates, while avoiding becoming a meal. Consider the Portia spider [1], a spider-hunting spider, itself prey to other hunting spiders, with a brain consisting of a few tens of thousands of neurons and still perfectly capable not only of navigating complex environments in all three space dimensions but also making complex plans involving detours.

Just think of how quickly a spider must be able to think that hunts, and is hunted by other spiders -some of the most deadly predators in the animal kingdom. There is no chance of a snowball in hell that such an animal has the time to solve kinematic equations with a few KBs of neurons. Absolutely no chance at all.

For that and many other things like that, it looks very unlikely to me that human brains, or any brains, are like you say. In any case, that sounds positively Freudian and I don't mean that as an insult, but I so could.

______________

[1] My favourite. No, I don't mean meal. I just love this paper; it's almost the best paper in autonomous robotics and planning that I've ever read:

https://www.frontiersin.org/journals/psychology/articles/10....

> If human brains are statistical models, why are human brains so bad at statistics?

If CPUs are made of silicon, why are they so bad at simulating semiconductors? Or why CPUs are so bad at emulating CPUs?

If JavaScript runs on a CPU, why is it so bad at doing bitwise stuff?

Etc.

What the runtime is made of is entirely separate from what's running on it. The same goes for the human brain (substrate) and human consciousness (software), or humans (substrate) and bureaucracy (runtime) and corporations (software).

Your question implies it is obvious that a system of statistical models would (or should) be good at statistics. And that the opposite is a paradox. I would ask why you think that is obvious?

Being good at statistics is more of a knowledge graph of understanding concepts than a statistical model, I think.

Just like understanding a car engine.

That's the "bitter lesson", right? Which is really a sour lesson- as in sour grapes. See, Rich Sutton's point with his Bitter Lesson is that encoding expert knowledge only improves performance temporarily, which is eventually surpassed by more data and compute.

There are only two problems with this: One, statistical machine learning systems have an extremely limited ability to encode expert knowledge. The language of continuous functions is alien to most humans and it's very difficult to encode one's intuitive, common sense knowledge into a system using that language [1]. That's what I mean when I say "sour grapes". Statistical machine learning folks can't use expert knowledge very well, so they pretend it's not needed.

Two, all the loud successes of statistical machine learning in the last couple of decades are closely tied to minutely specialised neural net architectures: CNNs for image classification, LSTMs for translation, Transformers for language, Diffusion models and GANs for image generation. If that's not encoding knowledge of a domain, what is?

Three, because of course three, despite point number two, performance keeps increasing only as data and compute increase. That's because the minutely specialised architectures in point number two are inefficient as all hell; the result of not having a good way to encode expert knowledge. Statistical machine learning folk make a virtue out of necessity and pretend that only being able to increase performance by increasing resources is some kind of achievement, whereas it's exactly the opposite: it is a clear demonstration that the capabilities of systems are not improving [2]. If capabilities were improving, we should see the number of examples required to train a state-of-the-art system either staying the same, or going down. Well, it ain't.
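One way to make the "architectures encode domain knowledge" point in Two concrete is to count parameters. The sketch below compares a single 3x3 convolution with a fully connected layer that maps the same input to the same output size. The 224x224 RGB input, the 64 output channels and the assumption that the output keeps the input's spatial size are all chosen for illustration, not taken from any particular model.

```python
def conv2d_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights plus biases of a kernel x kernel convolution (weights shared across positions)."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Assumed sizes, for illustration only.
HEIGHT = WIDTH = 224
C_IN, C_OUT = 3, 64

conv = conv2d_params(3, C_IN, C_OUT)
dense = dense_params(HEIGHT * WIDTH * C_IN, HEIGHT * WIDTH * C_OUT)

print(conv)           # parameters when locality and translation symmetry are built in
print(dense)          # parameters when nothing about images is assumed
print(dense // conv)  # how much the built-in prior saves
```

The convolution gets away with a few thousand parameters only because it hard-codes two assumptions about images (nearby pixels matter most, and the same pattern can appear anywhere), which is the sense in which the architecture itself carries prior knowledge.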

Of course the neural net [community] will complain that their systems have reached heights never before seen in classical AI, but that's an argument that can only be sustained by the ignorance of the continued progress in all the classical AI subjects such as planning and scheduling, SAT solving, verification, automated theorem proving and so on.

For example, and since planning is high on my priorities these days, see this video where the latest achievements in planning are discussed (from 2017).

https://youtu.be/g3lc8BxTPiU?si=LjoFITSI5sfRFjZI

See particularly around this point where he starts talking about the Rollout IW(1) symbolic planning algorithm that plays Atari from screen pixels with performance comparable to Deep-RL; except it does so online (i.e. no training, just reasoning on the fly):

https://youtu.be/g3lc8BxTPiU?si=33XSM6yK9hOlZJnf&t=1387
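For readers unfamiliar with the IW family, here is a minimal sketch of plain IW(1), the width-based search that, as I understand it, Rollout IW(1) builds on. It is not the Rollout variant from the talk and has nothing to do with Atari pixels: the grid domain, `successors` and `atoms` are hypothetical stand-ins. The idea is a breadth-first search that keeps a newly generated state only if it makes some atom true for the first time in the search.

```python
from collections import deque

def iw1(start, successors, atoms, is_goal):
    """Breadth-first search with novelty-1 pruning (IW(1)).

    A generated state is kept only if at least one of its atoms has never
    been seen in any state generated so far; everything else is pruned.
    Returns a list of actions, or None if the pruned search finds no goal.
    """
    seen_atoms = set(atoms(start))
    queue = deque([(start, [])])
    while queue:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        for action, nxt in successors(state):
            new_atoms = set(atoms(nxt)) - seen_atoms
            if not new_atoms:
                continue  # nothing novel: prune this state
            seen_atoms |= new_atoms
            queue.append((nxt, plan + [action]))
    return None

# Hypothetical toy domain: an agent on a 3x3 grid that can move right or up.
SIZE = 3

def successors(state):
    x, y = state
    if x + 1 < SIZE:
        yield "right", (x + 1, y)
    if y + 1 < SIZE:
        yield "up", (x, y + 1)

def atoms(state):
    x, y = state
    return [("x", x), ("y", y)]

# A goal that depends on a single atom is found by IW(1)...
plan_single = iw1((0, 0), successors, atoms, lambda s: s[0] == 2)
# ...but a goal that needs two atoms at once is pruned away.
plan_corner = iw1((0, 0), successors, atoms, lambda s: s == (2, 2))
print(plan_single, plan_corner)
```

On this grid the single-atom goal is reached with `["right", "right"]`, while the corner goal returns `None`: the search is fast because it is deliberately incomplete, and it works when goals can be reached through a chain of individually novel atoms.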

Bitter lesson my sweet little ass.

____________

[1] Gotta find where this paper was but none other than Vladimir Vapnik basically demonstrated this by trying the maddest experiment I've ever seen in machine learning: using poetry to improve a vision classifier. It didn't work. He's spent the last 20 years trying to find a good way to encode human knowledge into continuous functions. It doesn't work.

[2] In particular their capability for inductive generalisation which remains absolutely crap.

Yeah, that's one of the papers in that line of research by Vapnik. He's got a few with similar content. Visually, it's not the paper I remember, I'll have to read it again to be sure.

If I remember correctly, Vapnik's point is, we know that Big Data Deep Learning works; now, try to do the same thing with small data. Very much like my point that capabilities of models are not improving, only the scale increasing.

> The language of continuous functions is alien to most humans and it's very difficult to encode one's intuitive, common sense knowledge into a system using that language

In other words; machine learned models are octopus brains (https://www.scientificamerican.com/article/the-mind-of-an-oc...) and that creeps you out. Fair enough, it creeps me out too, and we should honour our emotions — I'm no rationalist – but we should also be aware of the risks of confusing our emotional responses with reality.

Please don't god mode me? Machine learning doesn't creep me out. I'm sorry it creeps you out. In my culture, octopus is a prized delicacy, my dad used to fish them out of the sea with his bare hands when I was a kid. If you wanna creep me out, you should try snake, not octopus.

>Two, all the loud successes of statistical machine learning in the last couple of decades are closely tied to minutely specialised neural net architectures: CNNs for image classification, LSTMs for translation, Transformers for vision, Diffusion models and GANs for image generation. If that's not encoding knowledge of a domain, what is?

Transformers, Diffusion for Vision, Image generation are really odd examples here. None of those architectures or training processes were designed with vision in mind lol. It was what? 3 years after Attention in 2017 before the famous ViT paper. CNNs have lost a lot of favor to ViTs, and LSTMs are not the best-performing translators today.

The bitter lesson is that less encoding of "expert" knowledge results in better performance and this has absolutely held up. The "encoding of knowledge" you call these architectures is nowhere near that of the GOFAI kind and even more than that, less biased NN architectures seem to be winning out.

>That's because the minutely specialised architectures in point number two are inefficient as all hell; the result of not having a good way to encode expert knowledge.

Inefficient is a whole lot better than can't even play the game, the story of GOFAI for the last few decades.

>If capabilities were improving, we should see the number of examples required to train a state-of-the-art system either staying the same, or going down. Well, they ain't.

The capabilities of models are certainly increasing. Even your example is blatantly wrong. Do you realize how much more data and compute it would take to train a Vanilla RNN to say GPT-3 level performance?

>> Inefficient is a whole lot better than can't even play the game, the story of GOFAI for the last few decades.

See e.g. my link above where GOFAI plays the game (Atari) very well indeed.

Also see Watson winning Jeopardy (a hybrid system, but mainly GOFAI - using frames and Prolog for knowledge extraction, encoding and retrieval).

And Deep Blue beating Kasparov. And MCTS still the SOTA search algo in Go etc.

And EURISKO playing Traveller as above.

And Pluribus playing Poker with expert game-playing knowledge.

And the recent neuro-symbolic DeepMind thingy that solves geometry problems from the maths olympiad.

etc. etc. [Gonna stop editing and adding more as they come to my mind here.]

And that's just playing games. As I say in my comment above planning and scheduling, SAT, constraints, verification, theorem proving- those are still dominated by classical systems and neural nets suck at them. Ask Yann LeCun: "Machine learning sucks". He means it sucks in all the things that classical AI does best and he means he wants to do them with neural nets, and of course he'll fail.

> And MCTS still the SOTA search algo in Go etc

It's often forgotten that Rich Sutton said the two things which work are learning (the AlphaGo/Leela Zero policy network) and search (MCTS). (I think the most interesting research in ML is around the circumstances in which large models wind up performing implicit search.)

That was a figure of speech. I didn't literally mean games (not that GOFAI performs better than NNs in those games anyway). I simply went off your own examples - Vision, Image generation, Translation etc.

>As I say in my comment above planning and scheduling, SAT, constraints, verification, theorem proving- those are still dominated by classical systems

You can use NNs for all these things. It wouldn't make a lot of sense because GOFAI would be perfect and the former would be inefficient but you certainly could which is again more than I can say for GOFAI and the domains you listed.

Addendum:

>> Do you realize how much more data and compute it would take to train a Vanilla RNN to say GPT-3 level performance?

Oh, good point. And what would GPT-3 do with the typical amount of data used to train an LSTM? Rhetorical.

Yeah, all of those architectures are _themselves_ hacks to get around having insufficient compute! They absolutely were encoding inductive biases into the network to get around not being able to train enough, and transformers (handwaving hard enough to levitate, the currently-trainable model family with the least inductive bias) have eaten the world in all domains.

This is evidence _for_ the Bitter Lesson, not against it.

They haven't (eaten the world etc). They just happen to be the models that trend hard right now. I bet if you could compare like for like you'd be able to see some improvement in performance from Transformers, but that'd be extremely hard to separate from the expected improvement from the constantly increasing amounts of data and compute. For example, you could, today, train a much bigger and deeper Multi-Layered Perceptron than you could thirty years ago, but nobody is trying because that's so 1990's, and in any case they have the data and compute to train much bigger, much more inefficient (contrary to what you say if I got that right) architectures.

Wait a few years and the Next Big Thing in AI will come along, hot on the heels of the next generation of GPUs, or tensor units or whatever the hardware industry can cook up to sell shovels for the gold rush. By then, Transformers will have hit the plateau of diminishing returns, there'll be gold in them there other hills and nobody will talk of LLMs anymore because that's so 2020s. We've been there so many times before.

> Up until about GPT 2, EURISKO was arguably the most interesting achievement in AI.

I'm really baffled by such a statement and genuinely curious.

How come that studying GOFAI as an undergraduate and graduate student at many European universities, doing a PhD, and working in the field for several years _never_ exposed me to EURISKO up until last week (thanks to HN)?

I heard about Cyc, and many formalisms and algorithms related to EURISKO, but never heard its name.

Is EURISKO famous in the US only?

> Is EURISKO famous in US only?

It was featured in a BBC radio series on AI made by Colin Blakemore [1] around 1980, the papers on AM and EURISKO were in the library of the UK university that I attended.

[1] https://en.wikipedia.org/wiki/Colin_Blakemore#Public_engagem...

For that reason, a comparison between GPT 2 and EURISKO seems funny to me.

I discussed ChatGPT with my yoga teacher recently, but I bet not even my IT colleagues would have a clue about EURISKO. :-)

So? There's a real possibility DART has still saved its customers more money over its lifetime than GPT has, and odds are basically 100% that your yoga teacher and IT colleagues haven't heard a thing about it either. The general public has so many wrong impressions and unknown unknowns that I don't see why its awareness should ever be used as a technology industry benchmark by anyone not working in the UI department of a smartphone vendor.

"... I’m sure I remember a much younger Eliezer Yudkowsky cautioning that Doug Lenat should have perceived a non-zero chance of hard takeoff at the moment of its birth."

https://www.lesswrong.com/posts/rJLviHqJMTy8WQkow/recursion-...

Also, in 2009 someone suggested re-implementing Eurisko[1], and Yudkowsky cautioned against it:

> This is a road that does not lead to Friendly AI, only to AGI. I doubt this has anything to do with Lenat's motives - but I'm glad the source code isn't published and I don't think you'd be doing a service to the human species by trying to reimplement it.

To my mind -- and maybe this is just the benefit of hindsight -- this seems overcautious on Yudkowsky's part.

[1]: https://www.lesswrong.com/posts/t47TeAbBYxYgqDGQT/let-s-reim...

Machinery can be a lot simpler than biology. Birds are incredibly complex systems: wing structure, musculature, feathers, etc. An airplane can be a vaguely wing-shaped piece of metal and a pulse jet. It doesn’t seem super implausible that there is some algorithm that is to human consciousness what a pulse jet with wings is to a bird. Maybe LLMs are that, but maybe they’re far more than is really needed because we don’t yet know what we are doing.

I would bet against it being possible to implement consciousness on a PDP, but I wouldn’t be very confident about it.

> a much younger Eliezer Yudkowsky cautioning that Doug Lenat should have perceived a non-zero chance of hard takeoff at the moment of its birth

Why is Yudkowsky taken seriously? This stuff is comparable to the "LHC micro black holes will destroy Earth" hysteria.

There are actual concerns around AI like deep fakes, a deluge of un-filterable spam, mass manipulation via industrial scale propaganda, mass unemployment created by widespread automation leading to civil unrest, opaque AIs making judgements that can't be evaluated properly, AI as a means of mass appropriation of work and copyright violation, concentration of power in large AI companies, etc. The crackpot "hard takeoff" hysteria only distracts from reasonable discourse about these risks and how to mitigate them.

Perhaps we can disagree on the shape of the curve, but it seems likely that ever more capable AI will enable ever more serious harms. Absolutely true that we should counter those harms in the present and not fixate on a theoretical future, but the medicine is much the same either way.

> Why is Yudkowsky taken seriously?

  Trivialities    Annoyances    Immediate harm     X-Risk
  |------------------------------------------------------|
         \----stuff you mention-------/
                                 \---stuff Eliezer------/
                                      wrote about

> The crackpot "hard takeoff" hysteria only distracts from reasonable discourse about these risks and how to mitigate them.

IDK, I feel endless hand-wringing about copyright and deepfakes distracts from risks of actual, significant harm at scale, some of which you also mentioned.

> "LHC micro black holes will destroy Earth" hysteria.

I will be heavily downvoted for this, but here is how I remember it:

1) LHC was used to study black holes and prove things like Hawking radiation

2) LHC was supposed to be safe due to Hawking radiation (that was only an unproven theory at the time)

So the unpopular question: what if Hawking radiation didn't actually exist? Wouldn't there be a risk of us dying? A small risk, but still some risk? (especially as the potential micro black hole would have the same velocity as Earth, so it wouldn't fly away somewhere into space)

On a side note: how would EURISKO evaluate this topic?

Since I read about this secretive CYC (why can you email asking for it, when the source isn't hosted anywhere?): couldn't any current statistics-based AI be used to feed this CYC program / database with information? Take a dictionary and ask ChatGPT to fill it with information for each word.

The fundamental reason that hysteria was silly is that Earth is bombarded by cosmic rays that are far stronger than anything done in the LHC. The reason we built the LHC is so we can do observable repeatable experiments at high energies, not to reach energies never reached on Earth before.

The AI hysteria I'm talking about here is the "foom" hysteria, the idea that a sufficiently powerful model will start self-improving without bound and become some kind of AI super-god. That's about as wild as the idea that the LHC will make a black hole that will implode the Earth. There are fundamental reasons to believe it's impossible, such as the question of "where would the information come from to drive that runaway intelligence explosion?"

There are legitimate risks with AI, but not because AI is somehow special and magical. All technologies have risks. If you make a sharper stick, someone will stab someone with it. Someday we may make a stick so sharp it stabs the entire world (cue 50s sci-fi theremin music).

Edit: for example... I would argue that the Internet itself has X-risks. The Internet creates an environment that incentivizes an arms race for attention grabbing, and the most effective strategies usually rely on triggering negative emotions and increasing division. This could run away to the point that it drives, say, civilizational collapse or a global thermonuclear war. Does this mean it would have been right to ban the Internet or require strict licensing to place any new system online?

You remember ... wrongly. It's just another particle accelerator, its intention was not to produce micro black holes for study.

You shouldn't use "theory" when it comes to science unless you know what that means. Gravity is a "theory." "Theory" means that it has a working model, comes with a ton of observational evidence in line with predictions, and it has yet to be replaced by anything better. Outside of math, nothing is ever proven. Any leading scientific theory is, at best, "yet to be disproven." And it stays in the lead until something better comes along: more accurate, extending over a greater domain, etc.

Hawking radiation has yet to be observed.

And if you're worried about micro black holes, well, even an iron atom has a non-zero chance of tunneling to a micro black hole state. No collider needed.

Cyc isn't secretive, it's proprietary, the way the Microsoft codebase is, the Adobe codebase is, and so on.

> Why is Yudkowsky taken seriously?

People like religion, particularly if it doesn't affect how they live their life _today_ too much. You get all of the emotional benefits of feeling like you're doing something virtuous without the effort of actually performing good works.

Not really. Read [1], which references "Why AM and Eurisko appear to work". There's a reason that line of development did not continue.

[1] https://news.ycombinator.com/item?id=28343118

> Up until about GPT 2, EURISKO was arguably the most interesting achievement in AI.

I agree.

> I suspect its achievements were slightly overblown and heavily guided by a human hand

So do I. We'll find out how much of its performance was real, and how much bullshit.

> the unreasonable effectiveness of differentiable programming and backpropagation has sucked up much of the oxygen in the room

The Bitter Lesson -- http://www.incompleteideas.net/IncIdeas/BitterLesson.html