Up until about GPT 2, EURISKO was arguably the most interesting achievement in AI. Back in the day on the SL4 and singularitarian mailing lists, it was spoken of in reverent tones, and I’m sure I remember a much younger Eliezer Yudkowsky cautioning that Doug Lenat should have perceived a non-zero chance of hard takeoff at the moment of its birth. I suspect its achievements were slightly overblown and heavily guided by a human hand, but it’s still fascinating and definitely worthy of study. Genetic programming hasn’t yielded many interesting results since, and the unreasonable effectiveness of differentiable programming and backpropagation has sucked up much of the oxygen in the room. But not everything is differentiable, the combination of the two still seems worth investigating, and EURISKO goes to show the power of heuristic approaches to some problems.
> the combination of the two still seems worth investigating
This.
Back in the late 1980's and early 90's the debate-du-jour was between deliberative and reactive control systems for robots. I got my Ph.D. for simply saying that the entire debate was based on the false premise that it had to be one or the other, that each approach had its strengths and weaknesses, and that if you just put the two together the whole would be greater than the sum of its parts. (Well, it was a little more than that. I had to actually show that it worked, which was more work than simply advancing the hypothesis, but in retrospect it seems kinda obvious, doesn't it?)
If I were still in the game today, combining generative-AI and old-school symbolic reasoning (which has also advanced a lot in 30 years) would be the first thing I would focus my attention (!) on.
People have advanced that argument a lot, and it's often worked for a short while; then the statistical models get better.
Chess was a game for humans.
It was very briefly a game for humans and machines (Kasparov had a go at getting "Advanced Chess" off the ground as a competitive sport), but soon enough having a human in the team made the program worse.
But at least the evaluation functions were designed by humans, right? That lasted a remarkably long time; first Stockfish became the strongest engine in the world by using distributed hyperparameter search to tweak its piece-square tables, then AlphaZero came along and used a policy network + MCTS instead of alpha-beta search, then (with an assist from the Shogi community) Stockfish struck back with a completely learned evaluation function via NNUE.
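To make the "evaluation function" part concrete, here is a minimal sketch of alpha-beta search over a toy game tree, where the evaluation is just a pluggable callable. The tree, the score table and the names `TREE`, `HANDCRAFTED_SCORES` and `handcrafted_eval` are invented for illustration; this is not Stockfish's code, only the general shape in which a hand-tuned evaluation can be swapped for a learned one without touching the search.

```python
import math

# Hypothetical toy game tree: inner lists are positions with moves to choose
# from, strings are leaf positions that get scored by an evaluation function.
TREE = [["a", "b"], ["c", "d"], ["e", "f"]]

# Stand-in for a hand-tuned evaluation (think material plus piece-square bonuses).
HANDCRAFTED_SCORES = {"a": 3, "b": 5, "c": 2, "d": 9, "e": 0, "f": 1}


def handcrafted_eval(position):
    """Score a leaf position from the maximizing player's point of view."""
    return HANDCRAFTED_SCORES[position]


def alphabeta(node, alpha, beta, maximizing, evaluate, visited):
    """Minimax with alpha-beta pruning; `visited` records evaluated leaves."""
    if not isinstance(node, list):
        visited.append(node)
        return evaluate(node)
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False, evaluate, visited))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent already has a better option elsewhere
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True, evaluate, visited))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere
    return best


leaves_seen = []
value = alphabeta(TREE, -math.inf, math.inf, True, handcrafted_eval, leaves_seen)
print(value, leaves_seen)
```

Passing a different `evaluate` callable (for example, one backed by a trained network) leaves the search untouched, which is roughly the seam along which the learned evaluations replaced the handcrafted ones.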
So the last frontier of human expertise in chess is search heuristics, and that's going to fall too: https://arxiv.org/abs/2402.04494.
The common theme with all of this is that the stuff which we used before is, fundamentally, hacks to get around _not having enough compute_, but which make the system worse once you don't have to make those tradeoffs around inductive biases. Empirical evidence suggests that raw scaling has a long way to run yet.
I find myself not wanting to agree with you, but deep down I think you're right.
AI greatly reminds me of the Library of Babel thought experiment. If we can imagine a library with every book that can possibly be written in any language, would it contain all human knowledge lost in a sea of noise? Is there merit or value in creating a system that sifts through such a library to attune hidden truths, or are we dooming ourselves to finding meaning in nothingness?
In a certain sense, there's immense value to developing concepts and ideas through intuition and thought. In another sense, a rose by any other name smells just as sweet; if an AI creates a perpetual motion device before a human does, that's not nothing. I don't expect AI to speed past human capability like some people do, but it's certainly displaced a lot of traditional computer-vision and text generation applications.
> If we can imagine a library with every book that can possibly be written in any language, would it contain all human knowledge lost in a sea of noise? Is there merit or value in creating a system that sifts through such a library to attune hidden truths, or are we dooming ourselves to finding meaning in nothingness?
The work that system would have to do to find those "hidden truths" is equivalent to re-deriving those truths from scratch.
Similar argument: an image is just a number; if you take e.g. an 800x600 24bpp picture, that's a number 1 440 000 bytes long; you could hypothetically start from 0 and generate every 1 440 000-byte number, thus generating every possible 800x600 24bit image. In that set, you'd find every historical event, photographed at every moment from every angle, and even photos of every fragment of every book from the Library of Babel. But good luck finding anything particular in there.
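For a sense of scale, the arithmetic can be sketched in a few lines of Python. The variable names are mine, and the digit count is derived with a logarithm rather than by printing the number itself, which would be millions of digits long.

```python
import math

width, height, bits_per_pixel = 800, 600, 24

num_bytes = width * height * bits_per_pixel // 8  # size of one raw image
num_bits = num_bytes * 8

# There are 2 ** num_bits distinct images. Count the decimal digits of that
# number via log10 instead of materializing it.
decimal_digits = math.floor(num_bits * math.log10(2)) + 1

print(f"{num_bytes:,} bytes per image")
print(f"number of possible images has about {decimal_digits:,} decimal digits")
```

For comparison, a commonly quoted estimate for the number of atoms in the observable universe is around 10^80, a number with 81 digits, so enumerating the set is not a search strategy.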
Similar argument 2: any movie or song is contained somewhere within the decimal expansion of the number Pi (assuming Pi is a normal number, which is widely believed but unproven). But again, it's worthless unless you know how to find such works, which basically requires you to have them in the first place.
Maybe. The statistical models are definitely better at natural language processing now, but they still fail on analytical tasks.
Of course, human brains are statistical models, so there's an existence proof that a sufficiently large statistical model is, well, sufficient. But that doesn't mean that you couldn't do better with an intelligently designed co-processor. Even humans do better with a pocket calculator, or even a sheet of paper, than they do with their unaided brains.
If human brains are statistical models, why are human brains so bad at statistics?
Edit: btw, same for probabilistic inference, same for logical inference, and same for any other thing anyone's tried as the one true path to AI since the 1950's. Humans have consistently proven bad at everything computers are good at, and that tells us nothing about why humans are good at anything (if, indeed, we are). Let's not assume too much about brains until we find the blueprint, eh?
That depends on what you mean by being "bad at statistics." What brains do on a conscious level is very different than what they do at a neurobiological level. Brains are "bad at statistics" on the conscious level, but at the level of neurobiology that's all they do.
As an analogy, consider a professional tennis or baseball player. At the neurobiological level those people are extremely good at finding solutions to kinematic equations, but that doesn't mean that they would ace a physics test.
> If human brains are statistical models, why are human brains so bad at statistics?
If CPUs are made of silicon, why are they so bad at simulating semiconductors? Or why are CPUs so bad at emulating CPUs?
If JavaScript runs on a CPU, why is it so bad at doing bitwise stuff?
Etc.
What the runtime is made of is entirely separate from what's running on it. The same goes for the human brain (substrate) and human consciousness (software), or humans (substrate) and bureaucracy (runtime) and corporations (software).
Your question implies it is obvious that a system of statistical models would (or should) be good at statistics. And that the opposite is a paradox. I would ask why you think that is obvious?
Being good at statistics is more of a knowledge graph of understanding concepts than a statistical model, I think.
That's the "bitter lesson", right? Which is really a sour lesson- as in sour grapes. See, Rich Sutton's point with his Bitter Lesson is that encoding expert knowledge only improves performance temporarily, which is eventually surpassed by more data and compute.
There are only two problems with this: One, statistical machine learning systems have an extremely limited ability to encode expert knowledge. The language of continuous functions is alien to most humans and it's very difficult to encode one's intuitive, common sense knowledge into a system using that language [1]. That's what I mean when I say "sour grapes". Statistical machine learning folks can't use expert knowledge very well, so they pretend it's not needed.
Two, all the loud successes of statistical machine learning in the last couple of decades are closely tied to minutely specialised neural net architectures: CNNs for image classification, LSTMs for translation, Transformers for language, Diffusion models and GANs for image generation. If that's not encoding knowledge of a domain, what is?
Three, because of course three, despite point number two, performance keeps increasing only as data and compute increase. That's because the minutely specialised architectures in point number two are inefficient as all hell; the result of not having a good way to encode expert knowledge. Statistical machine learning folk make a virtue out of necessity and pretend that only being able to increase performance by increasing resources is some kind of achievement, whereas it's exactly the opposite: it is a clear demonstration that the capabilities of systems are not improving [2]. If capabilities were improving, we should see the number of examples required to train a state-of-the-art system either staying the same, or going down. Well, it ain't.
Of course the neural net [community] will complain that their systems have reached heights never before seen in classical AI, but that's an argument that can only be sustained by the ignorance of the continued progress in all the classical AI subjects such as planning and scheduling, SAT solving, verification, automated theorem proving and so on.
For example, and since planning is high on my priorities these days, see this video where the latest achievements in planning are discussed (from 2017).
See particularly around this point where he starts talking about the Rollout IW(1) symbolic planning algorithm that plays Atari from screen pixels with performance comparable to Deep-RL; except it does so online (i.e. no training, just reasoning on the fly):
[1] Gotta find where this paper was but none other than Vladimir Vapnik basically demonstrated this by trying the maddest experiment I've ever seen in machine learning: using poetry to improve a vision classifier. It didn't work. He's spent the last 20 years trying to find a good way to encode human knowledge into continuous functions. It doesn't work.
[2] In particular their capability for inductive generalisation which remains absolutely crap.
It sounds kinda crazy (is there really that much far transfer?), but you know, I think it would work... He just needed to use LLMs instead: https://arxiv.org/abs/2309.10668#deepmind
Yeah, that's one of the papers in that line of research by Vapnik. He's got a few with similar content. Visually, it's not the paper I remember, I'll have to read it again to be sure.
If I remember correctly, Vapnik's point is, we know that Big Data Deep Learning works; now, try to do the same thing with small data. Very much like my point that capabilities of models are not improving, only the scale increasing.
> The language of continuous functions is alien to most humans and it's very difficult to encode one's intuitive, common sense knowledge into a system using that language
In other words; machine learned models are octopus brains (https://www.scientificamerican.com/article/the-mind-of-an-oc...) and that creeps you out. Fair enough, it creeps me out too, and we should honour our emotions — I'm no rationalist – but we should also be aware of the risks of confusing our emotional responses with reality.
Please don't god mode me? Machine learning doesn't creep me out. I'm sorry it creeps you out. In my culture, octopus is a prized delicacy, my dad used to fish them out of the sea with his bare hands when I was a kid. If you wanna creep me out, you should try snake, not octopus.
>Two, all the loud successes of statistical machine learning in the last couple of decades are closely tied to minutely specialised neural net architectures: CNNs for image classification, LSTMs for translation, Transformers for vision, Diffusion models and GANs for image generation. If that's not encoding knowledge of a domain, what is?
Transformers and Diffusion for vision and image generation are really odd examples here. None of those architectures or training processes were designed with vision in mind lol. It was what? 3 years after the 2017 Attention paper before the famous ViT paper. CNNs have lost a lot of favor to ViTs, and LSTMs are not the best-performing translators today.
The bitter lesson is that less encoding of "expert" knowledge results in better performance and this has absolutely held up. The "encoding of knowledge" you call these architectures is nowhere near that of the GOFAI kind and even more than that, less biased NN architectures seem to be winning out.
>That's because the minutely specialised architectures in point number two are inefficient as all hell; the result of not having a good way to encode expert knowledge.
Inefficient is a whole lot better than can't even play the game, the story of GOFAI for the last few decades.
>If capabilities were improving, we should see the number of examples required to train a state-of-the-art system either staying the same, or going down. Well, they ain't.
The capabilities of models are certainly increasing. Even your example is blatantly wrong. Do you realize how much more data and compute it would take to train a vanilla RNN to, say, GPT-3-level performance?
>> Inefficient is a whole lot better than can't even play the game, the story of GOFAI for the last few decades.
See e.g. my link above where GOFAI plays the game (Atari) very well indeed.
Also see Watson winning Jeopardy (a hybrid system, but mainly GOFAI - using frames and Prolog for knowledge extraction, encoding and retrieval).
And Deep Blue beating Kasparov. And MCTS still the SOTA search algo in Go etc.
And EURISKO playing Traveller as above.
And Pluribus playing Poker with expert game-playing knowledge.
And the recent neuro-symbolic DeepMind thingy that solves geometry problems from the maths olympiad.
etc. etc. [Gonna stop editing and adding more as they come to my mind here.]
And that's just playing games. As I say in my comment above, planning and scheduling, SAT, constraints, verification, theorem proving: those are still dominated by classical systems and neural nets suck at them. Ask Yann LeCun: "Machine learning sucks". He means it sucks in all the things that classical AI does best and he means he wants to do them with neural nets, and of course he'll fail.
Yeah, all of those architectures are _themselves_ hacks to get around having insufficient compute! They absolutely were encoding inductive biases into the network to get around not being able to train enough, and transformers (handwaving hard enough to levitate, the currently-trainable model family with the least inductive bias) have eaten the world in all domains.
This is evidence _for_ the Bitter Lesson, not against it.
> Up until about GPT 2, EURISKO was arguably the most interesting achievement in AI.
I'm really baffled by such a statement and genuinely curious.
How come studying GOFAI as an undergraduate and graduate student at many European universities, doing a PhD and working in the field for several years _never_ exposed me to EURISKO up until last week (thanks to HN)?
I heard about Cyc and many formalisms and algorithms related to EURISKO, but never heard its name.
It was featured in a BBC radio series on AI made by Colin Blakemore [1] around 1980, and the papers on AM and EURISKO were in the library of the UK university that I attended.
So? There's a real possibility DART has still saved its customers more money over its lifetime than GPT has, and odds are basically 100% that your yoga teacher and IT colleagues haven't heard a thing about it either. The general public has all sorts of wrong impressions and unknown unknowns, and I don't see why those should ever be used as a technology industry benchmark by anyone not working in the UI department of a smartphone vendor.
"... I’m sure I remember a much younger Eliezer Yudkowsky cautioning that Doug Lenat should have perceived a non-zero chance of hard takeoff at the moment of its birth."
Also, in 2009 someone suggested re-implementing Eurisko[1], and Yudkowsky cautioned against it:
> This is a road that does not lead to Friendly AI, only to AGI. I doubt this has anything to do with Lenat's motives - but I'm glad the source code isn't published and I don't think you'd be doing a service to the human species by trying to reimplement it.
To my mind -- and maybe this is just the benefit of hindsight -- this seems way too overcautious on Yudkowsky's part.
Machinery can be a lot simpler than biology. Birds are incredibly complex systems: wing structure, musculature, feathers, etc. An airplane can be a vaguely wing-shaped piece of metal and a pulse jet. It doesn’t seem super implausible that there is some algorithm that is to human consciousness what a pulse jet with wings is to a bird. Maybe LLMs are that, but maybe they’re far more than is really needed because we don’t yet know what we are doing.
I would bet against it being possible to implement consciousness on a PDP, but I wouldn’t be very confident about it.
> a much younger Eliezer Yudkowsky cautioning that Doug Lenat should have perceived a non-zero chance of hard takeoff at the moment of its birth
Why is Yudkowsky taken seriously? This stuff is comparable to the "LHC micro black holes will destroy Earth" hysteria.
There are actual concerns around AI like deep fakes, a deluge of un-filterable spam, mass manipulation via industrial scale propaganda, mass unemployment created by widespread automation leading to civil unrest, opaque AIs making judgements that can't be evaluated properly, AI as a means of mass appropriation of work and copyright violation, concentration of power in large AI companies, etc. The crackpot "hard takeoff" hysteria only distracts from reasonable discourse about these risks and how to mitigate them.
Perhaps we can disagree on the shape of the curve, but it seems likely that ever more capable AI will enable ever more serious harms. Absolutely true that we should counter those harms in the present and not fixate on a theoretical future, but the medicine is much the same either way.
Trivialities     Annoyances     Immediate harm    X-Risk
|------------------------------------------------------|
\----stuff you mention-------/
                                \---stuff Eliezer------/
                                      wrote about
> The crackpot "hard takeoff" hysteria only distracts from reasonable discourse about these risks and how to mitigate them.
IDK, I feel endless hand-wringing about copyright and deepfakes distracts from risks of actual, significant harm at scale, some of which you also mentioned.
> "LHC micro black holes will destroy Earth" hysteria.
I will be heavily downvoted for this, but here is how I remember it:
1) LHC was used to study black holes and prove things like Hawking radiation
2) LHC was supposed to be safe due to Hawking radiation (that was only an unproven theory at the time)
So the unpopular question: what if Hawking radiation didn't actually exist?
Wouldn't there be a risk of us dying? A small risk, but still some risk? (especially as the potential micro black hole would have the same velocity as Earth, so it wouldn't fly away somewhere into space)
On a side note: how would EURISKO evaluate this topic?
Since I read about this secretive CYC (why can you email to ask for it, but the source isn't hosted anywhere?): couldn't any current statistics-based AI be used to feed this CYC program / database with information? Take a dictionary and ask ChatGPT to fill it with information for each word.
The fundamental reason that hysteria was silly is that Earth is bombarded by cosmic rays that are far stronger than anything done in the LHC. The reason we built the LHC is so we can do observable repeatable experiments at high energies, not to reach energies never reached on Earth before.
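A rough back-of-the-envelope check of that claim, as a sketch: for a cosmic-ray proton hitting a proton at rest in the atmosphere, what matters is the centre-of-mass energy, not the raw beam energy. The 1e20 eV figure below is an assumed, illustrative value for an ultra-high-energy cosmic ray, and 14 TeV is the LHC's design collision energy; the function name is my own.

```python
import math

PROTON_REST_ENERGY_EV = 0.938e9   # proton rest energy, roughly 938 MeV
LHC_DESIGN_CM_ENERGY_EV = 14e12   # LHC design proton-proton collision energy
COSMIC_RAY_ENERGY_EV = 1e20       # assumed illustrative cosmic-ray energy


def fixed_target_cm_energy(beam_energy_ev, rest_energy_ev=PROTON_REST_ENERGY_EV):
    """Centre-of-mass energy for a proton beam hitting a proton at rest.

    In natural units: s = m1^2 + m2^2 + 2 * E1 * m2, with both masses equal.
    """
    s = 2 * rest_energy_ev**2 + 2 * beam_energy_ev * rest_energy_ev
    return math.sqrt(s)


cosmic_cm = fixed_target_cm_energy(COSMIC_RAY_ENERGY_EV)
ratio = cosmic_cm / LHC_DESIGN_CM_ENERGY_EV
print(f"cosmic-ray collision: {cosmic_cm / 1e12:.0f} TeV centre-of-mass")
print(f"that is about {ratio:.0f}x the LHC design energy")
```

Under those assumptions the atmospheric collision comes out at a few hundred TeV in the centre-of-mass frame, well above what the collider reaches.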
The AI hysteria I'm talking about here is the "foom" hysteria, the idea that a sufficiently powerful model will start self-improving without bound and become some kind of AI super-god. That's about as wild as "the LHC will make a black hole that will implode the Earth." There are fundamental reasons to believe it's impossible, such as the question of "where would the information come from to drive that runaway intelligence explosion?"
There are legitimate risks with AI, but not because AI is somehow special and magical. All technologies have risks. If you make a sharper stick, someone will stab someone with it. Someday we may make a stick so sharp it stabs the entire world (cue 50s sci-fi theremin music).
Edit: for example... I would argue that the Internet itself has X-risks. The Internet creates an environment that incentivizes an arms race for attention grabbing, and the most effective strategies usually rely on triggering negative emotions and increasing division. This could run away to the point that it drives, say, civilizational collapse or a global thermonuclear war. Does this mean it would have been right to ban the Internet or require strict licensing to place any new system online?
You remember ... wrongly. It's just another particle accelerator, its intention was not to produce micro black holes for study.
You shouldn't use "theory" when it comes to science unless you know what that means. Gravity is a "theory." "Theory" means that it has a working model, comes with a ton of observational evidence in line with predictions, and it has yet to be replaced by anything better. Outside of math, nothing is ever proven. Any leading scientific theory is, at best, "yet to be disproven." And it stays in the lead until something better comes along: more accurate, extending over a greater domain, etc.
Hawking radiation has yet to be observed.
And if you're worried about micro black holes, well, even an iron atom has a non-zero chance of tunneling to a micro black hole state. No collider needed.
Cyc isn't secretive, it's proprietary, the way the Microsoft codebase is, the Adobe codebase is, and so on.
People like religion, particularly if it doesn't affect how they live their life _today_ too much. You get all of the emotional benefits of feeling like you're doing something virtuous without the effort of actually performing good works.
The confluence of happenstance that occurs to make this a reality is pretty amazing to witness.
Unfortunately it starts with the passing of Douglas Lenat. But that enabled Stanford to open up their 40-year-old archive, which they still had, of Lenat's work.
Somehow, someway, someone not only stumbled upon EURISKO, but also knew what it was. One of the most notorious AI research projects of the age that actually broke out of the research labs of Stanford and out into the public eye, with impactful results. Granted, for arguably small values of “public” and “impactful”, but for the small community it affected, it made a big splash.
Lenat used EURISKO to find a very unconventional winning configuration to go on to win a national gaming tournament. Twice.
In that community, it was a big deal. The publisher changed the rules because of it, but Lenat returned victorious again the next year. After a discussion with the game and tournament sponsors, he never came back.
Apparently EURISKO has quite a reputation in the symbolic AI world, but even there it was held close.
But now it has been made available. Not only made available, but made operational. EURISKO is written in an obsolete Lisp dialect, Interlisp. But, coincidentally, we have today machine simulators that can run versions of that Lisp on long-lost, 40-year-old machines.
And someone was able to port it. And it seems to run.
The thought of the tendrils through time that had to twist their way for us to get here leaves, at least me, awestruck. So much opportunity for the wrong butterfly to have been stepped on to prevent this from happening.
But it didn’t, and here we are. Great job by the spelunkers who dug this up.
Enough of the Traveller tournament story is dodgy and inconsistent that it's very hard to say what actually happened beyond Lenat winning the tournament twice in a row with some kind of computer assistance.
Basically, with the Traveller tournament Lenat appears to have stumbled onto a story that caught the public's imagination, and then milked it for all he could to give his project publicity and to make it appear more successful than it actually was. And if that required embellishing the story or just making shit up, well, no harm no foul.
Even when something is technically true, it often turns out that it's being told in a misleading way. For example, you say that "the publisher changed the ruleset". That was the entire gimmick of the Traveller TCS tournament rules! The printed rulebook had a preset progression of tournament rules for each year.
EURISKO is basically a series of genetic algorithms over Lisp code - the homoiconic nature of Lisp making it effectively a meta-optimizer. Among its many problems was that the solution space, even for things like "be interesting and true", was way too large.
Eurisko (Gr., I discover) is a discovery system written by Douglas Lenat in RLL-1, a representation language itself written in the Lisp programming language. A sequel to Automated Mathematician, it consists of heuristics, i.e. rules of thumb, including heuristics describing how to use and change its own heuristics.
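To give a feel for "heuristics that change heuristics", here is a toy sketch in Python. It is emphatically not Lenat's code, RLL-1, or EURISKO's actual algorithm: the task (walk an integer towards a hypothetical `TARGET`), the `Heuristic` record and the `specialize_worthiest` meta-rule are all invented for illustration. The only ideas it borrows from the description above are that heuristics are plain data carrying a credit score, and that one rule operates on the pool of rules rather than on the problem.

```python
from dataclasses import dataclass

TARGET = 37  # hypothetical goal for the toy task


@dataclass
class Heuristic:
    step: int                  # object-level rule: "try moving x by this much"
    worth: int = 0             # credit earned whenever the rule helped
    specialized: bool = False  # has the meta-rule already derived variants?


def cost(x: int) -> int:
    return abs(x - TARGET)


def apply_heuristic(h: Heuristic, x: int) -> int:
    """Apply an object-level rule; keep the move and reward the rule if it helps."""
    candidate = x + h.step
    if cost(candidate) < cost(x):
        h.worth += 1
        return candidate
    return x


def specialize_worthiest(pool: list) -> None:
    """Meta-rule: edit the pool itself by deriving half-sized variants
    (both directions) of the worthiest rule not yet specialized."""
    unspecialized = [h for h in pool if not h.specialized]
    if not unspecialized:
        return
    parent = max(unspecialized, key=lambda h: h.worth)
    parent.specialized = True
    half = int(parent.step / 2)  # truncates toward zero
    known = {h.step for h in pool}
    for step in (half, -half):
        if step != 0 and step not in known:
            pool.append(Heuristic(step))
            known.add(step)


def search(start: int = 0, max_rounds: int = 30):
    pool = [Heuristic(16)]
    x = start
    for _ in range(max_rounds):
        if cost(x) == 0:
            break
        before = x
        for h in list(pool):
            x = apply_heuristic(h, x)
        if x == before:  # stuck: let the meta-rule rewrite the pool
            specialize_worthiest(pool)
    return x, pool


final_x, final_pool = search()
print(final_x, sorted(h.step for h in final_pool))
```

Even in this toy the search-space complaint shows up: the pool only grows, and most derived rules never earn any worth.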
It got a lot of kudos for winning a multi player naval wargame by building a bizarre but successful fleet that exploited all the loopholes and quirks in the rules.
IIRC it had at least one small (?) purely defensive boat that couldn’t be destroyed by typical weapons so its parent fleet couldn’t be defeated. It wasn’t like a modern drone swarm.
It makes me think of the battles in Doc Smith’s Lensman series where the Galactic Patrol would develop a game-breaking fleet formation to use against Boskone in every major naval battle.
Can anyone give a clear example of how this can be used productively? Its description doesn't help much.
What can one do with EURISKO? The fact of its recovery after its author's passing is interesting, in and of itself - but why is EURISKO, specifically, worth the effort of understanding?