Hacker News new | ask | show | jobs
by killerstorm 888 days ago
> it's still pretty striking how far away the structure of it is from the usual idea of automated reasoning/intelligence.

How so? Reasoning is fundamentally a search problem.

The process you described is exactly the process humans use: i.e. make a guess about what's useful, try to work out details mechanically. If get stuck, make another guess, etc. So the process is like searching through a tree.

People figured out this process back in 1955 (and made a working prototype which can prove theorems): https://en.wikipedia.org/wiki/Logic_Theorist but it all hinges on using good 'heuristics'. Neural networks are relevant here as they can extract heuristics from data.

What do you think is "the usual idea of automated reasoning"? Some magic device which can solve any problem using a single linear pass?

3 comments

I think the intuition here is that an “intelligent” system would lean much more heavily on the heuristics than on the search, like humans. It’s hard to quantify in this case, but certainly in the case of chess, when engines were about as good as best human players, they were doing orders of magnitude larger search than humans. Which made them feel more like chess _engines_ than chess AI. AlphaZero certainly made a step in the right direction, but it’s still far from how humans play.
The comment above mentions "brute force" incorrectly. It's fundamentally impossible to brute force googol combinations...
The key insight is this whole thread is that this Alpha Geometry only works because the search field is not a googol combinations. So, it doesn't really generalize to many other fields of math. We shouldn't expect an AlphaCategoryTheory or AlphaNumberTheory anytime soon.
"Brute force" = going through all combinations indiscriminately.

Using heuristics, beam search, etc, = "not brute force".

Calling something smart "brute force" is wrong.

> We shouldn't expect an AlphaCategoryTheory or AlphaNumberTheory anytime soon.

There's already a number of papers demonstrating use of LLMs for math in general:

"Autoformalization with Large Language Models" https://arxiv.org/abs/2205.12615

"Large language models can write informal proofs, translate them into formal ones, and achieve SoTA performance in proving competition-level maths problems!" https://twitter.com/AlbertQJiang/status/1584877475502301184

"Magnushammer: A Transformer-based Approach to Premise Selection" https://twitter.com/arankomatsuzaki/status/16336564683345879...

I find it rather sad that the reaction to amazing research is "but it's just..." and "it might not work on ...". Like, maybe appreciate it a bit and consider a possibility that something similar to it might work?

AlphaGeometry works, in simplified terms, in two stages:

1. Heuristically add a candidate construction using an NN (a transformer)

2. Brute force search through all possible deductions from the current set of constructions using a symbolic solver

If you don't find a solution after step 2, repeat. There may be some backtracking involved to try a different set of constructions as well.

This approach only works because, in geometry, the set of possible deductions from a given construction is actually quite small.

Also, note that this approach overall is essentially an optimization, not amazing new capabilities. Replacing step 1 with a random construction still solves 10 problems on average in the given time, compared to 30 with the new approach. The existing algorithm, relying mostly on brute force search, is probably able to solve all of the geometry problems if given, say, 10 times as much time as the olympiad students (so not some absurd amount of time).

Key insight is the finiteness of reasoning parts in planar geometry that can be quickly solved by the SAT, which often does not exist in most first-order and second-order logics, such as number theory, algebra, or functional analysis
The search space here is not so large. Read this paper, a key part of the present work: https://doi.org/10.1023/A:1006171315513
Well, what's different is that humans invent new abstractions along the way such as complex numbers and Fourier transforms.
A neural network is nothing but a heap of new abstractions from data.
Not a lot of humans do
Every single human has abstractions that are unique to them. Your world model isn’t the same as mine.

It’s just that usually these abstractions are fuzzy and hard to formalize, so they aren’t shared. It doesn’t mean that they don’t exist.

This is laughable... Neural networks also have basically unknowable abstractions encoded in their weights. There was some work not long ago which taught an ANN to do modular arithmetic, and found that it was doing Fourier transforms with the learned weights....
When a human has done the same thing many times they tend to try to generalize and take shortcuts. And make tools. Perhaps I missed something but I haven't seen a neural net do that.
Is that very different than the distillation and amplification process that happens during training? Where the neural net learns to predict in one step what initially required several steps of iterated execution.
IMHO, yes. It's not an (internal) invention occurring as a result of insight into the process - it's an (external) human training one model on the output of another model.
True, but we're talking about Olympiad level math skills here.
> Some magic device which can solve any problem using a single linear pass?

You mean like a calculator?

A calculator that can solve any problem human could, and more? Sure.