| HN Mirror



	by turingbike 2499 days ago
	They give examples of problems their model could solve that Mathematica couldn't (within a 30 second timeout) - and that's awesome. Destroy Mathematica. But did anyone notice whether there were problems that it couldn't solve that Mathematica could?

2 comments

nfltn 2499 days ago

I'm also curious whether there are problems that Mathematica can solve but this system cannot.

More importantly, I'm curious if there are problems that Mathematica knows it can't solve but for which this system silently gives wrong answers.

Another interesting extension to the experiments would be a longer timeout -- 30 seconds seems a bit arbitrary and quite low for a CAS. However, I suspect the reason for that timeout is the fact that Mathematica licenses are insanely expensive. Otherwise the 5,000 (actually, only 500) test problems could be run for at least a few minutes each at pretty trivial cost. Maybe there's a Mathematica employee here who can suggest Wolfram donate some compute (or at least limited licenses) for a small evaluation cluster. Especially if the authors decide to do follow-up work.

In any case, this is really interesting work. I think deep learning for symbolic mathematics is going to be a super interesting area to watch for at least the next few years. Good work, anonymous author(s).


wendyshu 2499 days ago

Verifying a candidate solution for these problems is relatively easy so wrong answers aren't so bad.
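
As a rough illustration of why checking is cheap for integration: a candidate antiderivative F can be tested by differentiating it and comparing against the integrand f. The sketch below does this numerically with a central difference at random sample points. It is only an assumption-laden example, not the paper's procedure: the function name `looks_like_antiderivative`, the sampling interval, the step size and the tolerances are all hypothetical choices, and a numerical pass is evidence rather than a proof.

```python
import math
import random


def looks_like_antiderivative(F, f, lo=-2.0, hi=2.0, samples=50,
                              h=1e-5, tol=1e-4, seed=0):
    """Return True if F'(x) is numerically close to f(x) at random points.

    Hypothetical helper: the interval, step size h and tolerance tol are
    illustrative defaults, not values taken from the paper.
    """
    rng = random.Random(seed)
    for _ in range(samples):
        x = rng.uniform(lo, hi)
        # Central-difference estimate of F'(x).
        derivative = (F(x + h) - F(x - h)) / (2 * h)
        if not math.isclose(derivative, f(x), rel_tol=tol, abs_tol=tol):
            return False
    return True


# Integrand x*cos(x); its antiderivative is x*sin(x) + cos(x).
def integrand(x):
    return x * math.cos(x)


def good_candidate(x):
    return x * math.sin(x) + math.cos(x)


def bad_candidate(x):
    # Missing the cos(x) term, so its derivative is off by sin(x).
    return x * math.sin(x)


print(looks_like_antiderivative(good_candidate, integrand))
print(looks_like_antiderivative(bad_candidate, integrand))
```

A symbolic check (differentiate and simplify in a CAS) would be stricter, but simplification to zero can itself fail, which is one reason a numerical spot check is a common fallback.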


nfltn 2499 days ago

I understand.

To explain: the thing that's super interesting to me about this paper (i.e., "strong result" vs. "best paper contender") is not integration per se. It's the possible applications of the method to problems with much, much, much higher computational complexity than integration. On those problems, validating the correctness of a solution is also intractable. In those cases, a sound function approximation approach would be an absolute game changer for symbolic methods.

(Not that integration isn't interesting as well.)


wendyshu 2499 days ago

How are they going to generate training data if verifying solutions is hard?


nfltn 2499 days ago

Some of these decision problems have thousands of examples because they correspond to industrially relevant problems. So, not automatically generated all at once, but gleaned from people who have been using CAS for decades to solve specific problems.

Still, I fear, the numbers are currently too small to get past the information bottleneck (mere thousands). We'll see.


shmageggy 2495 days ago

Are these gathered in one place anywhere? I and probably many others, including the authors of this paper, would be interested in these as a test set for models like this.


ms013 2499 days ago

Why not just use the Wolfram Engine for Developers? It’s available for the “insanely expensive” cost of $0. (See: https://www.wolfram.com/engine/)


nfltn 2499 days ago

I've had a lot of trouble getting permission to use Wolfram Engine. If authors are at a BigCorp, might be true for them as well.


__initbrian__ 2499 days ago

"we report the accuracy of our models on the three different tasks, on a holdout test set composed of 5000 equations."

I had trouble finding the test cases they used. Where'd they list them?


leni536 2499 days ago

The cases where Mathematica solves an integral by spitting out all kinds of exotic functions (Bessel functions, all kinds of weird elliptic integral functions, and so on): they don't have these kinds of integrals in their training data.
