| HN Mirror

by viraptor 1058 days ago

> The model's growing size makes it better at guessing the outcome of a reasoning task, but little to no actual reasoning is performed.

If there's no observable difference between the behaviours, why not call it what the post did?

> LLMs will still fail miserably at (larger) math problems

They're neither trained on such problems, nor is that a goal for LLMs. They can however tell you how to convert that problem into steps that can be run in an algebra system.

2 comments

FishInTheWater 1058 days ago

You're missing the point: there is a difference. The answers are often wrong, and more wrong the more complex the question gets.

They're only able to answer simple (relative to the model's size), straightforward reasoning questions, which is a nice party trick but not broadly useful.

> They can however tell you how to convert that problem into steps that can be run in an algebra system.

Usually they can't do that very well either. Converting a problem from one description to another is algebraic reasoning, subject to the issues already mentioned.

What they can do is summarize general instructions and documentation, provided adequate training data was available.

> They're neither trained on such problems, nor is that a goal for LLMs

Yes. But LLMs keep being pushed for tasks that heavily involve abstract reasoning, which is dangerous because they're unsuited for it (e.g. any code generation that isn't mere empty boilerplate).

flangola7 1058 days ago

OpenAI is experimenting with process supervision, which corrects many of these issues:

Problem: Let $x^8 + 3x^4 - 4 = p_1(x) p_2(x) \cdots p_k(x)$, where each non-constant polynomial $p_i(x)$ is monic with integer coefficients, and cannot be factored further over the integers. Compute $p_1(1) + p_2(1) + \cdots + p_k(1)$.

Model attempt

Here, GPT-4 successfully performs a complex series of polynomial factorizations. The use of the Sophie-Germain identity in step 5 is an important step that could be considered insightful.

I notice that the given polynomial has even degree and only even powers of $x$, so I can try to make a substitution to simplify it.

Let $y = x^4$, then the polynomial becomes $y^2 + 3y - 4$, which is a quadratic equation.

I can factor this quadratic equation as $(y + 4)(y - 1)$, so the original polynomial is $(x^4 + 4)(x^4 - 1)$.

Now I need to factor each of these factors further, if possible, over the integers.

For the first factor, $x^4 + 4$, I recall the Sophie Germain identity, which says that $a^4 + 4b^4 = (a^2 + 2b^2 + 2ab)(a^2 + 2b^2 - 2ab)$.

Applying this identity with $a = x$ and $b = 1$, I get $x^4 + 4 = (x^2 + 2x + 2)(x^2 - 2x + 2)$.

For the second factor, $x^4 - 1$, I can use the difference of squares formula, which says that $a^2 - b^2 = (a + b)(a - b)$.

Applying this formula with $a = x^2$ and $b = 1$, I get $x^4 - 1 = (x^2 + 1)(x^2 - 1)$.

Then I can apply the difference of squares formula again with $a = x$ and $b = 1$, to get $x^2 - 1 = (x + 1)(x - 1)$.

So the original polynomial is $(x^4 + 4)(x^4 - 1) = (x^2 + 2x + 2)(x^2 - 2x + 2)(x^2 + 1)(x + 1)(x - 1)$.

These are all the irreducible factors over the integers, so $p_1(x) = x^2 + 2x + 2$, $p_2(x) = x^2 - 2x + 2$, $p_3(x) = x^2 + 1$, $p_4(x) = x + 1$, and $p_5(x) = x - 1$.

To find the sum of their values at $x = 1$, I just need to plug in $x = 1$ into each factor and add them up.

I get $p_1(1) = 1^2 + 2 \cdot 1 + 2 = 5$, $p_2(1) = 1^2 - 2 \cdot 1 + 2 = 1$, $p_3(1) = 1^2 + 1 = 2$, $p_4(1) = 1 + 1 = 2$, and $p_5(1) = 1 - 1 = 0$.

So the final answer is $5 + 1 + 2 + 2 + 0 = 10$.

Answer: 10
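
The factorization in the transcript is easy to check mechanically. Below is a minimal Python sketch that is not part of the original transcript: polynomials are plain coefficient lists (lowest degree first), and the helper names `poly_mul` and `poly_eval` are made up for illustration. It multiplies the five claimed factors back together, compares the result with $x^8 + 3x^4 - 4$, and sums the values at $x = 1$. The discriminant check relies on the fact that a quadratic with a negative discriminant has no real roots and so cannot split into linear factors.

```python
from functools import reduce


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_eval(p, x):
    """Evaluate a polynomial at x using Horner's rule."""
    result = 0
    for coeff in reversed(p):
        result = result * x + coeff
    return result


# Factors claimed in the transcript, constant term first.
factors = [
    [2, 2, 1],   # x^2 + 2x + 2
    [2, -2, 1],  # x^2 - 2x + 2
    [1, 0, 1],   # x^2 + 1
    [1, 1],      # x + 1
    [-1, 1],     # x - 1
]

target = [-4, 0, 0, 0, 3, 0, 0, 0, 1]  # x^8 + 3x^4 - 4

# Multiply all factors together and sum their values at x = 1.
product = reduce(poly_mul, factors)
answer = sum(poly_eval(f, 1) for f in factors)

# For each quadratic c + b*x + a*x^2, check that b^2 - 4ac is negative.
quadratics_irreducible = all(
    f[1] ** 2 - 4 * f[2] * f[0] < 0 for f in factors if len(f) == 3
)

print(product == target, answer, quadratics_irreducible)
```

Integer arithmetic keeps the comparison exact, so there is no floating-point tolerance to worry about here.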

FishInTheWater 1058 days ago

It's an impressive result, but it shouldn't be seen as a "correction". Framing it as a (drastic) reduction in mistakes is more useful here.

If the model is productionized (read: dumbed down so it isn't as expensive to run), the reasoning abilities drastically decline again.

And these reasoning abilities are still built around a language model, rather than around abstract models.

This is a very effective party trick for general math, whose language maps quite directly onto these abstract concepts, but there are some holes. Information about e.g. which values may be zero isn't encoded in the language, and so this approach is liable to blunder into division-by-zero issues.
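
To make the division-by-zero hole concrete, here is a small Python sketch of my own choosing (the equation and the function names are illustrative assumptions, not something from a model transcript). The equation $(a + b)(a - b) = b(a - b)$ looks like it simplifies to $a + b = b$ by cancelling $(a - b)$, but that step silently assumes $a \neq b$. Nothing in the written algebra flags that assumption.

```python
def original_holds(a, b):
    """Check the unsimplified equation (a + b)(a - b) == b(a - b)."""
    return (a + b) * (a - b) == b * (a - b)


def simplified_holds(a, b):
    """Check a + b == b, obtained by cancelling (a - b) from both sides."""
    return a + b == b


def safe_cancel_holds(a, b):
    """Cancel (a - b) only when it is non-zero; otherwise keep the original form."""
    if a - b == 0:
        return original_holds(a, b)
    return simplified_holds(a, b)


# With a == b the original equation is true (0 == 0),
# but the "simplified" one claims 6 == 3.
print(original_holds(3, 3), simplified_holds(3, 3), safe_cancel_holds(3, 3))

# With a != b the cancellation is legitimate and both forms agree.
print(original_holds(0, 5), simplified_holds(0, 5), safe_cancel_holds(0, 5))
```

The guard in `safe_cancel_holds` is exactly the side condition that the prose version of the derivation leaves out.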

If you want a particular example to toy around with, LLMs are not fond of quaternions and their conversion to other representations.
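
For anyone who wants a reference to compare model output against, here is a minimal Python sketch of one such conversion: unit quaternion to 3x3 rotation matrix. The conventions are my assumptions, and they are exactly the kind of detail that tends to go wrong: components are ordered `(w, x, y, z)`, the quaternion is Hamilton-style, and the matrix rotates column vectors actively. The function names are made up for illustration.

```python
import math


def quat_from_axis_angle(axis, angle):
    """Build a unit quaternion (w, x, y, z) from a rotation axis and an angle in radians."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), ax * s, ay * s, az * s)


def quat_to_matrix(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (list of rows)."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    if n == 0.0:
        raise ValueError("zero quaternion does not represent a rotation")
    # Normalize so slightly non-unit input still yields a proper rotation.
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def rotate(matrix, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return tuple(sum(row[i] * v[i] for i in range(3)) for row in matrix)


# A quarter turn about the z axis should send the x axis to the y axis.
q = quat_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
R = quat_to_matrix(q)
rotated = rotate(R, (1.0, 0.0, 0.0))
print(rotated)
```

Note the two explicit zero checks: they are the same kind of "which values may be zero" information that isn't carried by the formula as usually written.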

taylodl 1058 days ago

Which means they're this close to being able to reach out to an algebra system, run the steps, and return you the result. I was just talking about this problem with someone the other day: how can it recognize that it doesn't have the answer but knows where it can get data so that it can form an answer? This seems to be the path Google is taking.
