Hacker News new | ask | show | jobs
by qsort 943 days ago
As someone with a CS background myself, I don't think this is what GP was talking about.

Let's forget for a moment that stuff has to run on an actual machine. If you had to represent a quadratic equation, would you rather write:

(a) x^2 + 5x + 4 = 0

(b) the square of the variable plus five times the variable plus four equals zero

When you are trying to solve problems with a level of sophistication beyond the toy stuff you usually see in these threads, formal language is an aid rather than an impediment. The trajectory of every scientific field (math, physics, computer science, chemistry, even economics!) is away from natural language and towards formal language, even before computers, precisely for that reason.

We have lots of formal languages (general-purpose programming languages, logical languages like Prolog/Datalog/SQL, "regular" expressions, configuration languages, all kinds of DSLs...) because we have lots of problems, and we choose the representation of the problem that most suits our needs.

Unless you are assuming you have some kind of superintelligence that can automagically take care of everything you throw at it, natural language breaks down when your problem becomes wide enough or deep enough. In a way this is like people making Rube-Goldberg contraptions with Excel. 50% of my job is cleaning up that stuff.

5 comments

I quite agree and so would Wittgenstein, who (as I understand it) argued that precise language is essential to thought and reasoning[1]. I think one of the key things here is often what we think of as reasoning boils down to taking a problem in the real world and building a model of it using some precise language that we can then apply some set of known tools to deal with. Your example of a quadratic is perfect, because of course now I see (a) I know right away that it's an upwards-facing parabola with a line of symmetry at -5/2, that the roots are at -4 and -1 etc whereas if I saw (b) I would first have to write it down to get it in a proper form I could reason about.

I think this is a fundamental problem with the "chat" style of interaction with many of these models (that the language interface isn't the best way of representing any specific problem even if it's quite a useful compromise for problems in general). I think an intrinsic problem of this class of model is that they only have text generation to "hang computation off" meaning the "cognative ability" (if we can call it that) is very strongly related to how much text it's generating for a given problem which is why that technique of prompting using chain of thought generates much better results for many problems[2].

[1] Hence the famous payoff line "whereof we cannot speak, thereof we must remain silent"

[2] And I suspect why in general GPT-4 seems to have got a lot more verbose. It seems to be doing a lot of thinking out loud in my use, which gives better answers than if you ask it to be terse and just give the answer or to give the answer first and then the reasoning, both of which generally generate inferior answers in my experience and in the research eg https://arxiv.org/abs/2201.11903

> I quite agree and so would Wittgenstein

It depends on whether you ask him before or after he went camping -- but yeah, I was going for an early-Wittgenstein-esque "natural language makes it way too easy to say stuff that doesn't actually mean anything" (although my argument is much more limited).

> I think this is a fundamental problem with the "chat" style of interaction

The continuation of my argument would be that if the problem is effectively expressible in a formal language, then you likely have way better tools than LLMs to solve it. Tools that solve it every time, with perfect accuracy and near-optimal running time, and critically, tools that allow solutions to be composed arbitrarily.

Alpha Go and NNUE for computer chess, which are often cited for some reason as examples of this brave new science, would be completely worthless without "classical" tree search techniques straight out of the Russel-Norvig.

Hence my conclusion, contra what seems to be the popular opinion, is that these tools are potentially useful for some specific tasks, but make for very bad "universal" tools.

There are some domains that are in the twilight zone between language and deductive, formal reasoning. I've been into genealogy last year. It's very often deductive "detective work": say there are four women in a census with the same name and place that are listed on a birth certificate you're investigating. Which of them is it? You may rule one out on hard evidence (census suggests she would have been 70 when the birth would have happened), one on linked evidence (this one is the right age, but it's definitively the same one who died 5 years later and we know the child's mother didn't), one on combined softer evidence (she was in a fringe denomination and at the upper end of the age range) then you're left with one, etc.

Then as you collect more evidence you find that the age listed on the first one in the census was wildly off due to a transcription error and it's actually her.

You'd think some sort of rule-based system and database might help with these sorts of things. But the historical experience of expert system is that you then often automate the easy bits at the cost of demanding even more tedious data-entry. And you can't divorce data entry and deduction from each other either, because without context, good luck reading out a rare last name in the faded ink of some priest's messy gothic handwriting.

It feels like language models should be able to help. But they can't, yet. And it fundamentally isn't because they suck at grade school math.

Even linguistics, not something I know much about but another discipline where you try to make deductions from tons and tons of soft and vague evidence - you'd think language models, able to produce fluent text in more languages than any human, might be of use there. But no, it's the same thing: it can't actually combine common sense soft reasoning and formal rule-oriented reasoning very well.

> You'd think some sort of rule-based system and database might help with these sorts of things.

sounds like belief change systems (a bit) to me!

https://plato.stanford.edu/entries/logic-belief-revision/

I assumed seanhunter was suggesting getting the LLM to convert x^2 + 5x + 4 = 0 to a short bit of source code to solve for x.

IIRC Wolfram Alpha has (or had, hard to keep up) a way to connect with ChatGPT.

It does. This is the plugins methodology described in the toolformers paper which I've linked elsewhere[1]. The model learns that for certain types of problems certain specific "tools" are the best way to solve the problem. The problem is of course it's simple to argue that the LLM learns to use the tool(s) and can't reason itself about the underlying problem. The question boils down to whether you're more interested in machines which can think (whatever that means) or having a super-powered co-pilot which can help with a wide variety of tasks. I'm quite biased towards the second so I have the wolfram alpha plugin enabled in my chat gpt. I can't say it solves all the math-related hallucinations I see but I might not be using it right.

[1] But here it is again https://arxiv.org/abs/2302.04761

GPT4 does even without explicitly enabling plugins now, by constructing Python. If you want it to actually reason through it, you now need to ask it, sometimes fairly forcefully/in detail, before it will indulge you and not omit steps. E.g. see [1] for the problem given above.

But as I noted elsewhere, training its ability to do it from scratch matters not for the ability to do it from scratch, but for the transferability of the reasoning ability. And so I think that while it's a good choice for OpenAI to make it automatically pick more effective strategies to give the answer it's asked for, there is good reason for us to still dig into its ability to solve these problems "from scratch".

[1] https://chat.openai.com/share/694251c9-345b-4433-a856-7c38c5...

Ideally we'd have both worlds -- but if we're aiming for AGI and we have to choose, using a language that lets you encode everything seems preferable to one that only lets you talk about, say, constrained maximization problems.
the ml method doesnt require you to know how to solve the problem at all, and could someday presumably develop novel solutions. not just high efficiency symbolic graph search.