by godelski 697 days ago
> [3] file:///home/john/Downloads/Artificial_Intelligence_meets_natural_stupidity-2.pdf

I'm unsure if this is a meta joke or a great bit of irony.

> Anybody working on this?

If I understand your question accurately, yes. A common example: people ask GPT to answer in Python code and then convert that code into something else. But other people are doing things more directly and through other methods. There are also people generating many answers and then performing a search over those solutions (with or without GPT).
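To make the "generate many answers, then search" idea concrete, here is a minimal sketch. Everything in it is an illustrative assumption rather than any particular system's method: `stub_generator` is a hypothetical stand-in for sampling from a model, the candidates are hard-coded, and the "search" is just a linear scan that keeps the first candidate passing some known test cases.

```python
from typing import Callable, Optional


def stub_generator(prompt: str, n: int) -> list[str]:
    """Hypothetical stand-in for sampling n candidate programs from a model."""
    pool = [
        "def solve(x): return x + x",
        "def solve(x): return x * x",
        "def solve(x): return x ** 3",
    ]
    return [pool[i % len(pool)] for i in range(n)]


def passes(source: str, cases: list[tuple[int, int]]) -> bool:
    """Run a candidate and check it against (input, expected) pairs."""
    namespace: dict = {}
    try:
        # Only acceptable here because the candidates are our own stubs;
        # real model output should be run in a sandbox.
        exec(source, namespace)
        return all(namespace["solve"](x) == y for x, y in cases)
    except Exception:
        return False


def search(
    prompt: str,
    cases: list[tuple[int, int]],
    n: int = 3,
    generate: Callable[[str, int], list[str]] = stub_generator,
) -> Optional[str]:
    """Return the first generated candidate that passes every case, if any."""
    for candidate in generate(prompt, n):
        if passes(candidate, cases):
            return candidate
    return None


CASES = [(2, 4), (3, 9)]
best = search("square a number", CASES)
print(best)  # def solve(x): return x * x
```

Note that `x + x` passes the first case and is only rejected by the second, so the search is only as good as the checks it can run against.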

But regardless, I think you should take care in calling out the "and then a miracle occurs"[0]. While the critique is well deserved, I think the context is dubious. It implies the same magic step is not necessary for LLMs. There's still a gap between where we are and actual intelligence. LLMs are certainly impressive and have done a lot (something I think Gary ignores), but how to get to intelligence is still unknown and thus a missing middle step that "requires a miracle".

I don't think there is an issue with people pursuing neurosymbolics. In fact, I would encourage it, just as I'd encourage pursuing LLMs, category theory approaches, and others. The thing I would discourage is putting all our eggs in one basket when we recognize there is a missing step that we don't yet know how to solve. Allocate more resources to what has made the most improvement so far, but not at the expense of recognizing limitations and criticisms. All technologies have limits and can be improved. It's the naive who reject critiques and the naive who are quick to dismiss. That's not science, that's politics.

[0] Variation: https://www.youtube.com/watch?v=a5ih_TQWqCA

2 comments

> It implies the same magic step is not necessary for LLMs.

LLMs can at least take in an informal problem statement and do something semi-useful with it. This is progress.

Some other internal representations are needed, especially for dealing with the real world. Pictures? Animations? Animation storyboards? Graphs? Programs? All of the above? Ten years ago it would have been a fantasy to consider pictures and animations as intermediate representations in an AI system. Not any more.

Automated storyboard generation is already a thing.

> This is progress.

I want to stress that not only do I agree with this, I explicitly stated so. I even explicitly said it would be naive to dismiss this progress AND explicitly criticized Gary for doing so. I want to make it abundantly clear that I am not claiming LLMs are not progress. I feel I have to do this because of the context here and because it is common to conflate critiques of LLMs with dismissal of LLMs.

> Ten years ago it would have been a fantasy to consider pictures and animations as intermediate representations in an AI system.

I'm hesitant to agree. I'll agree if we are also saying that 10 years ago it would have been considered fantasy to build a lossy compression of human written knowledge, build a natural language interface into it, and have this all under 200GB. In that case I think someone could have imagined such a system but thought it was far away, and maybe not even believed the last condition. But this is a reasonably accurate description of LLMs (what's up for debate is the reasoning capacity, not the compression aspect).

And my point is not about technical capabilities. Like every other ML researcher, the release of GPT made me believe AGI was much closer than I had previously thought. But, like many ML researchers, I later reevaluated and returned to something close to my earlier position as I moved from seeing examples to having intimate experience with usage, diving deep into the data these models are trained on and into the training processes, and probing these machines.

For some people, understanding the mechanics behind a "magic trick" makes the trick unimpressive. But I've always been fascinated, and the mechanics often make the tricks far more impressive! What GPT made everyone reconsider is how much we could do with data alone, and how powerful and impressive our existing statistical frameworks are when scaled. But there is no evidence here that these systems actually understand what they are processing. There is no evidence that these systems are logically reasoning, and there is a fair amount of evidence that they are not[0]. The details here matter and are the critical part of answering these questions. Because, as you mentioned, we've made a lot of progress. And the thing is that as we progress, the amount of complexity needed to advance further also increases. A low order approximation takes you a long way, but we know that complexity has to increase quickly for accuracy to increase slowly.

I guess I would be more willing to believe a path to AGI argument with these systems if they were more robust (not to say I am not still impressed). Even if the systems could perform the image generation tasks I described here[1] (see Imgur link), I do not believe this alone would be enough to demonstrate intelligence or reasoning. The types of errors made are not illustrative of a system that understands (note: image generation is my specific research area), but it is also important to remember that proof is not symmetric. A billion positive examples do not constitute a proof, while a single counterexample constitutes a counter proof. My concern is that these discussions often take the form of demonstration as proof. Demonstrations aren't proof and no amount of them will constitute proof. But it's also important to note that a counter proof is not always an _absolute_ rejection; more often it is a bound. What I'm saying is that the counterexamples don't dismiss the utility of LLMs, but they do place strong bounds on where that utility lives. The distinction does matter, and a disregard of this distinction is specifically what I am criticizing Gary for.
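The asymmetry between demonstrations and counterexamples can be illustrated with a classic example from outside ML, used here purely as an analogy and not drawn from the thread: Euler's polynomial n^2 + n + 41. A minimal sketch (the function names are just illustrative):

```python
def is_prime(k: int) -> bool:
    """Trial division; fine for the small numbers used here."""
    if k < 2:
        return False
    i = 2
    while i * i <= k:
        if k % i == 0:
            return False
        i += 1
    return True


def euler(n: int) -> int:
    """Euler's polynomial n^2 + n + 41."""
    return n * n + n + 41


# Forty consecutive "demonstrations" that the polynomial yields primes...
confirmations = sum(is_prime(euler(n)) for n in range(40))

# ...and the single counterexample that refutes the general claim.
first_counterexample = next(n for n in range(1000) if not is_prime(euler(n)))

print(confirmations)         # 40
print(first_counterexample)  # 40, since euler(40) == 1681 == 41 * 41
```

The counterexample does not make the first forty results wrong. It bounds the claim: "always prime" is false, while "prime for n from 0 to 39" still holds.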

[0] https://news.ycombinator.com/item?id=41097025

I also want to mention that the word "reasoning" is not always used consistently, and that this unfortunately makes the discussion more convoluted. But I think we need to understand what is in the training data in order to understand how to accurately test these abilities. Similarly, the terms "out of distribution" and "zero/low shot" are changing, and often not in great ways. E.g. it is common to train on LAION and "zero-shot" on ImageNet.

[1] https://news.ycombinator.com/item?id=41063312