Hacker News new | ask | show | jobs
by trick-or-treat 91 days ago
Can you back that up with some logic for me?

I don't really play Go but I play chess, and it seems to me that most of what humans consider creativity in GM level play comes not in prep (studying opening lines/training) but in novel lines in real games (at inference time?). But that creativity absolutely comes from recalling patterns, which is exactly what OP criticizes as not creative(?!)

I guess I'm just having trouble finding a way to move the goalpost away from artificial creativity that doesn't also move it away from human creativity?

1 comments

How a model is trained is different than how a model is constructed. A model’s construction defines its fundamental limitations, e.g. a linear regressor will never be able to provide meaningful inference on exponential data. Depending on how you train it, though, you can get such a model to provide acceptable results in some scenarios.

Mixing the two (training and construction) is rhetorically convenient (anthropomorphization), but holds us back in critically assessing a model’s capabilities.

Linear regression has well characterized mathematical properties. But we don't know the computational limits of stacked transformers. And so declaring what LLMs can't do is wildly premature.
> And so declaring what LLMs can't do is wildly premature.

The opposite is true as well. Emergent complexity isn’t limitless. Just like early physicists tried to explain the emergent complexity of the universe through experimentation and theory, so should we try to explain the emergent complexity of LLMs through experimentation and theory.

Specifically not pseudoscience, though.

>so should we try to explain the emergent complexity of LLMs through experimentation and theory.

Physicists had the real world to verify theories and explanations against.

So far anyone 'explaining the emergent complexity of LLMs through experimentation and theory' is essentially just making stuff up nobody can verify.

Well that’s why I provided the caveat “specifically not pseudoscience”, which is, as you described, “just making stuff up nobody can verify”.
If you say not pseudoscience and then make up pseudoscience anyway then what's the point? The field has not advanced anywhere enough in understanding for convoluted explanations about how LLMs can never do x to be anything but pseudoscience.
Sure, that's true as well. But I don't see this as a substantive response given that the only people making unsupported claims in this thread are those trying to deflate LLM capabilities.
So, to review this thread

  - OP asked for someone to make a logical argument for the separation of “training” from “model”
  - I made the argument
  - You cherry picked an argument against my specific example and made an appeal to emergent complexity
  - I pointed out that emergent complexity isn’t limitless
  - “the only people making unsupported claims in this thread are those trying to deflate LLM capabilities”
You made a pretty nonsensical argument, pretty much seems like the big standard for these arguments.

What does linear regression have to do with the limitations of a stacked transfer ? Absolutely nothing. This is the problem here. You don't know shit and just make up whatever. You can see people doing the same thing in GPT-1, 2, 3, 4 threads all telling us why LLMs will never be able to do thing it manages to do later.

To be clear, you are confusing me with other commenters in this thread. All I want is for those that liken LLMs to stochastic parrots and other deflationary claims to offer an argument that engages with the actual structure of LLMs and what we know about them. No one seems to be up to that challenge. But then I can't help but wonder where people's confident claims come from. I'm just tired of the half-baked claims and generic handwavy allusions that do nothing but short-circuit the potential for genuine insight.