| HN Mirror

by TeMPOraL 543 days ago

> If I write a book that contains Einstein's theory of relativity by virtue of me copying it, did I create the theory? Did my copying of it indicate anything about my understanding of it? Would you be justified to think the next book I write would have anything of original value?

No, but you described a `cp` command, not an LLM.

"Creativity" in the sense of coming up with something new is trivial to implement in computers, and has long been solved. Take some pattern - of words, of data, of thought. Perturb it randomly. Done. That's creativity.

The part that makes "creativity" in the sense we normally understand it hard, isn't the search for new ideas - it's evaluation of those ideas. For an idea to be considered creative, it has to match a very complex... wait for it... pattern.

That pattern - what we call "creative" - has no strict definition. The idea has to be close enough to something we know, so we can frame it, yet different enough from it as to not be obvious, but still not too different, so we can still comprehend it. It has to make sense in relevant context - e.g. a creative mathematical proof has to still be correct (or a creative approach to proving a theorem has to plausibly look like it could possibly work); creative writing still has to be readable, etc.
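To make the generate-versus-evaluate split concrete, here is a minimal Python sketch. It is not how any real system judges creativity: `perturb`, `looks_creative`, and the Hamming-distance band are all assumptions invented for illustration. The generator is the trivial part; the evaluator is a crude stand-in for the "close enough, yet different enough" pattern described above.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def perturb(text: str, rng: random.Random) -> str:
    """The 'trivial' step: overwrite one random position with a random character."""
    i = rng.randrange(len(text))
    return text[:i] + rng.choice(ALPHABET) + text[i + 1:]

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def looks_creative(candidate: str, known: str, lo: int = 1, hi: int = 3) -> bool:
    """Toy evaluator (an assumption): different from the known pattern,
    but not so different that it can no longer be related to it."""
    return lo <= hamming(candidate, known) <= hi

def search(known: str, max_steps: int = 5, tries: int = 200, seed: int = 0) -> list[str]:
    """Generate candidates by random perturbation, keep those the evaluator accepts."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(tries):
        candidate = known
        for _ in range(rng.randint(0, max_steps)):
            candidate = perturb(candidate, rng)
        if looks_creative(candidate, known):
            accepted.append(candidate)
    return accepted

if __name__ == "__main__":
    for idea in search("the quick brown fox")[:5]:
        print(idea)
```

All the difficulty is hidden in `looks_creative`: swapping the distance band for a judgment that tracks human taste is exactly the hard part the comment is pointing at.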

The core of creativity is this unspecified pattern that things we consider "creative" match. And it so happens that things matching this pattern are a match for pattern "what makes sense for a human to read" in situations where a creative solution is called for. And the latter pattern - "response has to be sensible to a human" - is exactly what the LLM goal function is.

Thus it follows that real creativity is part of what LLMs are being optimized for :).

3 comments

sleepytree 542 days ago

> For an idea to be considered creative, it has to match a very complex... wait for it... pattern.

If we could predefine what would count as creativity as some specific pattern, then I'm not sure that would be what I would call creative, and certainly wouldn't be an all-inclusive definition of creativity. Nor is creativity merely creating something new by perturbing data randomly as you mentioned above.

While LLMs might be capable of some forms of creativity depending on how you define it, I think it remains to be seen how LLMs' current architecture could on its own accomplish the kinds of creativity implicit in scientific progress in the Kuhnian sense of a paradigm shift or in what some describe as a leap of artistic inspiration. Both of these examples highlight the degree to which creativity could be considered both progress in an objective sense but also be something that is not entirely foreshadowed by its precursors or patterns of existing data.

I think there are many senses in which LLMs are not demonstrating creativity in a way that humans can. I'm not sure how an LLM itself could create something new and valuable if it requires predefining an existing pattern which seems to presuppose that we already have the creation in a sense.

TeMPOraL 542 days ago

My take on Kuhn's paradigm shift is that it's still incremental progress, but the shift happens at a meta level. I.e., for the scientific example, you need some accumulated amount of observations and hypotheses before the paradigm shift can happen, and while the science "before" and "after" may look hugely different, the insight causing the shift is itself still incremental. In the periods before paradigm shifts, the science didn't stay still, waiting for a lone genius to make a big conceptual leap that randomly happened to hit paydirt -- if we could do such probability-defying miracles, we'd have had special relativity figured out by the Ancient Greeks. No, the science just kept accumulating observations and insights, narrowing down the search space until someone (usually several someones around the world, at the same time) was in the right place and context to see the next step and take it.

This kind of follows from the fact that, even if the paradigm-shifting insight was caused by some miracle feat of a unique superhuman genius, it still wouldn't shift anything until everyone else in the field was able to verify the genius was right, that they found the right answer, as opposed to a billion different possible wrong answers. To do that, the entire field had to have accumulated enough empirical evidence and theoretical understanding to already be within one or two "regular smart scholar" leaps from that insight.

With art, I have less experience, but my gut instinct tells me that even there, "artistic inspiration" can't be too big a leap from what was before, as otherwise other people would not recognize or appreciate it. Also, unlike science, the definition of "art" is self-referential: art is what people recognize as art.

Still, I think you make a good point here, and you've convinced me that the potential for creativity of LLMs, in their current architecture, is limited and below that of humans. You said:

> While LLMs might be capable of some forms of creativity depending on how you define it, I think it remains to be seen how LLMs' current architecture could on its own accomplish the kinds of creativity implicit in scientific progress in the Kuhnian sense of a paradigm shift or in what some describe as a leap of artistic inspiration.

I think the limit stems strictly from LLMs being trained off-line. I believe LLMs could go as far as making the paradigm-shifting "Kuhnian leap", but they wouldn't be able to increment on it further. Compared to humans, LLMs are all "system 1" and almost none "system 2" - they rely on "intuition"[0], which heavily biases them towards things they've learned before. In the wake of a paradigm shift, a human can make themselves gradually unlearn their own intuitions. LLMs can't, without being retrained. Because of that, the forms of creativity that involve making a paradigm-shifting leap and making a few steps forward from it are not within reach of any current model.

[0] - LLMs basically output things that seem most likely given what came before; I think this is the same phenomenon as when humans think and say what "feels like best" in context. However, we can pause and override this; LLMs can't, because they're just run in a forward pass - they neither have an internal loop, nor are they trained for the ability to control an external one.

corimaith 542 days ago

> "Creativity" in the sense of coming up with something new is trivial to implement in computers, and has long been solved. Take some pattern - of words, of data, of thought. Perturb it randomly. Done. That's creativity.

Formal Proof Systems aren't even nearly close to completion, and for patterns we don't have a strong enough formal system to fully represent the problem space.

If we take the P=NP problem, it can likely be solved formally, in a way a machine could carry out, but what is the "pattern" that we are traversing here? There is definitely a deeper superstructure behind these problems, but we can only glean the tips, and I don't think LLMs with statistical techniques can glean any further either. Natural language is not sufficient.

TeMPOraL 542 days ago

Whatever the underlying "real" pattern is, doesn't really matter. We don't need to represent it. People learn to understand it implicitly, without ever seeing some formal definition spelled out - and learn it well enough that if you take M works to classify as "creative" or "not", then pick N people at random and ask each of them to classify each of the works, you can expect a high degree of agreement.
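The "M works, N people" thought experiment can be turned into a number. The sketch below computes raw pairwise agreement; the function name and the sample verdicts are made up for illustration, and no real survey data is implied. Raw agreement does not correct for chance, so a real study would more likely report a chance-corrected statistic such as Cohen's or Fleiss' kappa.

```python
from itertools import combinations

def pairwise_agreement(labels: list[list[bool]]) -> float:
    """labels[r][w] is rater r's verdict ("creative" or not) on work w.

    Returns the fraction of (rater pair, work) combinations in which
    both raters gave the same verdict.
    """
    n_works = len(labels[0])
    agree = 0
    total = 0
    for a, b in combinations(range(len(labels)), 2):
        for w in range(n_works):
            agree += labels[a][w] == labels[b][w]
            total += 1
    return agree / total

if __name__ == "__main__":
    # Hypothetical verdicts: N = 3 raters, M = 4 works.
    verdicts = [
        [True, True, False, False],
        [True, True, False, True],
        [True, False, False, False],
    ]
    print(f"{pairwise_agreement(verdicts):.2f}")
```

A value near 1.0 would support the claim that people share an implicit notion of "creative" without ever writing it down; a value near what chance alone predicts would undercut it.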

LLMs aren't learning what "creativity" is from first principles. They're learning it indirectly, by being trained to reply like a person would, literally, in the fully general meaning of that phrase. The better they get at that in general, the better they get at the (strict) subtask of "judging whether a work is creative the same way a human would" - and also "producing creative output like a human would".

Will that be enough to fully nail down what creativity is formally? Maybe, maybe not. On the one hand, LLMs don't "know" any more than we do, because whatever the pattern they learn, it's as implicit in their weights as it is for us. On the other hand, we can observe the models as they learn and infer, and poke at their weights, and do all kinds of other things that we can't do to ourselves, in order to find and understand how the "deeper superstructure behind these problems" gets translated into abstract structures within the model. This stands a chance to teach us a lot about both "these problems" and ourselves.

EDIT:

One could say there's no a priori reason why those ML models should have any structural similarity to how human brains work. But I'd say there is a reason - we're training them on inputs highly correlated with our own thoughts, and continuously optimizing them not just to mimic people, but to be bug for bug compatible with them. In the limit, the result of this pressure has to be equivalent to our own minds, even if not structurally equivalent. Of course the open question is, how far can we continue this process :).

sleepytree 542 days ago

As far as I can tell, I think you are interchanging the ability to recognize creativity with the ability to be creative. Humans seem to have the ability to make creative works or ideas that are not entirely derivative from a given data set or fit the criteria of some pre-existing pattern.

That is why I mentioned Kuhn and paradigm shifts. The architecture of LLMs does not seem capable of making lateral moves or sublations that are by definition not derivative or reducible to its prior circumstance, yet humans do make them, even though the exact way we do so is pretty mysterious and wrapped up in the difficulties in understanding consciousness.

To claim LLMs can or will equal human creativity seems to imply we can clearly define not only what creativity is, but also consciousness and also how to make a machine that can somehow do both. Humans can be creative prima facie, but to think we can also make a computer do the same thing probably means you have an inadequate definition of creativity.

TeMPOraL 542 days ago

I wrote a long response wrt. Kuhn under your earlier comment, but to summarize it here: I believe LLMs can make lateral moves, but they will find it hard to increment on them. That is, they can make a paradigm-shifting creative leap itself, but they can't then unlearn the old paradigm on the spot - their fixed training is an attractor that'll keep pulling them back.

As for:

> As far as I can tell, I think you are interchanging the ability to recognize creativity with the ability to be creative.

I kind of am, because I believe that the two are intertwined. I.e. "creativity" isn't merely an ability to make large conceptual leaps, or "lateral moves" - it's the ability to make a subset of those moves that will be recognized by others as creative, as opposed to recognized as wrong, or recognized as insane, or recognized as incomprehensible.

This might apply more to art than science, since the former is a moving target - art is ultimately about matching subjective perceptions of people, where science is about matching objective reality. A "too creative" leap in science can still be recognized as "creative" later if it's actually correct. With art, whether "too creative" will be eventually accepted or forever considered absurd, is unpredictable. Which is to say, maybe we should not treat these two types of "creativity" as the same thing in the first place.

daveguy 542 days ago

> Take some pattern - of words, of data, of thought. Perturb it randomly. Done. That's creativity.

This seems a myopic view of creativity. I think leaving out the pursuit of the implications of that perturbation is leaving out the majority of creativity. A random number generator is not creative without some way to explore the impact of the random number. This is something that LLM inference models just don't do. Feeding previous output into the context of a next "reasoning" step still depends on a static model at the core.
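The last point, that looping output back into the context leaves the model itself untouched, can be shown with a toy. The sketch below uses a bigram word table as a stand-in for an LLM, which is a large simplification; `train_bigrams` and `generate` are hypothetical names, not any library's API. What it illustrates is only that the context grows on every step while the "weights" never change.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus: str) -> dict[str, Counter]:
    """'Training': count which word follows which. Done once, offline."""
    counts: dict[str, Counter] = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return dict(counts)

def generate(model: dict[str, Counter], prompt: str, max_new: int = 4) -> str:
    """'Inference': repeatedly append the most likely next word.

    Each new word is fed back into the context, but `model` is only
    read here, never updated.
    """
    context = prompt.split()
    for _ in range(max_new):
        options = model.get(context[-1])
        if not options:
            break  # nothing ever followed this word in training
        context.append(options.most_common(1)[0][0])
    return " ".join(context)

if __name__ == "__main__":
    corpus = "a b a b a c"
    model = train_bigrams(corpus)
    print(generate(model, "a"))
    # The table is identical to a freshly trained one after generation.
    print(model == train_bigrams(corpus))
```

However long the loop runs, every step is a lookup in the same frozen table, which is the sense in which the process "depends on a static model at the core".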
