|
This is wrong: the training data is necessary but insufficient. Many other parts of the architectures add a lot of value; otherwise Markov chains would be all you need. There are layers upon layers with nonlinear activation functions, learned residuals, etc. They still absolutely must interpolate, but the space they interpolate through is much more complex than the training data, and they can definitely create things not in their training data. What they cannot do is wander outside the convex hull of their nonlinear parameter space. But this is a really permissive constraint on what they can do “creatively.” People generally underestimate the advantage the architectures confer under that constraint. This is why there was a step change in expressive power as the architectures (attention, self-attention, transformers, diffusion models, others) evolved given the same training data. Generally, though, I challenge you to define “creative” in a way that is precise enough to measure and isn’t self-referential or reliant on ill-defined concepts. The key, though, is whether they can solve problems not easily solved with prior techniques. Further, can they identify problems not readily presented, and then identify novel solutions? The answer is emphatically yes, they can. These features don’t have to literally exist in their training data, but the supporting, highly convoluted network of associations across all their training data does have to allow, in some complex space, for these answers to be produced. That’s not at all the same as being stochastic parrots. Are they creative? No, because they don’t have awareness. My personal, imprecise definition of creative requires both self and awareness as well as free will. There is no driving awareness in any AI architecture; it all derives from extrinsic impetus. Creativity is derived, IMO, from a layer of our minds that is not readily assessed or measured and is only indirectly expressed through language, art, and music.
Hence it is not directly trainable, and therefore a learning model can’t learn it by reinforcement. It can learn the proxies, but the proxies are not, as we all deeply know, the same as our experienced awareness. We are not our words, our art, our music. We try hard to bridge the gap, but it’s impossible, and you and I know this to be true from experience. In fact we cannot even examine our own awareness, because it’s not directly observable or something we can directly reason about. This is core to a lot of philosophy, especially Middle and Far Eastern philosophy of the mind, the self, the five aggregates of Buddhism, etc. Psychology points at it, and modern psychology avoids it because it’s practically difficult for outcome-oriented treatments. |
While I have no hope for a rigorous definition (I don't think it's possible), there are two very distinct kinds of creativity:
1. The result is sufficiently novel for the system itself, i.e. the system has never seen it before. This kind is too trivial to even talk about.
2. The result is novel for an outside observer. This kind of creativity is meaningless because it depends on at least one unknown (the outside observer).