by dist-epoch 2 hours ago
I am confused. Every day I read on HN that AIs can just interpolate the data they have seen in training, and that they are structurally incapable of coming up with anything new, creative, and outside the training distribution.
5 comments

> I read on HN that AIs can just interpolate the data they have seen in training

No. That can be said about LLMs, but not about all forms of AI. The technique used here is not an LLM.

Sadly, we've bastardized the term AI so much that, if it ever meant anything, it's meaningless now. The currently most-upvoted thread in this post discusses the topic.

This is wrong - the training data is necessary but insufficient. Many other parts of the architectures add a lot of value; otherwise Markov chains would be all you need. There are layers upon layers with nonlinear activation functions, learned residuals, etc. They still absolutely must interpolate, but the space they interpolate through is much more complex than the training data, and they can definitely create things not in their training data. What they cannot do is wander outside the convex hull of their nonlinear parameter space. But this is a really permissive constraint on what they can do “creatively.” People generally underestimate the advantage the architectures confer under that constraint. This is why there was a step-function change in expressive power as the architectures (attention, self-attention, transformers, diffusion, others) evolved given the same training data. Generally, though, I challenge you to define “creative” in a way that is precise enough to measure and is neither self-referential nor reliant on ill-defined concepts.
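To make the "interpolating through a more complex space" point concrete, here is a toy sketch, not a model of any real architecture. It assumes a hand-written encoder/decoder pair (`encode` and `decode` are hypothetical, standing in for a learned nonlinear map) over points on the unit circle. Interpolating in the latent (angle) space yields a point that lies outside the convex hull of the raw training points, even though it is a plain midpoint in latent space.

```python
import math

# Two "training" points in raw (x, y) space, both on the unit circle.
train = [(1.0, 0.0), (0.0, 1.0)]

def encode(point):
    """Hypothetical nonlinear encoder: a point on the unit circle -> its angle."""
    x, y = point
    return math.atan2(y, x)

def decode(theta):
    """Hypothetical decoder: an angle -> a point on the unit circle."""
    return (math.cos(theta), math.sin(theta))

def interpolate_raw(a, b, t):
    """Linear interpolation carried out directly in raw (x, y) space."""
    return tuple((1 - t) * ai + t * bi for ai, bi in zip(a, b))

def interpolate_latent(a, b, t):
    """Linear interpolation carried out in latent (angle) space, then decoded."""
    za, zb = encode(a), encode(b)
    return decode((1 - t) * za + t * zb)

raw_mid = interpolate_raw(train[0], train[1], 0.5)
latent_mid = interpolate_latent(train[0], train[1], 0.5)

# The convex hull of the two raw points is the segment where x + y == 1.
print("raw midpoint:   ", raw_mid, "x + y =", sum(raw_mid))
print("latent midpoint:", latent_mid, "x + y =", sum(latent_mid))
```

The raw midpoint sits on the segment between the two training points, while the latent midpoint has x + y of about 1.414 and so falls outside it. The output is still constrained, but the constraint lives in the latent space rather than in the space of the training samples.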

The key, though, is whether they can solve problems not easily solved before with prior techniques. Further, can they identify problems not readily presented, and then identify novel solutions? The answer is emphatically yes, they can. These features don’t have to literally exist in their training data, but the highly convoluted network of associations built from all their training data does have to allow, in some complex space, for producing these answers. That is not at all the same as being stochastic parrots.

Are they creative? No, because they don’t have awareness. My personal, imprecise definition of creative requires both self and awareness as well as free will. There is no driving awareness in any AI architecture; it all derives from extrinsic impetus. Creativity is derived, IMO, from a layer of our minds that is not readily assessed or measured and is only indirectly expressed through language, art, and music. Hence it is not directly trainable, and therefore a learning model can’t learn it by reinforcement. It can learn the proxies, but the proxies are not, as we all deeply know, the same as our experienced awareness. We are not our words, our art, our music. We try hard to bridge it, but it’s impossible, and you and I know this to be true from experience. In fact we cannot even examine our own awareness, because it’s not directly observable or possible for us to reason about directly. This is core to a lot of philosophy, especially Middle and Far Eastern philosophy of the mind, the self, the five aggregates of Buddhism, etc. Psychology points at it, and modern psychology avoids it because it’s practically difficult for outcome-oriented treatments.

This is analogous to finding a new prime number by brute force using existing maths, rather than inventing new maths to get there.

The AI in this case didn't create a novel technology; it merely used the existing technology without basing the new design on a previous one. The whole "a human couldn't come up with it" claim holds only because the possible design space is so large that there's no reason a human would start where the AI did.

The thing the AI did better than humans was brute-forcing a solution faster. Still a very handy thing to have, but it isn't "creating" in the sense that it invented new materials or fabrication processes or anything novel.

Have you read the article? The creative element came from the researchers:

> In our new approach, the architecture begins essentially from nothing and is progressively assembled through successive iterations. The system explores the design space by generating myriad candidate circuit combinations and mapping the resulting performance trade-offs as it navigates this landscape. Because the process is not biased by prior human design choices, it can produce completely novel circuit topologies that look markedly different from those created by human designers.
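As a rough illustration of the loop the quote describes (start from nothing, generate candidates, map the trade-offs), here is a toy constructive search that keeps a Pareto front. This is not the researchers' method: the component palette, the `evaluate` objectives, and the `MAX_PARTS` limit are all invented for the sketch, and growing only from the current front is a heuristic that may miss some designs.

```python
# Hypothetical component palette: name -> (gain contribution, area cost).
COMPONENTS = {"A": (3.0, 2.0), "B": (1.0, 0.5), "C": (2.0, 1.5)}
MAX_PARTS = 4  # assumed cap on design size

def evaluate(design):
    """Toy objectives (assumed): gain with diminishing returns, and total area."""
    gain = sum(COMPONENTS[c][0] / (i + 1) for i, c in enumerate(design))
    area = sum(COMPONENTS[c][1] for c in design)
    return gain, area

def dominates(p, q):
    """p dominates q if it is no worse on both objectives and not identical."""
    return p[0] >= q[0] and p[1] <= q[1] and p != q

def pareto_front(scored):
    """Keep (design, score) pairs whose score no other score dominates."""
    return [(d, s) for d, s in scored
            if not any(dominates(s2, s) for _, s2 in scored)]

def explore():
    """Assemble designs one component at a time, starting from the empty design."""
    seen = {(): evaluate(())}
    frontier = [()]
    for _ in range(MAX_PARTS):
        # Candidate step: extend every frontier design by one component.
        candidates = {d + (c,) for d in frontier for c in COMPONENTS}
        for d in candidates:
            seen[d] = evaluate(d)
        # Heuristic: keep growing only from the current trade-off front.
        frontier = [d for d, _ in pareto_front(list(seen.items()))]
    return pareto_front(list(seen.items()))

front = explore()
for design, (gain, area) in sorted(front, key=lambda item: item[1][1]):
    print("".join(design) or "(empty)", f"gain={gain:.2f}", f"area={area:.2f}")
```

Nothing in the loop refers to a prior human design, which is the property the article emphasizes. Whether that counts as creativity or as fast search is exactly what this thread is arguing about.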

In my experience, if you tell them to research the web to see if their idea has been pursued before, you can get them to keep proposing new things until something is sufficiently new, even if it's a new interpolation between existing concepts, that it's effectively an original idea.