by pavlov 3516 days ago
To translate that kind of conceptual aesthetic logic into an algorithm, the programmer essentially needs to become the artist: make subjective creative decisions about the style to achieve, and enshrine those into code. And (as dmreedy wrote in a sibling comment) that's specifically the kind of "old-school" AI approach the current DNN-based work is trying to avoid.

I'm not as optimistic as you that the current statistics-driven approaches could ever reach the kind of deep analytic modeling that would be required for a style transfer system to be able to look at a Picasso and infer that there's a 3D->2D mapping at play... And it's a very interesting thought because (to me) it seems to demonstrate how far we are from actual AI that could make that kind of inventive conceptual leap.

1 comment

What data does an artist consider when he paints? He runs a kind of optimization procedure, very similar to what something like Deep Dream does. But rather than optimizing a response to make random noise more "dog-like" or "cat-like" or "human-like" (as Deep Dream does), the optimization is done to evoke a certain feeling within the artist himself, to create more extreme feelings than a photo-realistic rendering would.
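To make "response optimization" concrete, here is a toy sketch, not Deep Dream itself. It assumes a made-up detector (a single tanh unit with random weights over a 16-pixel "image") and does gradient ascent on the input rather than on the weights. The names `response` and `ascend` are hypothetical, chosen only for illustration.

```python
import math
import random

def response(image, weights):
    """Toy stand-in for one network unit: tanh of a weighted sum of pixels."""
    return math.tanh(sum(w * p for w, p in zip(weights, image)))

def ascend(image, weights, steps=200, lr=0.1):
    """Gradient ascent on the input: nudge each pixel so the response grows."""
    image = list(image)
    for _ in range(steps):
        r = response(image, weights)
        # d tanh(w.x) / dx_i = (1 - tanh^2(w.x)) * w_i
        scale = 1.0 - r * r
        # Step along the gradient, keeping pixel values inside [0, 1].
        image = [min(1.0, max(0.0, p + lr * scale * w))
                 for p, w in zip(image, weights)]
    return image

random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(16)]  # assumed "dog-like" detector
noise = [random.random() for _ in range(16)]          # start from random noise
dreamed = ascend(noise, weights)

print(response(noise, weights), "->", response(dreamed, weights))
```

Swapping the "dog-like" detector for something that scores a feeling is the same loop with a different `response` function; the hard part is getting that scorer, not the optimization.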

Feelings and images become correlated with each other through experience. Certain images are fundamental to human experience and built into the human brain through evolution (a mother smiling, scary monsters). Others are learned (ever been hit by a car? I bet that every time you see that exact model and color of car you'll feel an emotion).

Here's a thought experiment:

What if we fed the deep learning "painter" tons of 3D animation? Each point in time would be a full 3D scene, labelled with emotions such as "scary", "happy", or "angry".

I bet the algorithm could generate original art and learn new artistic styles by maximizing response to certain permutations of feelings.

Learning from video is researched in many papers. The way video data is structured, it allows for identification of new objects by comparing consecutive frames. It creates a "model of the physical world" that can predict the future a few time steps ahead. It is being used to identify activities and to help plan robotic movements.
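As a rough illustration of "comparing consecutive frames", here is plain frame differencing, which is far simpler than what the learned video models do. It only flags pixels that changed between two frames and boxes them. The frames are made-up data and the helper names `changed_pixels` and `bounding_box` are hypothetical.

```python
def changed_pixels(prev_frame, next_frame, threshold=0.2):
    """Return (row, col) positions whose brightness changed by more than threshold."""
    return [
        (r, c)
        for r, (row_a, row_b) in enumerate(zip(prev_frame, next_frame))
        for c, (a, b) in enumerate(zip(row_a, row_b))
        if abs(b - a) > threshold
    ]

def bounding_box(points):
    """Smallest (top, left, bottom, right) box covering the points, or None if empty."""
    if not points:
        return None
    rows = [r for r, _ in points]
    cols = [c for _, c in points]
    return (min(rows), min(cols), max(rows), max(cols))

# Two tiny grayscale frames (made-up data): a bright blob appears in the second.
frame0 = [[0.0] * 5 for _ in range(4)]
frame1 = [row[:] for row in frame0]
frame1[1][2] = frame1[1][3] = frame1[2][2] = 1.0

box = bounding_box(changed_pixels(frame0, frame1))
print(box)
```

This only detects that something moved or appeared. Predicting the next few frames, as described above, needs a model trained on lots of video.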