by boreas
2062 days ago
Hope this gets a response; I don't see a paper. If there is a paired dataset, I guess this could be "easy". The StyleGAN "input" is essentially used to control parameters at various stages within the network, so you could adjust them one at a time, or on varying schedules, to get the sort of gradual effect. I know that randomly initialized neural networks can produce some pretty trippy image transformations as well, so maybe there is a way to bootstrap without paired data. Lastly, I doubt this is on the right track, but it would be cool if you could produce appropriately styled images with a "compression" approach, i.e., fitting the audio information into some small, visually meaningful latent space and then using that to generate images. Edit: OK, just watched the first example, back to square one for me. It's literally pulling stuff from paintings.
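A rough sketch of the "one at a time, or on varying schedules" idea, assuming a StyleGAN-like generator that accepts one style vector per layer. Nothing here calls a real model: plain Python lists stand in for latent vectors, and `layer_weights` and `blend_styles` are hypothetical helper names made up for illustration, not part of any library or of the project being discussed.

```python
def layer_weights(t, num_layers):
    """Blend weight per layer for progress t in [0, 1].

    Layers switch from style A to style B one after another:
    layer i ramps from 0 to 1 while t moves through
    [i / num_layers, (i + 1) / num_layers].
    """
    weights = []
    for i in range(num_layers):
        start = i / num_layers
        end = (i + 1) / num_layers
        w = (t - start) / (end - start)
        weights.append(min(1.0, max(0.0, w)))  # clamp to [0, 1]
    return weights


def blend_styles(w_a, w_b, t, num_layers):
    """Return one style vector per layer, linearly interpolated
    between w_a and w_b using that layer's weight at progress t."""
    return [
        [(1.0 - a) * x + a * y for x, y in zip(w_a, w_b)]
        for a in layer_weights(t, num_layers)
    ]


# Halfway through, the first two layers already use style B,
# while the last two still use style A.
styles = blend_styles([0.0, 0.0], [1.0, 2.0], t=0.5, num_layers=4)
```

In this sketch `t` could be driven by anything, for example a normalized audio loudness envelope per frame, which is one guess at how a gradual, music-following effect might be produced.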
My co-founder has a YouTube channel where he will be going into technical detail on how this all works. He will be posting a video in the coming weeks. https://www.youtube.com/channel/UCNIkB2IeJ-6AmZv7bQ1oBYg