The glasses are always present when the bass/808s are hitting, so is there something that maps the sound to the images?
What is it about the algorithms that makes the images 'dance' so quickly between the 3.5 beat and the 1? Is it because there are static risers that move so quickly through the wave spectrum?
Wait... is light skin mapped to when highs dominate and dark skin to lows?
I'm glad you like it! Actually, compared to the linked post, I don't do any manual latent-space representation selection. It's just a bit of "smart" signal processing. I've written a framework that makes it really simple to do these visualizations (not open-source yet). Here's one more example: https://www.youtube.com/watch?v=X4r4njUjE2M
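To give a rough idea of what "signal processing instead of manual selection" can mean in general, here is a minimal sketch. It is not the framework mentioned above, which isn't released; the function names, the 20-200 Hz bass band, and the two latent vectors `z_rest` and `z_hit` are all assumptions made up for illustration. It measures the energy in a frequency band for each video frame and uses that to blend between two latents, so the image moves toward `z_hit` whenever the band is loud.

```python
import numpy as np

def band_energy_envelope(audio, sr, fps, f_lo, f_hi):
    """Per-video-frame energy of `audio` within [f_lo, f_hi) Hz, scaled to [0, 1]."""
    hop = sr // fps                      # audio samples per video frame
    n_frames = len(audio) // hop
    frames = audio[: n_frames * hop].reshape(n_frames, hop)
    # Magnitude spectrum of each windowed frame
    spectrum = np.abs(np.fft.rfft(frames * np.hanning(hop), axis=1))
    freqs = np.fft.rfftfreq(hop, d=1.0 / sr)
    band = (freqs >= f_lo) & (freqs < f_hi)
    energy = spectrum[:, band].sum(axis=1)
    peak = energy.max()
    return energy / peak if peak > 0 else energy

def audio_reactive_latents(envelope, z_rest, z_hit):
    """Blend two latent vectors per frame, weighted by the envelope (hypothetical helper)."""
    w = envelope[:, None]                # shape (n_frames, 1) for broadcasting
    return (1.0 - w) * z_rest + w * z_hit
```

Each row of the result would be fed to the generator as the latent for one video frame. A feature that only shows up on bass hits would be consistent with this kind of mapping, but that is a guess about the general approach, not a description of how these videos were made.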
It could just be my brain, but it seems like there is a loose correlation between the mouths in the video and the lyrics.
In Phantom Part II they mostly have their mouths closed. In La La Land it varies but the mouths are mostly open. If you focus on the mouth you'll get little mental radar blips where the mouth could be tracking what is being said.
You could follow me on Twitter @tsmcalister. I'll post there once it's released, which depends a lot on how much time I have to work on it. Hopefully by the end of the month!
For some reason this made me yearn for a GAN that generates motorcycles riding through landscapes. Love the storytelling potential of your work, great stuff!