by McIceT 2430 days ago
I'm glad you like it! Actually, compared to the linked post, I don't do any manual selection of latent-space representations. It's just a bit of "smart" signal processing. I've written a framework that makes it really simple to do these visualizations (not open source yet). Here's one more example: https://www.youtube.com/watch?v=X4r4njUjE2M
4 comments

It could just be my brain, but it seems like there is a loose correlation between the mouths in the video and the lyrics.

In Phantom Part II they mostly have their mouths closed. In La La Land it varies, but the mouths are mostly open. If you focus on the mouth, you'll get little mental radar blips where the mouth could be tracking what is being said.

"Pretty girl and you let go" - https://youtu.be/52qWiLoOeIQ?t=18

"If you wanna waste time baby" - https://youtu.be/52qWiLoOeIQ?t=44

"Yeah i met her at a one oak" - https://youtu.be/52qWiLoOeIQ?t=84 (esp the one oak part)

Anyone who actually watches these will probably just think I'm low on sleep, but it's kind of interesting.

Jesus holy fuck that's SUPER IMPRESSIVE

I'd be delighted to learn how you translate musical structure to a point in latent space!

Cool stuff! Would be interested in knowing if and when this library becomes open source. Where could I follow you?
You could follow me on Twitter @tsmcalister. I'll post there once it's released. It depends a lot on how much time I have to work on it. Hopefully by the end of the month!
Looking forward to when you release this. I would love to try it with other models!