Hacker News
by CapsAdmin 1289 days ago
When you say fine-tuned, do you mean fine-tuned on an existing Stable Diffusion checkpoint? If so, which one?

It would be very interesting to see what the Stable Diffusion community that uses the automatic1111 version would do with this if it were made into an extension.

Yes, from https://huggingface.co/runwayml/stable-diffusion-v1-5. Our checkpoint works with automatic1111, and if you'd like to make an extension to decode to audio, it should be pretty straightforward: https://github.com/hmartiro/riffusion-inference/blob/main/ri...
Can you run this on any hardware already capable of running SD 1.5? I am downloading the model right now, might play with this later.

Given the speed at which AI is developing these days, I'm guessing someone will have the extension up in two hours at most.

I bet the AUTOMATIC1111 web UI music plugin drops within 48 hours.
I have made a basic version here:

https://github.com/enlyth/sd-webui-riffusion

Yes! Although to have real-time playback with our defaults, you need to be able to generate 40 steps at 512x512 in under 5 seconds.
Good to know. I was close, at just under 7s using 40 steps with Euler a as the sampler.
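
For anyone checking their own hardware against that 5-second budget, here is a rough back-of-the-envelope sketch in Python. The function names are made up for illustration and are not part of Riffusion or automatic1111. It assumes generation time scales linearly with step count and ignores fixed overhead such as model loading, VAE decoding, and the image-to-audio conversion, so treat the result as an estimate only.

```python
def required_steps_per_second(steps: int, budget_seconds: float) -> float:
    """Sampling rate needed to finish `steps` within `budget_seconds`."""
    return steps / budget_seconds


def max_steps_within_budget(measured_steps: int,
                            measured_seconds: float,
                            budget_seconds: float) -> int:
    """Estimate how many steps fit in the budget, given one timed run.

    Assumption: time per step is constant (linear scaling), with no
    fixed per-image overhead.
    """
    rate = measured_steps / measured_seconds  # steps per second on this machine
    return int(rate * budget_seconds)         # round down to whole steps


# Defaults quoted in the thread: 40 steps in under 5 seconds.
print(required_steps_per_second(40, 5.0))

# The run reported above: 40 steps in just under 7 s (7.0 used as an upper bound).
print(max_steps_within_budget(40, 7.0, 5.0))
```

Under those assumptions, the defaults call for about 8 steps per second, and a machine that needs 7 seconds for 40 steps would fit roughly 28 steps into the same 5-second window.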