by richrichardsson 1382 days ago
The lstein fork [1] of the CompVis main repo works on "Apple Silicon" based machines (and may work on Intel-based ones too). It's not very fast though: ~3.5 minutes for 50 steps on my 16GB M1 Mini, whereas I understand that a 3080 can spit them out in the 30 second range. I would suppose M<x> machines with higher GPU core counts are faster.

[1] https://github.com/lstein/stable-diffusion


Intel Macs with AMD GPUs are supported by the MPS backend.
Hey you may not get this reply till much later but I'd love more info.

From my research over the last couple of days, it seems that PyTorch will only work with AMD cards in combination with ROCm, and ROCm specifically doesn't support the older AMD GPUs that you find in Mac laptops from as recently as 2020.

Can you expand on what MPS is?

MPS is Metal Performance Shaders, Apple's macOS libraries for ML workloads. The MPS libraries are only available on macOS, but they support both Apple Silicon and AMD GPUs. This means that on macOS, you give PyTorch 'mps' as the device instead of 'cuda' or 'cpu', and macOS runs operations on whatever GPU is available, be it an M1, an M2, or an AMD GPU.
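To make the device selection concrete, here is a minimal sketch, not taken from the lstein fork. The helper names pick_device and detect_device are hypothetical; the only PyTorch calls used are torch.cuda.is_available() and torch.backends.mps.is_available(), and the latter is assumed to require PyTorch 1.12 or newer.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a PyTorch device string, preferring a GPU backend over the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Ask PyTorch which backends are usable; fall back to 'cpu' if it is not installed."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Assumption: torch.backends.mps is missing on PyTorch builds older than 1.12,
    # so look it up defensively instead of accessing it directly.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

The returned string can then be passed to torch.device(...) and on to model.to(device), so the same script runs unchanged on an Nvidia box, a Mac with a supported GPU, or a CPU-only machine.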

https://developer.apple.com/documentation/metalperformancesh...

Presumably they need > 10GB of VRAM though? I'm guessing my 2019 MBP with only 4GB is going to say no.
I don't have an AMD Mac to test on but on the Nvidia side of things there's support now for 4GiB cards with the right configuration, so it might be possible.