Hacker News
by
addandsubtract
262 days ago
Great work! Can this technique also be used to run image diffusion models on lower VRAM GPUs?
2 comments
GTP
261 days ago
I'm not an expert in machine learning, but AFAIK diffusion models use a completely different architecture, so you can't use the same code to run optimized versions of both. The core ideas might still be adaptable to diffusion, though.
anuarsh
261 days ago
Thanks! I don't have much experience with diffusion models, but technically any multi-layer model could benefit from loading its weights one layer at a time.
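To make the "loading weights one by one" idea concrete, here is a minimal toy sketch in plain Python. It is not the project's actual code: the helper names (`save_layers`, `apply_layer`, `forward_streaming`, `demo`) and the pickle-per-layer file layout are assumptions made up for illustration. Each layer's weights live in their own file, and the forward pass reads one file, applies it, and drops it before touching the next, so only one layer's weights are resident at a time.

```python
import os
import pickle
import tempfile


def save_layers(layers, directory):
    """Write each layer's weights to its own file; return the paths in order."""
    paths = []
    for i, weights in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.pkl")
        with open(path, "wb") as f:
            pickle.dump(weights, f)
        paths.append(path)
    return paths


def apply_layer(weights, x):
    """Toy layer (hypothetical): matrix-vector product followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def forward_streaming(paths, x):
    """Run the forward pass holding only one layer's weights at a time."""
    for path in paths:
        with open(path, "rb") as f:
            weights = pickle.load(f)  # load just this layer
        x = apply_layer(weights, x)
        del weights  # drop the reference before loading the next layer
    return x


def demo():
    """Compare the streamed result with an ordinary all-in-memory pass."""
    layers = [
        [[1.0, 0.0], [0.0, 1.0]],
        [[2.0, -1.0], [0.5, 0.5]],
        [[1.0, 1.0]],
    ]
    x = [1.0, 2.0]

    expected = x
    for weights in layers:
        expected = apply_layer(weights, expected)

    with tempfile.TemporaryDirectory() as directory:
        paths = save_layers(layers, directory)
        streamed = forward_streaming(paths, x)
    return streamed, expected
```

Calling `demo()` returns the streamed and in-memory outputs, which should match. The trade-off is extra disk reads per forward pass in exchange for a lower peak memory footprint. Whether this carries over cleanly to a diffusion model would depend on its architecture, for example on how many intermediate activations have to stay alive between layers.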