Hacker News
by axoltl 311 days ago
This isn't a transformer, it's a diffusion model. You can't split diffusion models across compute nodes.