DiffusionBlocks: Training Neural Networks One Block at a Time (pub.sakana.ai)
4 points by sebg 19 days ago
1 comment

I do not understand.

How is this different from building smaller transformer layers, with each layer just denoising less?