by GaggiX
1 day ago
Well, with a standard autoregressive model you can generate, say, 256 tokens at once if you have 256 users. With this approach you can generate 256 tokens for a single user, but you need several forward steps, so the diffusion process takes more GFLOPs. If you have enough users, batching already lets you balance memory and compute.
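To put rough numbers on that, here is a back-of-the-envelope sketch. It only counts forward passes and token positions processed, and it ignores attention cost, KV-cache traffic, and everything else that matters in practice. The `Cost` class, both helper functions, and the choice of 8 denoising steps are assumptions made up for illustration, not figures from any real model.

```python
from dataclasses import dataclass


@dataclass
class Cost:
    forward_passes: int       # weight reads from memory (one per pass)
    positions_processed: int  # rough proxy for compute (FLOPs)
    tokens_generated: int

    @property
    def positions_per_token(self) -> float:
        # Compute spent per token that actually gets emitted.
        return self.positions_processed / self.tokens_generated


def autoregressive(users: int, tokens_per_user: int) -> Cost:
    # Each pass emits one token per user (KV cache assumed),
    # so a pass processes `users` positions.
    total = users * tokens_per_user
    return Cost(tokens_per_user, total, total)


def diffusion(users: int, block_len: int, steps: int) -> Cost:
    # Each denoising step re-processes the whole block for every user.
    return Cost(steps, users * block_len * steps, users * block_len)


# 256 users, one batched autoregressive step: 256 tokens out.
ar = autoregressive(users=256, tokens_per_user=1)
# 1 user, a 256-token block, 8 denoising steps (hypothetical step count).
diff = diffusion(users=1, block_len=256, steps=8)

print(ar.forward_passes, ar.positions_per_token)
print(diff.forward_passes, diff.positions_per_token)
```

Under these assumptions both setups produce 256 tokens, but the diffusion run needs 8 passes over the weights and about 8x the compute per token, while the batched autoregressive run needs one pass. That is the trade-off in a nutshell: the extra steps only look cheap when you don't have enough users to fill a batch.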