Hacker News
by senseiV, 868 days ago
Yes, the size is different, but training a diffusion model and training a language model are really different things, much like how RL models can be small but still take a long time to train.