Hacker News
by xigency 1048 days ago
Question: how long did it take to train this model, and what hardware did you use?

It took a lot of failed experiments; the model kept converging to greyscale / sepia images. I think one of the ways I fixed it was by adding a greyscale encoder to the architecture and using its output embedding as additional conditioning. I can't remember whether I only added it to the U-Net input or also injected it at various stages of the U-Net down pass.
I think the final training run was only a couple of hours on a Colab V100.
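
For anyone wondering what "greyscale embedding as extra conditioning" could look like, here is a rough, framework-free sketch of the input-only variant (the reply doesn't say which variant was actually used). Everything in it is an assumption for illustration: the function names `to_greyscale`, `grey_encoder` and `condition_unet_input` are made up, images are plain nested lists shaped [channel][row][col], and the "encoder" is a fixed 1x1 projection standing in for whatever learned encoder the real model had.

```python
# Toy sketch, not the author's code: images are nested lists [channel][row][col].

LUMA = (0.299, 0.587, 0.114)  # Rec. 601 weights for RGB -> greyscale


def to_greyscale(rgb):
    """Collapse a [3][H][W] RGB image into a single [H][W] luma plane."""
    r, g, b = rgb
    return [
        [LUMA[0] * r[y][x] + LUMA[1] * g[y][x] + LUMA[2] * b[y][x]
         for x in range(len(r[0]))]
        for y in range(len(r))
    ]


def grey_encoder(grey, weights, biases):
    """Hypothetical stand-in for a learned encoder: a 1x1 projection that maps
    the greyscale plane to len(weights) embedding channels."""
    return [
        [[w * v + b for v in row] for row in grey]
        for w, b in zip(weights, biases)
    ]


def condition_unet_input(noisy_rgb, grey_embedding):
    """Concatenate along the channel axis so the U-Net input carries both the
    noisy image and the greyscale embedding."""
    return noisy_rgb + grey_embedding


# Usage: a 1x2 image (one white pixel, one black pixel), 2 embedding channels.
image = [[[1.0, 0.0]], [[1.0, 0.0]], [[1.0, 0.0]]]
embedding = grey_encoder(to_greyscale(image), weights=[1.0, 2.0], biases=[0.0, 0.5])
unet_input = condition_unet_input(image, embedding)  # 3 + 2 = 5 channels
```

The other option mentioned above, injecting the embedding at several stages of the down pass, would presumably mean resizing the embedding to each stage's resolution and concatenating or adding it there instead of only at the input.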