by thatguysaguy
239 days ago
Back when BERT came out, everyone was trying to get it to generate text. These attempts generally didn't work, though here's one for reference: https://arxiv.org/abs/1902.04094

This doesn't have an explicit diffusion tie-in, but Savinov et al. at DeepMind figured out that doing two steps at training time and randomizing the masking probability is enough to get it to work reasonably well.
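For anyone who wants the two ideas spelled out, here's a rough sketch of how I read them. It follows the description above, not necessarily the paper's exact procedure. The names `corrupt`, `two_step_inputs`, and `denoise` are made up for illustration, and `denoise` stands in for whatever model you'd actually train.

```python
import random

MASK = "[MASK]"  # placeholder mask symbol, not tied to any real tokenizer


def corrupt(tokens, rng):
    """Mask each token independently, with a rate drawn fresh per example."""
    rate = rng.random()  # uniform in [0, 1) instead of one fixed rate
    masked = [MASK if rng.random() < rate else t for t in tokens]
    return masked, rate


def two_step_inputs(tokens, denoise, rng):
    """Build the two model inputs for one training example.

    Step 1 sees the randomly masked sequence. Step 2 sees the model's own
    step-1 output, so the model also trains on fixing its own guesses.
    `denoise` is a hypothetical callable: list of tokens -> list of tokens.
    """
    step1_input, rate = corrupt(tokens, rng)
    step2_input = denoise(step1_input)
    return step1_input, step2_input, rate


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = "the cat sat on the mat".split()
    # Toy stand-in for a model: fill every mask with the same word.
    toy = lambda xs: ["the" if x == MASK else x for x in xs]
    print(two_step_inputs(tokens, toy, rng))
```

In real training you'd compute the loss against the original tokens for both inputs and sum them; the sketch only shows how the inputs would be built.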
https://joecooper.me/blog/crosstalk/
I’ve still got a few ideas to try though so I’m not done having fun with it.