Hacker News
by g413n 119 days ago
yeah we actually had some wacky ideas with CTC + a reverse-causal mask, but diffusion does just make it all a bit simpler