Hacker News
by E-Reverance 54 days ago
They factorize the distribution they are trained on, which is essentially generalization.

https://arxiv.org/abs/2602.02385