Hacker News
by tysam_and 1044 days ago
Generally, bias terms in transformers don't work so well.

Personally I think it's because of the autoregressive, ODE-like nature of them, but who am I to say anything on that. ;PPPP