Hacker News
by namibj 807 days ago
Sparse Universal Transformer is older and already did routing-based early termination...