Hacker News
by fblgit 51 days ago
A one-of-a-kind model built from a single transformer block layer, with high throughput. A new generation of lightweight transformer-based models for common NLP tasks?