Hacker News
by kir-gadjello 1191 days ago
If you have questions about my rationale for any technique included in the list, please ask!

For example, I think Google's paper "Sparse is Enough in Scaling Transformers" was very underrated: it reported more than an order of magnitude improvement in inference efficiency, and its authors included an OpenAI researcher.