Hacker News
by 1vuio0pswjnm7 1064 days ago
"The idea is nearly 30 years old and has been used for large language models before, such as Google's Switch Transformer."

Innovation! :)