Hacker News
by marcinzm 1069 days ago
The whole argument is that at large scales, with billions of params, it doesn’t matter, precisely because of those billions. So giving a toy example seems to miss the point.