by bcaine 1611 days ago
It's actually not linear; it's a power law. That means we need exponentially more compute, data, and model parameters to see linear improvements in performance.
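To make that concrete, here is a rough sketch assuming loss follows loss = a * compute^(-alpha). The function names and the values of `a` and `alpha` are made up for illustration and are not fitted to any real model. Under that assumption, every halving of the loss costs the same multiplicative factor in compute, so the total compute grows geometrically with the number of halvings.

```python
import math


def loss(compute, a=1.0, alpha=0.5):
    """Hypothetical power-law loss curve: loss = a * compute^(-alpha)."""
    return a * compute ** (-alpha)


def compute_multiplier_to_halve_loss(alpha):
    """Factor by which compute must grow to cut the loss in half.

    Solving a * (k * c)^(-alpha) = 0.5 * a * c^(-alpha) gives k = 2^(1/alpha).
    """
    return 2.0 ** (1.0 / alpha)


def compute_for_halvings(n, alpha, base_compute=1.0):
    """Compute needed to halve the loss n times, starting from base_compute."""
    return base_compute * compute_multiplier_to_halve_loss(alpha) ** n


if __name__ == "__main__":
    alpha = 0.5  # illustrative exponent only, not a measured value
    for n in range(4):
        c = compute_for_halvings(n, alpha)
        print(f"halvings={n}  compute={c:g}  loss={loss(c, alpha=alpha):g}")
```

Smaller exponents make this much worse: with alpha = 0.1, a single halving of the loss takes 2^10 = 1024 times the compute.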