by Legend2440 1040 days ago

Because with infinite hardware I'd be able to do neural architecture search and find the optimal model architecture.

And I'd be able to train a learned optimizer to replace gradient descent as the training process.

Even without either of those, scaling laws say performance improves predictably with more compute.
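
To make the scaling-law point concrete, here is a rough sketch of what "predictable" means in practice: fit a power law of the form loss = a * compute^(-b) to a few smaller runs, then extrapolate to a bigger one. The function names and all the numbers are made up for illustration and do not come from any real model; real scaling-law fits usually also include an irreducible-loss term that this sketch leaves out.

```python
import math

def fit_power_law(compute, loss):
    """Fit loss = a * compute**(-b) by least squares in log-log space."""
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the best-fit line through (log compute, log loss).
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)

def predict_loss(a, b, compute):
    """Loss predicted by the fitted power law at a given compute budget."""
    return a * compute ** (-b)

# Synthetic "training runs" (assumption: generated from a=10, b=0.05,
# not measurements from any real system).
compute = [1e18, 1e19, 1e20, 1e21]
loss = [predict_loss(10.0, 0.05, c) for c in compute]

a, b = fit_power_law(compute, loss)
print(f"fitted a={a:.3f}, b={b:.3f}")
print(f"extrapolated loss at 1e22 FLOPs: {predict_loss(a, b, 1e22):.4f}")
```

On a log-log plot this is just a straight line, which is why you can guess how a bigger run will do before paying for it.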