by rajansaini 1847 days ago
Those are very interesting empirical results. This lecture explains the deep vs. shallow tradeoff theoretically: https://www.youtube.com/watch?v=qpuLxXrHQB4 . The lecturer is amazing; I just wish I didn't need subtitles!

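(If you're too lazy to watch: it turns out there exist functions that a shallow network cannot approximate efficiently. A shallow network needs exponentially many units to get close, while a deeper network represents the same function with only a few units per layer.)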
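
A minimal sketch of the kind of function involved. It is not taken from the lecture; it assumes the standard tent-map construction on [0, 1], and the names `tent`, `deep_sawtooth`, and `count_linear_pieces` are made up for illustration. Stacking a two-unit ReLU layer `depth` times gives a sawtooth with 2^depth linear pieces, while a single hidden layer with m ReLU units on a 1-D input has at most m + 1 pieces, so matching it exactly with one layer takes at least 2^depth - 1 units.

```python
def relu(x):
    return max(0.0, x)

def tent(x):
    # One layer, two ReLU units: rises 0 -> 1 on [0, 0.5], falls 1 -> 0 on [0.5, 1].
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)

def deep_sawtooth(x, depth):
    # Compose the same two-unit layer `depth` times (2 * depth ReLU units total).
    for _ in range(depth):
        x = tent(x)
    return x

def count_linear_pieces(f, n=4096):
    # Sample f on a dyadic grid over [0, 1] and count slope changes.
    # The grid contains every breakpoint of the sawtooth for depth <= 12,
    # and all values are exact in floating point, so the count is exact here.
    xs = [i / n for i in range(n + 1)]
    ys = [f(x) for x in xs]
    slopes = [(ys[i + 1] - ys[i]) * n for i in range(n)]
    changes = sum(1 for a, b in zip(slopes, slopes[1:]) if a != b)
    return changes + 1

if __name__ == "__main__":
    print("depth  relu_units  linear_pieces")
    for depth in range(1, 6):
        pieces = count_linear_pieces(lambda x: deep_sawtooth(x, depth))
        print(depth, 2 * depth, pieces)
```

The unit count grows linearly with depth while the number of pieces doubles with each layer, which is the gap a one-layer network can only close by adding units.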