by lostmsu 1063 days ago
:( I did not have the patience to watch the entire video with full attention. I know you can use rounding errors in place of a non-linearity. Can somebody summarize what exactly he does for gradient descent here, and how well it works?