by karpathy 3704 days ago
Sorry about that! There's more to cover than one blog post can do justice to. I encourage you to check out CS231n for a more thorough treatment, where we also discuss, for example, the tradeoffs of different activation functions like tanh() and give a gentler introduction to gradient descent. I devote a whole lecture to char rnn, and assignment #1 (the assignments are available) would demystify the backward pass, etc.

Also definitely +1 for not putting down people who write similar posts. I encourage everyone who is trying to learn to do it through blog posts because it lets you explain/organize thoughts. I also enjoy reading them quite a bit because they illustrate the kinds of conceptual problems beginners face (which are not at all obvious once you've been in the area for a few years). And it's also interesting to see many different interpretations of the same concepts, as everyone has a different background and the way they reason through things is usually quite unique. Granted, this one could have been named something more appropriate!


No need to apologize. I learned SO much from your blog, thank you. I didn't realize the course was online (https://www.youtube.com/watch?v=NfnWJUyUJYU). Also, looks like there's a subreddit for it as well: https://www.reddit.com/r/cs231n

It's really wonderful that all of this is freely available, thank you.

The lecture that covers gradient descent in the YouTube list you linked there is the first time gradient descent actually clicked for me, even though I had made it through the entire Andrew Ng Coursera ML course. Highly highly recommend it.
The video became private. Does anyone know the title of the video, or is there another copy of it somewhere else?