by dafrdman 2116 days ago
Thanks so much for your feedback. Definitely open to comments!

I agree 100% that any use of packages can be intimidating for newbies. I experimented at first with creating the models without numpy, and I thought it actually made the code less clear rather than more. It's obviously a tradeoff: you see where everything comes from (rather than np.mysterious_function()), but you take 5 lines of code to do what a single numpy command could accomplish. In the end I felt that it distracted from the real purpose of the code, which is to demonstrate how the model works.
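To make the tradeoff concrete, here is a rough sketch (not code from the book; the function names and toy numbers are made up for illustration) of computing a linear model's predictions with plain Python lists versus a single numpy operation:

```python
import numpy as np

def predict_plain(X, beta):
    """Linear-model predictions using only Python lists and loops."""
    y_hat = []
    for row in X:
        total = 0.0
        for x_j, b_j in zip(row, beta):
            total += x_j * b_j      # accumulate the dot product by hand
        y_hat.append(total)
    return y_hat

def predict_numpy(X, beta):
    """The same computation as one matrix-vector product."""
    return np.asarray(X) @ np.asarray(beta)

# Hypothetical toy data: two observations, two features.
X = [[1.0, 2.0], [1.0, 3.0]]
beta = [0.5, 2.0]

plain = predict_plain(X, beta)
fast = predict_numpy(X, beta)
print(plain, fast)
```

Both versions return the same numbers. The loop version shows every multiplication and addition, while the numpy version keeps the focus on the model rather than the bookkeeping.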

Do you think a compromise would be to add a section to the appendix introducing numpy? Introducing arrays, random instantiation, stuff like that? Otherwise I might consider adding a no-numpy version in the future.

Thanks so much for your feedback!

4 comments

Perhaps you could reduce the set of numpy functions used in your code to a minimal set (exp, sum, max, min, etc.) and then build fancier functions up from there. This affords you the speed and conciseness of using numpy arrays while limiting the abstractions that could obfuscate the inner workings of some of the fancier functions you might use (e.g. softmax).
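As a minimal sketch of that idea (my own illustration, assuming a 1-D input; not taken from the book), softmax can be built from nothing more than `np.max`, `np.exp`, and `np.sum`:

```python
import numpy as np

def softmax(z):
    """Softmax of a 1-D array, built only from np.max, np.exp and np.sum."""
    z = np.asarray(z, dtype=float)
    shifted = z - np.max(z)       # subtracting the max avoids overflow in exp
    exps = np.exp(shifted)
    return exps / np.sum(exps)    # normalize so the outputs sum to 1

probs = softmax([1.0, 2.0, 3.0])
print(probs, np.sum(probs))
```

Every step stays visible to the reader, and the only numpy calls are ones a short appendix could cover.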
I think there is a balance to be struck. You should totally use numpy for the arrays and basic math operations. But, say, in the first example you use `self.X.T`. What does `.T` even do? I'm not asking you to go into all the details, just to add more comments along the lines of "this transposes the array, see numpy docs <link>". That will ease people into the library if they are unfamiliar with it. You already have some good ones, like `column of ones`, but more of those kinds of things would help.
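Something like the following is what I have in mind. This is only a sketch with made-up toy data and variable names, not the book's actual code:

```python
import numpy as np

# Hypothetical toy data: 4 observations, 2 features.
X = np.array([[2.0, 5.0],
              [3.0, 1.0],
              [4.0, 7.0],
              [6.0, 2.0]])

# Prepend a column of ones so the first coefficient acts as the intercept.
ones = np.ones((X.shape[0], 1))
X_design = np.hstack([ones, X])   # shape (4, 3)

# .T transposes the array: rows become columns (see the numpy docs for ndarray.T).
X_t = X_design.T                  # shape (3, 4)

# Matrix product of the transpose with the original, a (3, 3) array.
gram = X_t @ X_design
print(X_design.shape, X_t.shape, gram.shape)
```

A one-line comment at each unfamiliar call is enough; the reader can follow the link if they want the details.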

I would also avoid using pandas if at all possible. It's just another thing people have to learn if they are unfamiliar with it.

I definitely agree. I should add more comments explaining what things like .T do. It's not that they're hard to grasp, but they might turn away newbies. Thanks for the suggestion!

Pandas is only used in the "code" sections, which use packages like scikit-learn anyway.

That's a good point. If you had to explain how everything works without the libraries, you'd probably end up writing a book on how pandas and numpy work, not how ML works.

An appendix is a great idea!

Good ideas. I think I'll try to add an appendix, minimize the number of numpy functions used, and explain any of the weird ones that are real time savers. Thanks for all your thoughts.

I'd love to include the book in our company's internal learning resources. Could you add an official license by any chance? Thank you.

I hadn't even considered licensing it. Want to email me and we can talk? My email is dafrdman@gmail.com. That said, you're welcome to use it (though my lawyer father suggests I say that this "verbal contract" is revocable and non-exclusive).