by agibsonccc 3515 days ago

I have seen this from multiple angles.

I used to teach at a data science bootcamp where many of the students got hired by big companies.

I've also been running a deep learning startup for the last few years and have hired quite a few people.

Many people on our team don't have PhDs but can still write backprop code for even complex modules like Inception, among other things. A lot of my students didn't have PhDs either.

A few of us (me included) are self-taught. I've also coauthored the largest O'Reilly book on deep learning: http://shop.oreilly.com/product/0636920035343.do

One piece of advice I would offer is to build something that differentiates you from the rest. Many of these "Medium thought pieces" you're talking about are actually very cool applications of deep learning. If you want to get hired for these kinds of roles, I would demonstrate that you understand how to build things with deep learning. The litmus test I would also look for is "I trained a net from scratch and innovated in x way". Honestly, people who can do well at software engineering as well as deep learning are rare. I'm not convinced a PhD is a hard requirement.

I get that recruiters at these larger companies tend to look for the buzzwords and often can't tell the difference, so it's definitely harder going the traditional route.

Tech hiring also tends to be a networking thing as much as it is buzzword bingo, no matter what field you're in. If you can network a bit and build something cool that demonstrates an understanding of deep learning, I don't see the problem.

2 comments

imakecomments 3515 days ago

Regarding your book, have you expanded on the math section? I saw a draft of the material somewhere and the math review seemed to be broken up into short paragraphs. These short paragraphs lacked examples and appeared to assume previous background knowledge in the subject, which seems contradictory to the book's title and aim. For example, I believe you mentioned somewhere "The Jacobian is a m x n matrix containing the 1st order partial derivatives of vectors with respect to vectors." -- Since I have a math background I can understand what you write. But for someone with little to no math background (e.g. a software practitioner) this may throw them off.
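To make that quoted definition concrete, here is the kind of small example I have in mind. It is only a sketch and is not taken from the book: the function `f`, the helper `numerical_jacobian`, and the step size `eps` are all made up for illustration. It takes a function with n = 3 inputs and m = 2 outputs and approximates each partial derivative with central differences, so the result is a 2 x 3 matrix.

```python
def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the Jacobian of f at point x using central differences.

    f maps a list of n numbers to a list of m numbers.
    The result is an m x n nested list: row i, column j holds d f_i / d x_j.
    """
    m, n = len(f(x)), len(x)
    jac = [[0.0] * n for _ in range(m)]
    for j in range(n):
        # Nudge only the j-th input up and down by eps.
        forward = list(x)
        backward = list(x)
        forward[j] += eps
        backward[j] -= eps
        f_plus, f_minus = f(forward), f(backward)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * eps)
    return jac


def f(v):
    """Hypothetical example function from R^3 to R^2."""
    x, y, z = v
    return [x * y, y + z ** 2]


# By hand, the Jacobian is [[y, x, 0], [0, 1, 2z]].
# At (x, y, z) = (1, 2, 3) that is [[2, 1, 0], [0, 1, 6]].
J = numerical_jacobian(f, [1.0, 2.0, 3.0])
for row in J:
    print([round(value, 4) for value in row])
```

Each row belongs to one output and each column to one input, which is all "m x n" means here.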

I am hesitant to recommend your book to a true practitioner due to the assumed knowledge presented within the math section. I think a better treatment of mathematics would assume the reader has little to no background but is intelligent enough to learn from the ground up the specific use cases of the mathematics for the deep learning techniques presented in the book. See: http://www.deeplearningbook.org/ for a better treatment of the math review. It seems more thorough and makes fewer assumptions about the math background of the reader.

I would love to recommend your book to a practitioner but I'm afraid the math section (the version I reviewed) would scare them off/they would get little out of it.

Jugurtha 3514 days ago

>I believe you mentioned somewhere "The Jacobian is a m x n matrix containing the 1st order partial derivatives of vectors with respect to vectors." -- Since I have a math background I can understand what you write. But for someone with little to no math background (e.g. a software practitioner) this may throw them off.

This makes sense. However, there will always be requirements to understand any given topic. It is recursive and dangerous to assume otherwise because knowledge builds on previous knowledge. Knowledge gaps for requirements should be an exception handled by the reader, not by the author because it penalizes everyone who doesn't have that gap.

I understand the effort of authors wanting their books to be self contained and inclusive, bringing everyone up to speed, but this brings up awful college memories and students having to wait for the one person who doesn't know matrix multiplication asking a question in a class that is not about linear algebra. This person was the exception and instead of learning it on his own time, he was willing to penalize everyone.

Similarly, in the context of books, this is the reason 600 pages is the norm with the same first 400 pages "bringing everyone up to speed" (100 pages for a Python introduction, 70 pages for elementary linear algebra, etc).

The overlap is just staggering and it is safe to assume that a 600-page book does not cost the same as a 200-page book. In other words, everyone is paying the price for the one guy who wants to do the sexy Machine Learning/Deep Learning/Pattern Recognition, but doesn't want to bother looking up the Jacobian on his own. We're paying for the 400 pages we'll never read.

A large percentage of books cater to the beginner/neophyte, even though being a beginner is a relatively short step for someone who has a long road ahead. There's an assumption of non-evolution/improvement, an everlasting tutorial 0. Imagine how frustrating it would be to have every item in the world designed for crawling babies, disregarding the fact that they're on their way to becoming adults.

imakecomments 3514 days ago

Chapter 1 is completely trivial to anyone with mathematics training. It makes no difference to me if the author expands upon the section or not. It doesn't hurt my demographic of readers because we wouldn't read that section anyways. You know who would read it? The 'practitioner'. Someone that hasn't seen a matrix since high school or freshman year of college.

The interviews the authors give paint the picture that this book is for the 'practitioner'. If Chapter 1 is meant as a brief review then don't advertise the book for a complete practitioner/beginner. Either make the book for the practitioner or not. If you do, then don't pretend to serve introductory math in it that the unfamiliar reader will read and understand. They fail at their purpose there. So either make that chapter useful for the practitioner or leave it out and assume the mathematicians already know it. Maybe put it in an appendix and let us get to the meat quicker. It honestly does not take much time to define what a matrix is, give an example, define matrix multiplication, give examples etc. The same applies to basic definitions and examples of derivatives. These are mindless mechanical procedures anyone can learn. It wouldn't take too much extra space to include some thoughtful examples. Maybe I should write an 'introductory group theory' textbook and start discussing geometric group theory 2 pages in if we want to get into not serving an intended audience's purpose.

I like what the authors are doing. I'm on their side, but I'm making suggestions that could serve a wider audience.

agibsonccc 3515 days ago

We have appendixes covering the basics there. I actually recommend deeplearningbook.org myself for a reference.

The book is meant to contain simple examples oriented towards engineers building applications rather than deriving backprop.

The book isn't called the definitive guide for a reason ;)

imakecomments 3515 days ago

I didn't notice the appendices in the draft. This may help, but the version I reviewed has a brief review of Linear Algebra and Statistics in Chapter 1. The language is written in a way that assumes mathematical familiarity. Putting this level of 'sophistication' in a "practitioners" book early on may turn off a certain demographic of readers. I say 'sophistication' because it's relative to the reader-- to someone with a math degree this section isn't sophisticated (in fact they'd find it completely trivial) and can be skipped. To a software engineer that hasn't taken a Linear Algebra/Statistics class in 10+ years this may appear too much for them. You risk losing the readers entirely or having them skip those sections. Again, this review is not friendly for the beginner and the 'practitioner' title is misleading here.

In my opinion it wouldn't be too difficult or take much effort to define what these mathematical objects are and show basic examples with basic computations to solidify the concepts. The notion of gradient descent and derivatives (or partial derivatives) isn't that difficult to understand and could be easily explained in a page or less.
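As a rough illustration of how little space this needs, here is a minimal sketch of gradient descent on a one-variable function. Everything in it is my own assumption rather than anything from the book: the function f(x) = (x - 3)^2, the name `gradient_descent`, the learning rate, and the step count.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the derivative to move toward a minimum."""
    x = x0
    for _ in range(steps):
        # The derivative points uphill, so subtract it to go downhill.
        x = x - learning_rate * grad(x)
    return x


def f_prime(x):
    """Derivative of the illustrative function f(x) = (x - 3)^2."""
    return 2 * (x - 3)


# f is smallest at x = 3, so starting from 0 the iterate should approach 3.
minimum = gradient_descent(f_prime, x0=0.0)
print(round(minimum, 6))
```

With a learning rate of 0.1 each step shrinks the distance to 3 by a factor of 0.8, which is why 100 steps is plenty for this toy case.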

For example when you discuss the Outer Product:

"This is known as the “tensor product” of two input vectors. We take each element of a column vector and multiply it by all of the elements in a row vector creating a new row in the resultant matrix."

It would be nice for the beginner to see an example of this and as stated it wouldn't take much space in the book to provide one. I think these sort of things would differentiate your book from others. If you made it more friendly more 'practitioners' would be willing to read/use it end-to-end.
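Something along these lines would do. This is just a sketch of the quoted description in plain Python; the function name `outer_product` and the sample vectors are mine, not the book's.

```python
def outer_product(column, row):
    """Multiply each element of `column` by every element of `row`.

    Each element of the column vector produces one row of the result,
    so a length-3 column and a length-2 row give a 3 x 2 matrix.
    """
    return [[c * r for r in row] for c in column]


result = outer_product([1, 2, 3], [4, 5])
for line in result:
    print(line)
# [4, 5]
# [8, 10]
# [12, 15]
```

Seeing the 3 x 2 result next to the two inputs makes the "new row per column element" wording much easier to follow.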

agibsonccc 3515 days ago

Thanks for the feedback! I'll share this with my editor and see what we can do.

globuous 3514 days ago

I don't know what my thoughts on the topic are worth since I haven't opened your book. But judging from the parent comment, and as someone who does have a decent math background, what might be easiest/most effective could be a simple math appendix. Many books have those, to name a few:

-https://www.amazon.com/Paul-Wilmott-Quantitative-Finance-Set...

-https://www.amazon.com/Short-Course-General-Relativity/dp/03...

imakecomments 3514 days ago

Glad I could provide some feedback. I think what you're aiming to do is exceptional. Keep up the good work.

deepnotderp 3515 days ago

Yeah, I'm gonna piggyback on this comment. Deep Learning was really introduced to the public 4 years ago. That's not a lot of time....

agibsonccc 3515 days ago

It's been around for quite a long time though. Neural nets themselves have seen multiple hype cycles now. See the history of CIFAR.

I would maybe rephrase this as "Machine learning really just became mainstream recently and now everyone wants in".

If you are talking about, say, recruiters, they will always tend to piggyback on buzzwords. They don't really learn the technology themselves. Requiring a PhD and some of these other things that are being talked about is a general "data science problem".

I can't count how many candidates I've seen get turned down for jobs because they just went through the traditional HR funnel. Your best bet, as I said earlier, is just to network.

The worst parts of getting a deep learning job are the same ones that plague every tech position out there.
