by tnecniv
899 days ago
So this is a book written by applied mathematicians for applied mathematicians (they state in the preface it’s for scientists, but some theoretical scientists and engineers are essentially applied mathematicians). As a result, both the topics and the presentation are biased towards those types of people. For example, I’ve never seen anyone in practice worry about the existence and uniqueness conditions for their gradient-based optimization algorithm in deep learning. However, that’s the kind of result those people do care about, and academic papers are written on the topic.

The title does say that this is a book on the theoretical underpinnings of the subject, so I am not surprised that it is written this way. People also don’t necessarily read these books cover-to-cover, but drill into the few chapters that use techniques relevant to what they themselves are researching. There was a similarly verbose monograph I used to use in my research, but only about 20-30 pages had the meat I was interested in.

This kind of book is more verbose than I like, both in rigor and in content. For example, they include Gronwall’s inequality as a lemma and prove it. The version they use is a bit more general than the one I normally see, but Gronwall’s inequality is a very standard tool in analyzing ODEs, and I have rigorous control theory books that state it without proof to avoid clutter (they do provide a reference to a proof). A lot of this verbosity comes about when your standard of proof is high and you make few assumptions.
I suppose the goal would be to understand deep learning well enough to know what's going on, without getting stuck on math concepts that we probably don't know and won't use.