by tyrael71
3613 days ago
'It's also worth checking out existing neural net code-bases to see what tricks they have. The fine details usually aren't in papers, and they're not all in the text-books either.' Given that you are highly qualified to answer, I am genuinely curious: why do you think that is? Reimplementing algorithms from scratch is an efficient way to learn, to understand the underlying concepts, and to attempt improvements in a research context.
|
That said, there are also whole papers, even collected volumes, on initialization and other practical details.
Textbooks aren't always up to date with the latest practical knowledge, as deep-learning practice is moving quickly. Or they simply don't want to clutter their high-level maths descriptions with code-level implementation details. Teaching stuff is all about tradeoffs. I'm sure several books do mention the scale of weights for simple feed-forward networks though, as it's not an implementation-level detail, and it's probably been well known since the 1980s.
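
To make the point about weight scale concrete, here is a minimal sketch in plain Python. It assumes one common convention, drawing weights from a zero-mean Gaussian with standard deviation 1/sqrt(fan_in); that choice, the layer sizes, and the helper names `init_weights` and `layer_output_std` are all illustrative assumptions, not something taken from a particular book or code-base.

```python
import math
import random


def init_weights(fan_in, fan_out, rng):
    """Draw a fan_out x fan_in weight matrix from N(0, 1/fan_in) (assumed convention)."""
    std = 1.0 / math.sqrt(fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def layer_output_std(weights, inputs):
    """Standard deviation of the pre-activations W @ x for a single input vector."""
    outs = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    mean = sum(outs) / len(outs)
    return math.sqrt(sum((o - mean) ** 2 for o in outs) / len(outs))


rng = random.Random(0)      # fixed seed so the run is repeatable
fan_in, fan_out = 400, 400  # hypothetical layer sizes

# Unit-variance inputs, as if they had been standardized beforehand.
x = [rng.gauss(0.0, 1.0) for _ in range(fan_in)]

scaled = init_weights(fan_in, fan_out, rng)
unscaled = [[rng.gauss(0.0, 1.0) for _ in range(fan_in)] for _ in range(fan_out)]

scaled_std = layer_output_std(scaled, x)
unscaled_std = layer_output_std(unscaled, x)

print(f"scaled weights:   output std = {scaled_std:.2f}")
print(f"unscaled weights: output std = {unscaled_std:.2f}")
```

With the scaled weights the pre-activations stay at roughly unit standard deviation, while unit-variance weights inflate them by a factor of about sqrt(fan_in). Stack a few such layers and the signal either saturates or vanishes, which is why the scale matters even for simple feed-forward nets.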