Hacker News
by
supple-mints
717 days ago
Is it harder to train the wider network or the deeper network, all else equal?
1 comment
vatsadev
717 days ago
Post author here. If you look at MFU (model FLOPs utilization), the wider layers win out, and init takes much longer the more layers you add.
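For readers unfamiliar with the metric: MFU is the ratio of the training FLOP/s a run actually achieves to the hardware's theoretical peak. Below is a rough sketch of how it might be estimated, assuming the common approximation of about 6 FLOPs per parameter per token for a forward plus backward pass (this ignores attention FLOPs). The function name `mfu` and every number in the usage lines are hypothetical and not taken from the post.

```python
def mfu(n_params, tokens_per_step, step_seconds, peak_flops_per_second):
    """Estimate model FLOPs utilization as achieved FLOP/s over peak FLOP/s."""
    # Assumption: ~6 FLOPs per parameter per token (forward + backward),
    # which leaves out attention FLOPs and any recomputation.
    flops_per_step = 6.0 * n_params * tokens_per_step
    achieved_flops_per_second = flops_per_step / step_seconds
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical run: 125M parameters, 524,288 tokens per step, 2.0 s per step,
# on an accelerator with an assumed peak of 312e12 FLOP/s.
utilization = mfu(125e6, 524_288, 2.0, 312e12)
print(f"MFU: {utilization:.1%}")
```

Comparing a wide and a deep configuration at a similar parameter count would then come down to measuring `step_seconds` for each and seeing which one yields the higher ratio.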