by qmatch
455 days ago
Need to read the details, but removing the norm can be big. It’s always a pain to make sure that your network is normalized properly when trying new architectures. Likely there will still be other implications of the tanh, since the norm is sometimes solving a conditioning problem, but IMO more alternatives are welcome.