by rsfern
1219 days ago
For what it’s worth, I usually default to swish activations, which seem to be popular in my corner of graph neural nets (materials and chemistry). Performance is about the same as ReLU, and I like swish because it’s smooth: it doesn’t have ReLU’s kink at zero, where the derivative jumps from 0 to 1.
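
To make the smoothness point concrete, here's a minimal sketch in plain Python. The names `relu`, `swish`, `slope`, and the `beta` parameter are my own choices for illustration, not any particular library's API. It compares the numerical slope just left and right of zero for each activation.

```python
import math

def relu(x):
    # ReLU is continuous, but its derivative jumps from 0 to 1 at x = 0.
    return max(0.0, x)

def swish(x, beta=1.0):
    # swish(x) = x * sigmoid(beta * x); beta = 1 is also called SiLU.
    # Sketch only: not hardened against overflow for very negative x.
    return x / (1.0 + math.exp(-beta * x))

def slope(f, x, h=1e-6):
    # Central-difference estimate of f'(x).
    return (f(x + h) - f(x - h)) / (2 * h)

eps = 1e-3  # how far either side of zero to probe
relu_jump = slope(relu, eps) - slope(relu, -eps)
swish_jump = slope(swish, eps) - slope(swish, -eps)

print(f"ReLU slope change across 0:  {relu_jump:.4f}")
print(f"swish slope change across 0: {swish_jump:.4f}")
```

The ReLU slope changes by a full 1 no matter how small `eps` gets, while the swish slope change shrinks toward zero along with `eps`.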