by Greenpants
810 days ago
You may be interested in the "binary step" activation function, which does what you're suggesting. In general, though, complex behaviour really takes a hit when this is used as the activation function of a neuron (though I'm not sure which papers, if any, report metrics on it being used for transformer models).
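For anyone who wants to poke at it, here is a minimal sketch of a binary step in plain Python. The function names, the threshold at 0, and the choice to return 1 at exactly 0 are my own assumptions (conventions differ), not something taken from a particular library. The finite-difference check illustrates one commonly cited reason it is hard to train with: the slope is zero everywhere except at the jump, so gradient-based updates get no signal.

```python
def binary_step(x: float) -> float:
    """Binary step: 1.0 for non-negative input, 0.0 otherwise.

    Assumption: the threshold sits at 0 and the value at exactly 0 is 1.0.
    """
    return 1.0 if x >= 0.0 else 0.0


def numerical_slope(f, x: float, eps: float = 1e-3) -> float:
    """Central finite-difference estimate of the slope of f at x."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


# Sample points chosen away from the jump at 0.
inputs = [-2.0, -0.5, 0.5, 2.0]
outputs = [binary_step(x) for x in inputs]
slopes = [numerical_slope(binary_step, x) for x in inputs]

print(outputs)  # each value is either 0.0 or 1.0
print(slopes)   # flat on both sides of the threshold
```

In practice, people who do train with a hard threshold usually swap in a surrogate gradient for the backward pass (a straight-through estimator is one such approach), but that is a different technique from the plain step shown here.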