Hacker News
by
riley_s8
589 days ago
Totally agree. It doesn't make any sense to replace linear(x) with linear(softmax(linear(x))) and then claim the result is more explainable and more scalable.
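For concreteness, here's a rough sketch of the two forms being compared. The weights, shapes, and the names `plain` and `wrapped` are made up for illustration and are not taken from the work under discussion.

```python
import math

def linear(x, W, b):
    """Affine map y = W x + b, with W given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def plain(x, W, b):
    # The baseline: a single linear layer.
    return linear(x, W, b)

def wrapped(x, W1, b1, W2, b2):
    # The proposed replacement: linear(softmax(linear(x))).
    return linear(softmax(linear(x, W1, b1)), W2, b2)

# Hypothetical toy weights, chosen only to make the shapes line up.
W1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 2.0, 3.0], [0.0, -1.0, 1.0]]
b2 = [0.0, 0.0]

x = [0.5, -1.0]
probs = softmax(linear(x, W1, b1))   # non-negative, sums to 1
y = wrapped(x, W1, b1, W2, b2)       # weighted average of W2's columns, plus b2
```

The wrapped form needs two weight matrices instead of one, and because the softmax output sums to 1, each output coordinate is stuck between the smallest and largest entry of the corresponding row of W2 (shifted by b2). So it is a different, bounded function, not a drop-in for linear(x).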