by jstx1
1348 days ago
1. It's built into the task, not into the solution. How do you classify without binary outputs/probabilities? If you want to know whether a picture contains someone's face or not, you need a binary result purely because of the task itself, regardless of your approach to solving it. In multiclass classification, you extend your sigmoid to a softmax, but it still boils down to a probability distribution over the classes. In multilabel classification, you essentially perform binary classification for all classes at the same time. Like... what else could you do? In a hypothetical alternative, if you scan your face to unlock your phone, how does the underlying vision model grant or deny access to the phone without producing a binary result at some point along the way?

2. To have nonlinearities between the layers, and to have layers with varying complexity and structure. In practice it works much better than all the alternatives that we've tried. These things are explained very well even at a beginner level, and you aren't really questioning them deeply or proposing any alternatives; instead you seem to be getting into philosophy.
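To make both points concrete, here is a minimal sketch in plain Python. Everything in it is illustrative and assumed rather than taken from any real system: the function names (`unlock_phone`, `predict_class`, `predict_labels`), the 0.5 threshold, and the toy weights are all made up for the example. The first half shows how sigmoid, softmax, and per-label sigmoids each end in a discrete decision; the second half shows that two stacked linear layers with no nonlinearity collapse into a single linear map, while inserting a ReLU breaks that collapse.

```python
import math


def sigmoid(z):
    """Squash one raw score (logit) into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Point 1a, binary: one logit -> one probability -> one yes/no decision.
# The threshold is an assumed value, not a recommendation.
def unlock_phone(face_logit, threshold=0.5):
    return sigmoid(face_logit) >= threshold


# Point 1b, multiclass: softmax over all classes, then pick the most probable.
def predict_class(logits):
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


# Point 1c, multilabel: an independent binary decision per label.
def predict_labels(logits, threshold=0.5):
    return [sigmoid(z) >= threshold for z in logits]


# Point 2: a one-dimensional "layer" with toy weights.
def linear(w, b, x):
    return w * x + b


def relu(x):
    return max(0.0, x)


def two_linear_layers(x):
    # 3 * (2x - 1) + 1 simplifies to 6x - 2: still a single linear map.
    return linear(3.0, 1.0, linear(2.0, -1.0, x))


def two_layers_with_relu(x):
    # The ReLU in between stops the two layers from collapsing into one.
    return linear(3.0, 1.0, relu(linear(2.0, -1.0, x)))
```

For example, `unlock_phone(2.0)` grants access and `unlock_phone(-2.0)` denies it, while `two_linear_layers(x)` matches `6 * x - 2` for every input. With the ReLU, the output is flat for inputs below 0.5 and rises after that, which no single linear layer can reproduce.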
Who says? Who's defining the task?
I promise I'm not trying to get too philosophical here, but this is a long-standing issue in all areas of science: the tendency towards reductionism.
Keep turning the dials on the oscilloscope to try to eliminate the signal noise... but what if the noise itself is an essential part of the phenomenon you're trying to study and understand? You see where I'm going with this?
> "In practice it works much better than all the alternatives that we've tried."
I was waiting for someone to just come out and say "It's the best we've got". I'll grant that it might be true, but I don't like it and I don't accept it.