by uoaei
2243 days ago
I think there ought to be a distinction between explainable models (what does this neuron activate most strongly on?) and interpretable models (what do the model's parameters tell me about the data?). The distinction is this: explanations can only be made ex post facto, about why the model acted a certain way on specific inputs; interpretations can be made from the model's parameters themselves, e.g., "feature X is very important and feature Y is almost always ignored, and I know this because my NN is one layer deep and all the weights for feature X are large in magnitude and all the weights for feature Y are small in magnitude." Interpretation does not require feeding specific inputs or studying specific outputs, so it is a different concept, which is why I am suggesting we make the distinction explicit.
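To make the one-layer example concrete, here is a minimal sketch in Python with NumPy. Everything in it is hypothetical: the weight values, the feature names, and the helper names `interpret` and `explain` are made up for illustration, and it assumes the features are on comparable scales so that weight magnitudes can be compared at all.

```python
import numpy as np

# Hypothetical one-layer model: y = x @ W + b, with 3 input features and 2 outputs.
# All numbers are invented for illustration.
feature_names = ["X", "Y", "Z"]
W = np.array([
    [2.5, -3.0],    # feature X: large-magnitude weights
    [0.01, -0.02],  # feature Y: near-zero weights
    [0.8, 0.5],     # feature Z: moderate weights
])
b = np.array([0.1, -0.1])


def interpret(W):
    """Interpretation: score each feature from the parameters alone (no inputs)."""
    return np.abs(W).mean(axis=1)


def explain(W, x, output_index):
    """Explanation: per-feature contribution to one output for one specific input."""
    return x * W[:, output_index]


# Parameter-only view: X dominates, Y is almost ignored.
importance = interpret(W)
for name, score in zip(feature_names, importance):
    print(f"interpretation  {name}: mean |w| = {score:.3f}")

# Input-specific view: for this particular x, feature X contributes nothing
# (its value is 0) even though its weights are the largest.
x = np.array([0.0, 50.0, 1.0])
contrib = explain(W, x, output_index=0)
for name, c in zip(feature_names, contrib):
    print(f"explanation     {name}: contribution = {c:.3f}")
```

The first loop needs only `W`, which is the sense of "interpretable" above. The second needs a concrete `x` and can tell a different story from the weights, which is the ex post facto sense of "explainable".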
https://arxiv.org/pdf/1606.03490.pdf