
Right. This paper ([0]) is actually mentioned in the DARPA BAA ([1]) as an example of a possible direction. A somewhat similar scheme is [2]. Both seem to do some kind of sensitivity analysis to show the user which parts of the input mattered most to the decision. For instance, [2] "explains" an ML system (which answers questions about pictures) by telling you which pixels were most important for the decision. It does that by essentially hiding pixels and seeing how that influences the ML system's decisions.
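
To make the "hide pixels and watch the output" idea concrete, here is a minimal sketch of generic occlusion-based sensitivity analysis. It is not the code from [0] or [2]; the names `occlusion_sensitivity` and `toy_score`, the patch size, and the fill value are all assumptions made up for illustration, and `toy_score` is a hypothetical stand-in for a real model.

```python
from typing import Callable, List

Image = List[List[float]]


def occlusion_sensitivity(
    image: Image,
    score: Callable[[Image], float],
    patch: int = 1,
    fill: float = 0.0,
) -> Image:
    """Return a heat map of how much the score drops when each patch is hidden."""
    height, width = len(image), len(image[0])
    baseline = score(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            # Copy the image and overwrite one patch with the fill value.
            occluded = [row[:] for row in image]
            for y in rows:
                for x in cols:
                    occluded[y][x] = fill
            # A large drop means the model relied heavily on this patch.
            drop = baseline - score(occluded)
            for y in rows:
                for x in cols:
                    heat[y][x] = drop
    return heat


def toy_score(image: Image) -> float:
    # Hypothetical "model": depends only on the centre and top-left pixels.
    return 0.75 * image[1][1] + 0.25 * image[0][0]


image = [[1.0] * 3 for _ in range(3)]
heat = occlusion_sensitivity(image, toy_score)
for row in heat:
    print(row)
```

The heat map is largest at the centre pixel, smaller at the top-left one, and zero elsewhere, which is the kind of "hint" such a scheme gives: it says where the model looked, not why.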

So this produces not so much an explanation as "hints" as to why the system made the decision (still pretty useful). The BAA also mentions another possible direction ([3]), which is actually capable of producing full-sentence explanations. For instance, it can explain the decisions of an image-to-wild-bird-name classifier with sentences like "This is a Laysan Albatross because this bird has a large wingspan, hooked yellow beak, and white belly".

This sounds pretty impressive, but it seems to depend on vocabulary provided by a user. As a result, in some cases the explanation may have nothing to do with how the classifier actually reached its decision. See [4] for my interpretation of these issues and how they might perhaps be solved.

[0] https://arxiv.org/pdf/1602.04938v3.pdf

[1] https://www.fbo.gov/utils/view?id=ae0b129bca1080cc7c517e8dad...

[2] https://computing.ece.vt.edu/~ygoyal/papers/vqa-interpretabi...

[3] http://arxiv.org/pdf/1603.08507.pdf

[4] https://blog.foretellix.com/2016/08/31/machine-learning-veri...