Hacker News
by mlm 2978 days ago
The rough idea is that you look at all the decisions made by the fraud model (sample 1 is fraud, sample 2 is not fraud, etc.) and the world of possible "predicates" ("feature 1 > x1", "feature 1 > x2", ..., "feature 10000 > z1", etc.) and try to find a collection of explanations (each a conjunction of these predicates) that have high precision and recall over the fraud model's predictions. For example, if "feature X > X0 and feature Y < Y0" is true for 20% of all payments the fraud model thinks are fraudulent (that's the recall), and 95% of all payments matching those conditions are predicted by the fraud model to be fraud (that's the precision), that's a good "explanation" in terms of its recall and precision.
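
To make the precision/recall bit concrete, here is a minimal sketch of how you might score one candidate explanation against the model's predictions. It only illustrates the scoring step, not the search over predicates, and it is not the actual implementation: the function name, the feature names "x" and "y", the thresholds, and the toy data are all made up for illustration.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, float]
Predicate = Callable[[Sample], bool]


def explanation_quality(
    samples: List[Sample],
    model_says_fraud: List[bool],
    predicates: List[Predicate],
) -> Tuple[float, float]:
    """Score a conjunction of predicates against the model's own predictions.

    precision = share of samples matching the explanation that the model calls fraud
    recall    = share of model-predicted fraud that the explanation matches
    """
    # A sample matches the explanation only if every predicate holds.
    matches = [all(p(s) for p in predicates) for s in samples]
    matched = sum(matches)
    predicted_fraud = sum(model_says_fraud)
    both = sum(1 for m, f in zip(matches, model_says_fraud) if m and f)
    precision = both / matched if matched else 0.0
    recall = both / predicted_fraud if predicted_fraud else 0.0
    return precision, recall


# Hypothetical toy data: six payments with two made-up features.
samples = [
    {"x": 5, "y": 1},
    {"x": 6, "y": 2},
    {"x": 7, "y": 1},
    {"x": 1, "y": 1},
    {"x": 2, "y": 9},
    {"x": 0, "y": 0},
]
# The fraud model's predictions for those payments (not ground truth).
model_says_fraud = [True, True, False, True, True, False]

# Candidate explanation: "x > 4 and y < 3" (thresholds are arbitrary).
explanation = [lambda s: s["x"] > 4, lambda s: s["y"] < 3]

precision, recall = explanation_quality(samples, model_says_fraud, explanation)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Note that both numbers are measured against what the model predicted, not against whether the payment actually turned out to be fraud, since the point is to explain the model's decisions.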

It's a little tough to talk about this in an HN comment but please feel free to shoot me an e-mail (mlm@stripe.com) if you'd like to talk more.