I commented on /r/machinelearning, but the obvious disadvantage of this over the importance (mean impurity decrease) approach is that it won't handle nonlinear, non-monotonic, and multivariate effects as well. It is also still plagued by the same issues as importance, including being "diluted" across highly correlated features.
Tree structure and RF interpretation work is really cool stuff, but this is hardly state of the art or groundbreaking. A number of papers address the issues above by developing a more nuanced or augmented approach to importance, including Breiman's proposal for a local importance based on permuting each feature and recording the change in OOB accuracy for each case.
There are a few things I feel I need to clear up here.
1) I'm not sure what you mean by being unable to handle non-monotonic/multivariate effects. There is no issue with non-monotonic effects: the sum of feature contributions is always exactly how each tree predicts. Yes, interpretation would be somewhat harder, but that can be solved by looking at the feature's value and/or distribution once you know its contribution.
2) Mean impurity decrease and Breiman's feature-permutation-based method have almost no use in the setting I'm describing. Both are static measures in the sense that they apply only to the model itself; they tell you nothing about a particular prediction on a data point or about a set of predictions on a data set.
3) The issue with highly correlated features is indeed still there, but it is exactly the same problem that mean decrease impurity and Breiman's method face.
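To make point 1 concrete, here is a minimal sketch of the decomposition for a single regression tree. The tree, its node values, and the names `TREE` and `explain` are made up for illustration and are not taken from any library. The assumption is that each node stores the mean training target of the samples reaching it, and that the change in that mean at each split is credited to the feature being split on. The prediction then equals the root mean (the bias) plus the summed contributions.

```python
from collections import defaultdict

# Hypothetical hand-built regression tree (illustration only).
# "value" is assumed to be the mean training target at that node.
TREE = {
    "value": 10.0, "feature": "x0", "threshold": 1.0,
    "left": {"value": 4.0},
    "right": {
        "value": 13.0, "feature": "x0", "threshold": 2.0,
        "left": {
            "value": 20.0, "feature": "x1", "threshold": 0.5,
            "left": {"value": 18.0},
            "right": {"value": 22.0},
        },
        "right": {"value": 6.0},
    },
}

def explain(tree, x):
    """Return (prediction, bias, per-feature contributions) for one sample."""
    node = tree
    bias = node["value"]  # mean target at the root
    contributions = defaultdict(float)
    while "feature" in node:  # leaves have no "feature" key
        feature = node["feature"]
        child = node["left"] if x[feature] <= node["threshold"] else node["right"]
        # Credit the change in node mean to the feature used for this split.
        contributions[feature] += child["value"] - node["value"]
        node = child
    return node["value"], bias, dict(contributions)

# x0 has a non-monotonic effect: low, then high, then low again.
for sample in ({"x0": 0.5, "x1": 0.9}, {"x0": 1.5, "x1": 0.9}, {"x0": 3.0, "x1": 0.9}):
    prediction, bias, contributions = explain(TREE, sample)
    # The decomposition is exact for every sample.
    assert prediction == bias + sum(contributions.values())
    print(sample, prediction, contributions)
```

In this toy tree the contribution of `x0` changes sign as `x0` grows, yet the decomposition still sums to the prediction each time. For a forest, the same quantities would be averaged over the trees.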