

	by sgt101 2044 days ago
	An alternative is to relabel the data with the ensemble's outputs and then learn a decision tree over that.
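A minimal sketch of that idea, assuming scikit-learn, with a random forest standing in for "the ensemble". The synthetic dataset, the tree depth, and all variable names are illustrative assumptions, not something from the thread:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data, used here only so the sketch is self-contained (assumption).
X, y = make_classification(
    n_samples=500, n_features=4, n_informative=3, n_redundant=1, random_state=0
)

# The ensemble whose behaviour we want to describe.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Relabel the data with the ensemble's outputs instead of the true labels.
y_relabelled = forest.predict(X)

# Learn a single, shallow decision tree over the relabelled data.
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X, y_relabelled)

# Fidelity: how often the tree agrees with the ensemble on the same points.
fidelity = float(np.mean(surrogate.predict(X) == y_relabelled))

# Human-readable rules extracted from the surrogate tree.
rules = export_text(surrogate, feature_names=[f"x{i}" for i in range(X.shape[1])])
print(f"fidelity to the ensemble on the relabelled data: {fidelity:.3f}")
print(rules)
```

The depth limit is the knob here: a deeper tree tracks the ensemble more closely but is harder to read as a set of rules.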


pete_b_condon 2044 days ago

That works up to a point, but it doesn't necessarily find all the rules of the model. In the post I walked through a model with three training records (yellow, blue, red) which created six prediction boundaries. Half of the rules weren't covered by the training data, which makes them hard to find without an efficient algorithm that searches out all possible rules. The risk of undiscovered rules is that they may cause unexpected behaviour that leads to bad predictions, and if you haven't described the whole model it is impossible to know how many of these potentially bad predictions exist.
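One rough way to probe that concern, sketched here under assumptions: fit a surrogate tree to a random forest's predictions, then compare how often the two agree on the training records versus on points sampled uniformly from the bounding box of the features, where there may be no training data. This is not the rule-search algorithm from the post; the dataset, the probe sampling, and the names are hypothetical, and scikit-learn is assumed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic data so the sketch runs on its own (assumption).
X, y = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=1, random_state=0
)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Surrogate tree trained on the forest's predictions for the training records.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, forest.predict(X))


def fidelity(points):
    """Fraction of points where the surrogate agrees with the forest."""
    return float(np.mean(surrogate.predict(points) == forest.predict(points)))


# Probe points drawn uniformly from the per-feature range of the data,
# which includes regions the training records may never visit.
X_probe = rng.uniform(X.min(axis=0), X.max(axis=0), size=(5000, X.shape[1]))

fidelity_train = fidelity(X)
fidelity_probe = fidelity(X_probe)
print(f"agreement on training records: {fidelity_train:.3f}")
print(f"agreement on uniform probes:   {fidelity_probe:.3f}")
```

A gap between the two numbers would suggest the ensemble has decision regions the surrogate never saw. A small gap does not prove the whole model has been described, since random probes can still miss narrow regions.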

jononor 2044 days ago

Do you have any references/explainers for that approach? Would be interested to read!

sgt101 2043 days ago

The best I have is this one I wrote a long time ago:

https://www.aaai.org/Papers/Workshops/1999/WS-99-06/WS99-06-...

But I apologize! It's a bit pimped up compared to the one-liner above. I think step 7 in section 4.3 is what I was thinking of :) I did laugh when I dug it out, as I have been working on the first bullet in the conclusion this week!

FlyingLawnmower 2044 days ago

Check out this work from Rich Caruana & collaborators on model compression: http://www.cs.cornell.edu/~caruana/compression.kdd06.pdf

which was a precursor to the model distillation work from Geoff Hinton: https://arxiv.org/abs/1503.02531

underaxon 2039 days ago

I've made some experiments that you can check out here:

http://www.clungu.com/Distilling_a_Random_Forest_with_a_sing...
