by pete_b_condon
2044 days ago
That's a very good question. You're right that the way we typically train Tree Ensembles creates a massive number of rules: the walkthrough Random Forest has more than 100,000 leaves per Decision Tree. Once we start grafting it, the number of rules starts to vastly outnumber the amount of training data. I have some follow-up articles planned that will cover this in more detail, but the short answer is that I feel we often jump to overly complex models up front without fully considering whether the accuracy/complexity tradeoff is worth it. Using Amalgamate I showed how I could halve the number of rules without significantly increasing validation error (+5%). I believe that if we're careful using more sophisticated techniques (i.e. Boosting and dense/fully connected/tabular neural networks) then we should be able to create reasonably accurate models that are reasonably straightforward to explain.
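To put a rough number on how rules can outnumber training data, here is a back-of-the-envelope sketch. It treats each leaf as one root-to-leaf rule. Only the 100,000 leaves per tree comes from the walkthrough; the tree count and the training row count below are made-up placeholders, and `rules_per_training_row` is just an illustrative helper name.

```python
def rules_per_training_row(leaves_per_tree: int, n_trees: int, n_train_rows: int) -> float:
    """Ratio of ensemble rules to training rows, counting one rule per leaf."""
    total_rules = leaves_per_tree * n_trees
    return total_rules / n_train_rows


# 100_000 leaves per tree is the figure quoted above.
# 100 trees and 400_000 training rows are hypothetical values for illustration only.
ratio = rules_per_training_row(leaves_per_tree=100_000, n_trees=100, n_train_rows=400_000)
print(f"{ratio:.1f} rules per training row")
```

With those placeholder values the ensemble holds many more rules than there are rows to learn them from, which is the imbalance described above.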