by tdj
2718 days ago
Their hypercube covering formalism can be seen as decision tree induction with a specific partitioning rule, where branching terminates only at uniformly labeled leaves. But they are using the tree nodes as a kind of embedding to apply a softmax on. I like the connection between ReLUs and the geometrical representation; it makes it easier to think about in spatial terms. Reading this gave me several moments of déjà vu from my grad school classes on classical ML. I like the direction, but it feels like it could be better if it admitted that it's a variant of decision tree embedding and built on some of the massive amount of research in that area, at least in terms of understanding. I suspect a random forest version of this would actually help. Perhaps we will see this as a legit pre-training step.
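To make the "tree as embedding" reading concrete, here is a rough sketch of what I mean. It is not the paper's method: the split rule (a median-ish threshold on a feature chosen by cycling through them) is a stand-in I made up in place of their hypercube partitioning, and the names `build_tree`, `leaf_id` and `embed` are hypothetical. The only part taken from the discussion is that branching stops only at uniformly labeled leaves and that the leaf a point lands in is used as its embedding.

```python
def build_tree(X, y, leaves, depth=0):
    """Grow a tree until every leaf holds a single label."""
    if len(set(y)) == 1:
        leaves.append(y[0])               # leaf index doubles as the embedding index
        return {"leaf": len(leaves) - 1}
    d = len(X[0])
    for k in range(d):
        f = (depth + k) % d               # assumed rule: cycle through features
        values = sorted(set(row[f] for row in X))
        if len(values) > 1:
            # Threshold is an observed value below the max, so both sides are non-empty.
            t = values[(len(values) - 1) // 2]
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            return {
                "feature": f,
                "threshold": t,
                "left": build_tree([X[i] for i in left], [y[i] for i in left],
                                   leaves, depth + 1),
                "right": build_tree([X[i] for i in right], [y[i] for i in right],
                                    leaves, depth + 1),
            }
    raise ValueError("identical points with different labels; no pure tree exists")


def leaf_id(tree, x):
    """Route a point down the tree and return the index of the leaf it reaches."""
    while "leaf" not in tree:
        side = "left" if x[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[side]
    return tree["leaf"]


def embed(tree, n_leaves, x):
    """One-hot encoding of the leaf a point falls into."""
    vec = [0.0] * n_leaves
    vec[leaf_id(tree, x)] = 1.0
    return vec


# Toy XOR data: not linearly separable, but trivially covered by pure leaves.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
leaves = []
tree = build_tree(X, y, leaves)
embeddings = [embed(tree, len(leaves), x) for x in X]
```

A softmax classifier would then be fit on `embeddings` instead of the raw inputs. A random forest version would concatenate the one-hot vectors from several such trees grown on resampled data.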
|
Also called Perfect Decision Tree.