by siboehm 1466 days ago
I built this decision tree (LightGBM) compiler last summer: https://github.com/siboehm/lleaves

It gets you ~10x speedups for batch predictions, more if your model is big. It's not complicated; it ended up being <1K lines of Python code. I've heard a couple of stories like yours, where people had multi-node Spark clusters running LightGBM, and it always amused me because if you compiled the trees instead you could get rid of the whole cluster.
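To give a rough idea of what "compiling the trees" means, here is a toy sketch. It is not how lleaves works internally (lleaves targets LLVM, this just generates Python source), and the nested-dict tree layout and the names `interpret`, `emit` and `compile_tree` are made up for the example. The point is only that the splits become hard-coded branches instead of data that gets walked at prediction time.

```python
# Toy illustration only: the tree layout and function names are hypothetical,
# and real compilers like lleaves emit native code rather than Python source.

def interpret(node, row):
    """Walk the tree data structure at prediction time (the interpreted path)."""
    while "leaf" not in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["leaf"]


def emit(node, indent=1):
    """Turn one tree into nested if/else source lines."""
    pad = "    " * indent
    if "leaf" in node:
        return [f"{pad}return {node['leaf']!r}"]
    lines = [f"{pad}if row[{node['feature']}] <= {node['threshold']!r}:"]
    lines += emit(node["left"], indent + 1)
    lines.append(f"{pad}else:")
    lines += emit(node["right"], indent + 1)
    return lines


def compile_tree(tree):
    """Generate a predict(row) function with the splits baked in as constants."""
    source = "\n".join(["def predict(row):"] + emit(tree))
    namespace = {}
    exec(source, namespace)  # acceptable here because the source is generated by us
    return namespace["predict"]


tree = {
    "feature": 0, "threshold": 2.5,
    "left": {"leaf": -1.0},
    "right": {
        "feature": 1, "threshold": 0.5,
        "left": {"leaf": 0.25},
        "right": {"leaf": 1.0},
    },
}

predict = compile_tree(tree)
rows = [[1.0, 0.0], [3.0, 0.0], [3.0, 2.0]]
compiled = [predict(r) for r in rows]
interpreted = [interpret(tree, r) for r in rows]
```

Both paths give the same predictions; the compiled one just has no tree structure left to traverse.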

1 comments

Wow, very interesting, thanks for this. Daily batch predictions are all we do. I'm the maintainer of miceforest[1]. At a brief glance, do you think this would integrate well into the package? I'm always looking for ways to make it faster.

[1] https://github.com/AnotherSamWilson/miceforest

I had a brief look at your package, and my impression was that it only changes model training. If that's correct, then the model.txt you get from saving the model (`model.save_model("model.txt")` in the LightGBM Python API) has the same format as regular LightGBM. That would mean you can use my library for inference.
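Roughly, inference would then look like the sketch below. `predict_with_lleaves` is just a hypothetical helper name for this example, and it assumes you already have a model.txt on disk and that `data` is a 2D array of features in the same column order used for training. How miceforest exposes its underlying boosters is something I haven't checked.

```python
def predict_with_lleaves(model_path, data):
    """Compile a saved LightGBM model.txt and run batch prediction on `data`."""
    # Imported inside the function so the optional dependency is only
    # needed when the helper is actually called.
    import lleaves

    compiled = lleaves.Model(model_file=model_path)
    compiled.compile()  # one-time compilation step before the first prediction
    return compiled.predict(data)
```

Usage would be something like `predict_with_lleaves("model.txt", X)`, and the results should match what the original LightGBM model returns for the same `X`.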