|
|
|
|
|
by jimfleming
2450 days ago
|
|
More anecdata: we consistently outperform lightgbm, xgboost, random forests, linear models, etc. using neural networks even on smaller datasets. This applies whether we implemented the other algorithms ourselves or simply compared to someone else’s results with them. In my experience it really comes down to how many “tricks” you know for each algorithm and how well can you apply and combine these “tricks”. The difference is that neural networks have many more of these tricks and a broader coverage of research detailing the interactions between them. I call them “tricks” but really they’re just design decisions based on what current research indicates about certain problems. This is largely where the “art” part of neural networks comes from that many people refer. The search space is simply too big to try everything and hope for the best. Therefore, how a problem is approached and how solutions are narrowed and applied really matter. Even simple things like which optimizer you use, how you leverage learning rate schedules, how the loss function is formulated, how weights are updated, feature engineering (often neglected in neural networks), and architectural priors make a big difference on both sample efficiency and overall performance. Most people, if they’re not just fine-tuning an existing model, simply load up a neural network framework, stack some layers together and throw data at it expecting better results than other approaches. But there’s a huge spectrum from that naive approach to architecting a custom model. This is why neural networks are so powerful and why we tend to favor it (though not for every problem). It’s much easier to design a model from the ground up with neural networks than it is for e.g. xgboost because not only are the components more easily composable thanks to the available frameworks but there’s a ton more research on the specific interactions between those components. That doesn’t mean than every problem is appropriate for neural networks. I completely agree with you that no matter what the problem is you should never jump to an approach just because its popular. Neural networks are a tool and for many problems you need to be comfortable with every one of those decision points to get the best results and even if you’re comfortable it can take time and that isn’t always appropriate for every problem. My other point is that I wouldn’t draw too many conclusions about a particular algorithm being better or worse than another. I’m not saying that was the intention with your comment but I know many people in the ML industry tend to take a similar position. It really depends on current experience with the applied algorithms, not just experience with ML in general. |
|
I particularly like this:
> In my experience it really comes down to how many “tricks” you know for each algorithm and how well can you apply and combine these “tricks”. The difference is that neural networks have many more of these tricks and a broader coverage of research detailing the interactions between them.
This is pretty true - the lack of knobs to turn on something like XGBoost or LightGBM both make it pretty easy to get good results and harder to fine tune results for your specific problem. Maybe this isn't the most correct way to look at it, but I've always sort of pictured it as curve where you are plotting effort vs results, and the one for LightGBM/XGBoost starts out higher but is more flat, and the NN one is steeper.
I guess reading your post makes me wonder where the two curves cross? Do you have good intuition for that, or do you feel so comfortable with neural networks that they are sort of your default? I peeked at the company you have listed in your bio, and it looks like you have pretty deep experience with neural networks and work with other people who have been in research roles in that area too, and I wonder how that changes your curve compared to the average ML practitioner? Certainly figuring out how to pick the best layer combinations, optimizer, loss functions, etc benefits hugely from intuition gained over years of experience.