by FreakLegion 1320 days ago
Tsk to whoever downvoted this. Simple linear models are indeed the right starting point for most new projects while you come to grips with your data.

In some cases you can stop there or apply a quick nonlinearization like Fastfood to get good, snappy, and generally debuggable results for very little RAM.

In other cases you move on to decision tree ensembles or neural networks, depending on whether you already have features or need those to be learned, too. Either way this ratchets up the complexity and resource requirements.

Decision trees in particular tend to have bloated implementations. I still use XGBoost or Scikit for training, but wrote my own library that translates the trained models into a more efficient format (~95% smaller than Scikit's) and provides thread-safe inference.
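
To make the "more efficient format" idea concrete, here is a rough sketch of one common approach: flattening a nested tree into parallel arrays that are only read at inference time. This is not the library described above, and nothing here is an XGBoost or Scikit API; the nested-dict input layout, the `flatten_tree` and `predict` names, and the "`<=` goes left" convention are all assumptions made for illustration.

```python
# Hypothetical sketch: flatten a nested decision tree into parallel lists.
# The input layout and function names are illustrative assumptions only.

def flatten_tree(node):
    """Convert a nested-dict tree into four flat lists indexed by node id."""
    feature, threshold, left, right = [], [], [], []

    def visit(n):
        idx = len(feature)
        # Reserve this node's slot before visiting its children.
        feature.append(-1)
        threshold.append(0.0)
        left.append(-1)
        right.append(-1)
        if "value" in n:
            # Leaf: feature stays -1 and the prediction is stored in threshold.
            threshold[idx] = n["value"]
        else:
            feature[idx] = n["feature"]
            threshold[idx] = n["threshold"]
            left[idx] = visit(n["left"])
            right[idx] = visit(n["right"])
        return idx

    visit(node)
    return feature, threshold, left, right


def predict(flat, x):
    """Walk the flat lists from the root; the lists are read, never modified."""
    feature, threshold, left, right = flat
    i = 0
    while feature[i] != -1:
        i = left[i] if x[feature[i]] <= threshold[i] else right[i]
    return threshold[i]


tree = {
    "feature": 0, "threshold": 0.5,
    "left": {"value": 1.0},
    "right": {
        "feature": 1, "threshold": 2.0,
        "left": {"value": 2.0},
        "right": {"value": 3.0},
    },
}
flat = flatten_tree(tree)
print(predict(flat, [0.9, 1.5]))
```

Because `predict` never writes to the arrays, several threads can share one flattened model without locking, and a typed-array version of the same layout drops most of the per-node object overhead.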


Thanks for the reply! What is Fastfood though? I can't find anything on Google.

Of course. The paper is at https://arxiv.org/abs/1408.3060.

> Our method applies to any translation invariant and any dot-product kernel, such as the popular RBF kernels and polynomial kernels. We prove that the approximation is unbiased and has low variance. Experiments show that we achieve similar accuracy to full kernel expansions and Random Kitchen Sinks while being 100x faster and using 1000x less memory. These improvements, especially in terms of memory usage, make kernel methods more practical for applications that have large training sets and/or require real-time prediction.
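
For anyone who hasn't seen the Random Kitchen Sinks baseline the abstract mentions, here is a minimal stdlib-only sketch of random Fourier features for the RBF kernel exp(-gamma * ||x - y||^2). To be clear, this is the plain dense-matrix method, not Fastfood itself, and the names `make_rff`, `transform`, and `rbf` plus the parameter values are made up for the example.

```python
import math
import random


def make_rff(dim, n_features, gamma, seed=0):
    """Draw random frequencies and phases for exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)  # std of the Gaussian frequencies
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return weights, offsets


def transform(x, weights, offsets):
    """Map x to features whose dot products approximate the RBF kernel."""
    norm = math.sqrt(2.0 / len(weights))
    return [
        norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for w, b in zip(weights, offsets)
    ]


def rbf(x, y, gamma):
    """Exact RBF kernel value, for comparison."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


gamma = 0.5
weights, offsets = make_rff(dim=3, n_features=2000, gamma=gamma)
x, y = [0.1, 0.2, 0.3], [0.4, 0.1, 0.0]
zx, zy = transform(x, weights, offsets), transform(y, weights, offsets)
approx = sum(a * b for a, b in zip(zx, zy))
print(approx, rbf(x, y, gamma))
```

A linear model trained on the output of `transform` then behaves roughly like a kernel machine. The cost is the dense `weights` matrix (n_features x dim); as I understand the paper, that matrix is what Fastfood swaps for products of diagonal and Hadamard matrices.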

Sadly Fastfood didn't quite make it into Scikit[1], but did land in scikit-learn-extra[2].

1. https://github.com/scikit-learn/scikit-learn/pull/3665 (a shame, since Scikit's equivalents scale very poorly)

2. https://scikit-learn-extra.readthedocs.io/en/stable/generate...