by mjw
5217 days ago
This kind of thing is awesome in a way. I get the sense that machine learning really feeds on people attacking problems from both ends: the elegant probabilistic side and the practical optimisation hacks inspire each other.

There are some potential downsides to a hack which isn't backed by any theory, though. I list them just to demonstrate why it might be worth trying to do some theory after spotting one of these hacks, from a practical and not just an aesthetic perspective:

- It may have an impact on the convergence properties and numerical stability of any optimisation algorithm you're using to fit the model: convergence speed, quality of the local minima attained, whether it even converges to a local minimum of your cost function at all, whether there are any guarantees that it doesn't sometimes blow up numerically in a horrible way...
- In general it may be brittle, with the circumstances under which it works well poorly understood. Will it break as your dataset grows? Will it work on slightly different kinds of datasets?
- Too many arbitrary parameters to tweak can be expensive unless you have a smart way to optimise them (smarter than grid search + cross-validation).
- Maintainability. It can be frustrating trying to re-use work when people have been less than completely honest in documenting things like "this term/factor/constant was pulled out of my arse and seems to work well on this one dataset, caveat emptor".
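To put a rough number on the point about arbitrary parameters: with an exhaustive grid search plus k-fold cross-validation, the number of model fits grows exponentially with the number of knobs. The sketch below is only back-of-the-envelope arithmetic; the function name and the figures (5 candidate values per parameter, 5 folds) are made up for illustration and don't come from any particular model.

```python
def grid_search_fits(values_per_param, n_params, k_folds):
    """Model fits needed for an exhaustive grid search with k-fold CV.

    Assumes every parameter gets the same number of candidate values
    (a simplification made for this illustration).
    """
    return (values_per_param ** n_params) * k_folds


# Hypothetical setup: 5 candidate values per parameter, 5-fold CV.
costs = {n: grid_search_fits(5, n, 5) for n in (1, 2, 4, 6)}

for n, fits in costs.items():
    print(f"{n} tweakable parameters -> {fits} model fits")
```

Under these assumptions, one extra hand-picked constant multiplies the tuning cost by 5, so six of them already means tens of thousands of fits.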