by marmaduke 2782 days ago
It’s not so simple for our models (hierarchical Bayesian time series models, often nonlinear, which may not be typical): we spend a lot of time digging through the data itself, running forward simulations of the model, and refactoring or tweaking the model structure. PyML (as described in the link you provided) doesn’t appear to support the first two, which are prerequisites to improving the model IMO.

Usually when we are doing more of the train/fit/test cycle, there’s an argparse script for quickly trying different parameter values (it is run and tracked by the above CI setup).
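To give a rough idea of the kind of script I mean, here is a minimal sketch. The flag names (`--prior-scale`, `--n-iter`, `--seed`) and the `fit_model` stub are hypothetical placeholders for illustration, not our actual code:

```python
import argparse
import json


def build_parser():
    # Flag names are hypothetical; swap in whatever your model exposes.
    parser = argparse.ArgumentParser(description="Fit a model with the given settings.")
    parser.add_argument("--prior-scale", type=float, default=1.0,
                        help="scale of a (hypothetical) prior")
    parser.add_argument("--n-iter", type=int, default=1000,
                        help="number of fitting iterations")
    parser.add_argument("--seed", type=int, default=0,
                        help="random seed, recorded for reproducibility")
    return parser


def fit_model(prior_scale, n_iter, seed):
    # Stand-in for the real fit; it only echoes the settings it was given.
    return {"prior_scale": prior_scale, "n_iter": n_iter, "seed": seed}


def main(argv=None):
    # argv=None makes argparse read sys.argv, as a command-line script would.
    args = build_parser().parse_args(argv)
    result = fit_model(args.prior_scale, args.n_iter, args.seed)
    # One JSON line per run is easy for a CI job to capture and compare.
    print(json.dumps(result, sort_keys=True))
    return result


# Equivalent to running: python fit.py --prior-scale 0.5 --n-iter 200
example = main(["--prior-scale", "0.5", "--n-iter", "200"])
```

Printing the settings alongside the results means each tracked run records exactly which parameter values produced it.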

I wouldn’t say we’re reinventing, since a better solution isn’t very clear (though PyML et al. look interesting).

Edit: forward simulation isn't a frequent topic in posts on generic ML algorithms, so just as an example: suppose you run a model and see an oscillatory component along a temporal dimension in your residual error. You add an oscillatory component to your model and rerun it, but still see a residual with an oscillation. You can run a forward simulation of your model to see what frequency it's predicting, check that against what's seen in the data, and fix it. This is a contrived example, but when you have multiple competing priors or model components, this is an effective way to debug their behavior.
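A toy version of that frequency check might look like the following. It is only a sketch under assumptions: the "model" is a single sinusoid plus Gaussian noise, the 3 Hz and 2 Hz values are made up, the "observed" series is itself synthetic, and `forward_simulate` and `dominant_frequency` are hypothetical helper names rather than anything from a library:

```python
import numpy as np


def dominant_frequency(signal, dt):
    """Return the frequency (cycles per unit time) with the largest spectral power."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()  # remove the mean so the zero-frequency bin doesn't dominate
    power = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=dt)
    return freqs[np.argmax(power)]


def forward_simulate(freq, amplitude, noise_sd, t, rng):
    # Toy generative model: one oscillatory component plus observation noise.
    return amplitude * np.sin(2 * np.pi * freq * t) + rng.normal(0.0, noise_sd, size=t.size)


rng = np.random.default_rng(0)
dt = 0.01
t = np.arange(1000) * dt  # 10 time units, so the frequency resolution is 0.1

# Synthetic stand-in for the data: an oscillation at 3 Hz.
observed = forward_simulate(3.0, 1.0, 0.3, t, rng)
# Forward simulation from a model whose oscillatory component sits at 2 Hz.
simulated = forward_simulate(2.0, 1.0, 0.3, t, rng)

f_data = dominant_frequency(observed, dt)
f_model = dominant_frequency(simulated, dt)
print(f"data peak: {f_data:.2f}, model peak: {f_model:.2f}")
```

If the two peaks disagree, the oscillatory component (or the prior on its frequency) is the first thing to look at. With a real hierarchical model you would draw the simulations from the prior or posterior predictive instead of a fixed sinusoid.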