by ced 5226 days ago
OK, that's a good point.

Have you considered hierarchical modeling, like in Bayesian Data Analysis? I would have lambda and mu drawn from per-company gamma distributions, and have the parameters of these gammas drawn from global distributions (gamma distributions themselves?)
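
To make the suggestion concrete, here is a minimal sketch of that two-level structure as a generative model. It uses only the Python standard library. The function name, the hyperparameter values, and the choice of gamma hyperpriors are all assumptions for illustration, not anything taken from the actual model.

```python
import random

def sample_hierarchy(n_companies, n_customers, seed=0):
    """Draw per-customer (lambda, mu) pairs from a two-level gamma hierarchy.

    Hypothetical illustration: every hyperparameter value below is arbitrary.
    """
    rng = random.Random(seed)
    companies = []
    for _ in range(n_companies):
        # Global level: each company's gamma shape and scale are themselves
        # drawn from global gamma distributions (assumed hyperpriors).
        lam_shape = rng.gammavariate(2.0, 1.0)
        lam_scale = rng.gammavariate(2.0, 0.5)
        mu_shape = rng.gammavariate(2.0, 1.0)
        mu_scale = rng.gammavariate(2.0, 0.1)
        # Company level: each customer's purchase rate (lambda) and dropout
        # rate (mu) come from that company's own gamma distributions.
        customers = [
            (rng.gammavariate(lam_shape, lam_scale),
             rng.gammavariate(mu_shape, mu_scale))
            for _ in range(n_customers)
        ]
        companies.append(customers)
    return companies

draws = sample_hierarchy(n_companies=3, n_customers=5)
```

Fitting such a model would mean inferring the company-level and global parameters from data; the sketch only shows the direction of the dependencies.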

Also, you're using maximum likelihood. Have you done the full MCMC computations? (I don't think that it would make much of a difference - but it's nice to have empirical validation of that)

I would enjoy reading more about the HMM.

1 comment

We've considered hierarchical modeling, but concluded that there were no real gains: we have enough data from each of our clients to identify the parameters of the model. We will probably add more hierarchical modeling as we improve predictions of seasonality. It will primarily be useful for predicting the 'Christmas effect' for new clients.

The posterior mode of the Pareto/NBD obtained through full MCMC is extremely close to the MLE, and the MLE is much faster to calculate, so we use the MLE. [1]
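
The general effect (posterior mode converging to the MLE as data accumulates) can be illustrated with a much simpler stand-in than the Pareto/NBD. The sketch below uses a Poisson rate with a conjugate gamma prior, where both quantities have closed forms. The function names, the prior values, and the counts are made up for illustration.

```python
def poisson_mle(counts):
    """Maximum-likelihood estimate of a Poisson rate: the sample mean."""
    return sum(counts) / len(counts)

def poisson_posterior_mode(counts, prior_shape, prior_rate):
    """Posterior mode of a Poisson rate under a Gamma(shape, rate) prior.

    The posterior is Gamma(prior_shape + sum(counts), prior_rate + n),
    whose mode is (shape - 1) / rate when shape >= 1, and 0 otherwise.
    """
    post_shape = prior_shape + sum(counts)
    post_rate = prior_rate + len(counts)
    return max(post_shape - 1.0, 0.0) / post_rate

# Made-up purchase counts: a tiny sample and a larger one.
small = [3, 1]
large = [3, 1, 4, 1, 5, 9, 2, 6] * 50

gap_small = abs(poisson_mle(small) - poisson_posterior_mode(small, 2.0, 1.0))
gap_large = abs(poisson_mle(large) - poisson_posterior_mode(large, 2.0, 1.0))
```

With only two observations the prior pulls the mode noticeably away from the MLE; with 400 observations the gap is far smaller. This is a toy conjugate example, not the MCMC comparison referenced in [1].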

There has been some work on using HMMs to predict CLV. It turns out that in most cases the Pareto/NBD is a robust model for CLV. [2]

[1] http://dl.acm.org/citation.cfm?id=1305575

[2] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1904562

ced: We do some post-hoc analysis to find differences along whatever dimensions our clients give us. We have found that, in general, the largest effect is the month of acquisition, so this is the only factor that we include in the model right now.
Thank you for the info.

One last question: do you use gender, age, and other customer-specific predictors in your model? The distribution of lambdas for men and women could vary significantly.