by jrdorn 1779 days ago
Hi, thanks for the questions!

1. We don't store any raw user data. We pull things like mean and standard deviation from data sources, run the statistics, and store the result.

2. We use a Bayesian statistics engine, which is much less susceptible to peeking problems and Type I errors than frequentist approaches.

3. Tests can be run either client-side or server-side. For client-side, we recommend bundling the SDK with your app (webpack, etc.). We really care about performance, so we never want to add additional HTTP requests or script tags of any kind if at all possible.
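
To make point 2 a bit more concrete, here is a minimal sketch of how a Bayesian "chance to beat control" could be computed for a simple conversion metric. It assumes a uniform Beta(1, 1) prior and Monte Carlo sampling; the function name, the prior, and the number of draws are illustrative assumptions, not a description of the actual stats engine.

```python
import random


def beta_chance_to_beat_control(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_b > rate_a).

    Assumes a Beta(1, 1) prior on each conversion rate (an assumption made
    for this sketch), so each posterior is Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws


# Example: 100/1000 conversions in control vs 130/1000 in the variation.
p = beta_chance_to_beat_control(100, 1000, 130, 1000)
```

Because the output is a posterior probability rather than a p-value, it can be read at any point during the test, although early reads on very little data are still noisy.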

Interesting, thanks.

1. How do you get around needing session-level data instead of aggregate data when working with non-parametric KPIs? GA in particular is notorious for sampling data.

2. True, but you can't get away from the fact that a split test run for only a day or two isn't going to give you trustworthy results. It's things like this, which abstract away the statistical reality for lay users, that cause poor decisions to be made under the guise of being "data driven". I think we as testers, and you as a provider of a testing system, have a duty not to lead businesses to believe that they are making statistically sound choices when they may not be.

1. GA is very limited as a data source because of sampling and the fact that they don't expose variance. So if using GA, we only support simple binomial metrics, count data (assuming Poisson distribution), and duration data (assuming exponential distribution). For SQL data sources and non-parametric data, we currently rely on the CLT and treat the sampling distribution as Normal. There's a good article that goes over the stats in more detail (Itamar, the author, wrote our stats engine) - https://towardsdatascience.com/how-to-do-bayesian-a-b-testin...

2. We have a minimum sample size threshold before we run any statistics on the data. To your point, we don't want to say something is "significant" if it's 5 conversions vs 1. This is one area we're looking to improve with better heuristics. We can't completely take the human out of the loop, but we can help give them all the info they need to make the best decision. On that front, we do show Bayesian expected loss (risk) and credible intervals in addition to just the "chance to beat control".
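
As a rough illustration of the two points above, the sketch below works only from per-variation aggregates (count, mean, standard deviation), treats the difference in means as Normal via the CLT, and reports chance to beat control, expected loss, and a 95% credible interval. The threshold value, the names, and the implicit flat prior are assumptions made for this example, not the actual engine's settings.

```python
from math import sqrt
from statistics import NormalDist
from typing import NamedTuple, Optional

MIN_USERS_PER_VARIATION = 100  # hypothetical threshold, for illustration only


class Aggregate(NamedTuple):
    """Summary statistics for one variation; no raw user rows needed."""
    n: int         # number of users
    mean: float    # sample mean of the metric
    stddev: float  # sample standard deviation (assumed > 0)


def analyze_normal(control: Aggregate, variant: Aggregate) -> Optional[dict]:
    # Refuse to report anything until both arms have enough data.
    if min(control.n, variant.n) < MIN_USERS_PER_VARIATION:
        return None
    # CLT: each sample mean is roughly Normal with variance stddev^2 / n,
    # so the difference of means is Normal(mu, sigma).
    mu = variant.mean - control.mean
    sigma = sqrt(control.stddev ** 2 / control.n + variant.stddev ** 2 / variant.n)
    diff = NormalDist(mu, sigma)
    std = NormalDist()
    z = mu / sigma
    return {
        # P(variant mean > control mean)
        "chance_to_beat_control": 1.0 - diff.cdf(0.0),
        # Expected shortfall if we ship the variant and it is actually worse:
        # E[max(control - variant, 0)] in closed form for a Normal difference.
        "expected_loss": sigma * std.pdf(z) - mu * std.cdf(-z),
        "credible_interval_95": (diff.inv_cdf(0.025), diff.inv_cdf(0.975)),
    }


result = analyze_normal(Aggregate(1000, 10.0, 5.0), Aggregate(1000, 10.5, 5.0))
too_small = analyze_normal(Aggregate(5, 1.0, 1.0), Aggregate(5, 2.0, 1.0))  # None
```

Showing the loss and the interval next to the headline probability lets a reader see both how likely the variation is to win and how much is at stake if it does not.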

Brilliant, thank you.

Can you use the system to analyse results of tests it didn't run? I.e., if I run tests using some SaaS that only supports frequentist stats, could I use your system as a Bayesian analysis backend?

Yes. As long as the variation assignment data and success metrics are in a supported data source (SQL, GA, or Mixpanel currently), it can be queried and analyzed in Growth Book.
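
For a sense of what that looks like with a SQL source, here is a small sketch using an in-memory SQLite database. The table names (`assignments`, `conversions`), columns, and sample rows are all hypothetical; the point is only that the query returns per-variation aggregates, which is all the analysis needs.

```python
import sqlite3


def demo_connection() -> sqlite3.Connection:
    """Build a tiny in-memory database with made-up experiment data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE assignments (user_id TEXT PRIMARY KEY, variation TEXT);
        CREATE TABLE conversions (user_id TEXT, converted_at TEXT);
        INSERT INTO assignments VALUES
            ('u1', 'control'), ('u2', 'control'), ('u3', 'control'),
            ('u4', 'variant'), ('u5', 'variant'), ('u6', 'variant');
        INSERT INTO conversions VALUES
            ('u1', '2021-01-02'), ('u4', '2021-01-02'),
            ('u5', '2021-01-03'), ('u5', '2021-01-04');
    """)
    return conn


def aggregate_by_variation(conn: sqlite3.Connection) -> list:
    """Return (variation, users, converted_users) rows, one per variation."""
    query = """
        SELECT a.variation,
               COUNT(*)         AS users,
               COUNT(c.user_id) AS converted_users
        FROM assignments a
        LEFT JOIN (SELECT DISTINCT user_id FROM conversions) c
               ON c.user_id = a.user_id
        GROUP BY a.variation
        ORDER BY a.variation
    """
    return conn.execute(query).fetchall()


rows = aggregate_by_variation(demo_connection())
```

The `DISTINCT` subquery keeps a user who converts twice from being counted twice, so each row is a clean users/conversions pair regardless of which tool assigned the variations.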