by Dyac
1779 days ago
Interesting, thanks. 1. How do you get around needing session-level data instead of aggregate data when working with nonparametric KPIs? GA in particular is notorious for sampling data. 2. True, but you can't get away from the fact that a split test run for only a day or two isn't going to give you trustworthy results. It's things like this, which abstract away the statistical reality for lay users, that cause poor decisions to be made under the guise of being "data driven". I think we as testers, and you as a provider of a testing system, have a duty not to lead businesses to believe that they are making statistically sound choices when they may not be.
2. We have a minimum sample size threshold before we run any statistics on the data. To your point, we don't want to say something is "significant" if it's 5 conversions vs 1. This is one area we're looking to improve with better heuristics. We can't completely take the human out of the loop, but we can help give them all the info they need to make the best decision. On that front, we do show Bayesian expected loss (risk) and credible intervals in addition to just the "chance to beat control".
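For anyone who wants to see what those three numbers look like in practice, here is a rough sketch. It is not our production code. It assumes a binary conversion metric, uniform Beta(1, 1) priors, and a made-up threshold of 25 conversions per arm; `summarize` and `MIN_CONVERSIONS` are hypothetical names chosen for illustration.

```python
import random

MIN_CONVERSIONS = 25  # hypothetical threshold, chosen only for this sketch


def summarize(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo summary of variant B vs control A under Beta(1, 1) priors."""
    if min(conv_a, conv_b) < MIN_CONVERSIONS:
        return None  # too little data: report nothing rather than a shaky "winner"

    rng = random.Random(seed)
    # Posterior for each conversion rate is Beta(1 + successes, 1 + failures).
    a = [rng.betavariate(1 + conv_a, 1 + n_a - conv_a) for _ in range(draws)]
    b = [rng.betavariate(1 + conv_b, 1 + n_b - conv_b) for _ in range(draws)]
    diffs = sorted(y - x for x, y in zip(a, b))  # sampled values of rate_B - rate_A

    return {
        # Share of posterior draws in which B converts better than A.
        "chance_to_beat_control": sum(d > 0 for d in diffs) / draws,
        # Average conversion rate given up if we ship B and A was actually better.
        "expected_loss_if_ship_b": sum(max(-d, 0.0) for d in diffs) / draws,
        # Central 95% credible interval for the difference in rates.
        "credible_interval_95": (diffs[int(0.025 * draws)], diffs[int(0.975 * draws) - 1]),
    }
```

With this sketch, `summarize(1, 100, 5, 100)` returns `None` because the 5 vs 1 case falls below the threshold, while `summarize(100, 1000, 130, 1000)` returns all three figures. A high chance to beat control paired with a wide credible interval is a cue to keep the test running.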