by taeric
1236 days ago
Yeah, I realize I was more than a little off in my terminology. Being able to randomly sample, though, feels like it also requires a lot of knowledge about what you are sampling from. I meant for that to be included in my question. :D
|
If you know absolutely nothing about your population, the only thing to look at is the mechanism of sampling. Is there some step in the process that would bias selection?
In the real world, you never know nothing about a population (you heard me, Frequentists), so you can check that the known attributes of the population match the sample. If they don’t, that could hint that something is wrong with the sampling.
You don’t even need to know the attributes ahead of time. Let’s say you want to spot check 100 API calls. You could find the ratio of user agents for the whole population and make sure your sample is close (detecting Sample Ratio Mismatch). Same for the distribution of response times, and so on. Just be aware that the more attributes you look at, the more likely you are to find something weird by chance alone! You need to correct for that if you’re doing the math, or keep it in mind if you’re eyeballing it.
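To make that concrete, here is a rough sketch of the user-agent check in Python. Everything in it is my own assumption for illustration: the helper names `chi2_sf` and `srm_p_value`, the fake API log, and the choice of a chi-square goodness-of-fit test with a Bonferroni correction. Other tests and corrections would work too; this is just one simple way to do it with the standard library.

```python
import math
import random
from collections import Counter


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square distribution.

    Uses the series expansion of the regularized lower incomplete gamma
    function; good enough for a sketch with moderate statistics.
    """
    if stat <= 0:
        return 1.0
    a, x = df / 2.0, stat / 2.0
    term = total = 1.0 / a
    n = 0
    while term > total * 1e-12:
        n += 1
        term *= x / (a + n)
        total += term
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def srm_p_value(population, sample):
    """P-value for 'the sample's category mix matches the population's'.

    Caveat: the chi-square approximation gets unreliable when the
    expected count for a category is small (a common rule of thumb is
    at least 5).
    """
    pop_counts = Counter(population)
    if len(pop_counts) < 2:
        raise ValueError("need at least two categories to compare")
    sample_counts = Counter(sample)
    n, total = len(sample), len(population)
    stat = 0.0
    for category, pop_count in pop_counts.items():
        expected = n * pop_count / total
        stat += (sample_counts[category] - expected) ** 2 / expected
    return chi2_sf(stat, len(pop_counts) - 1)


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical API log: each call is (user_agent, status_code).
    agents = ["curl", "browser", "mobile-app"]
    statuses = [200, 404, 500]
    calls = [
        (random.choices(agents, weights=[1, 6, 3])[0],
         random.choices(statuses, weights=[70, 20, 10])[0])
        for _ in range(10_000)
    ]
    sample = random.sample(calls, 100)  # the 100-call spot check

    # Checking several attributes: split the error budget (Bonferroni).
    attributes = {"user_agent": 0, "status_code": 1}
    alpha = 0.05 / len(attributes)
    for name, idx in attributes.items():
        p = srm_p_value([c[idx] for c in calls], [c[idx] for c in sample])
        verdict = "looks off" if p < alpha else "ok"
        print(f"{name}: p={p:.3f} ({verdict})")
```

Dividing the threshold by the number of attributes is the "correct for that" part: each extra attribute you check is another chance to see a mismatch by luck, so each individual check has to clear a stricter bar. For a continuous attribute like response time you would need to bucket the values first, or use a different test.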