|
|
|
|
|
by travisjungroth
1236 days ago
|
|
The most important thing for a sample is that it’s representative. The sample must have the same characteristics as the population. If it doesn’t, it just destroys the usefulness of the analysis. If you know absolutely nothing about your population, the only thing to look at is the mechanism of sampling. Is there some step in the process that would bias selection? In the real world, you never know nothing about a population (you heard me, Frequentists) and you can check that the known attributes of the population match the sample. If they don’t, that could hint at something wrong. You don’t even need to know the attributes ahead of time. Let’s say you want to spot check 100 API calls. You could find the ratio of user agents for the whole population and make sure your sample is close (detecting Sample Ratio Mismatch). Same for distribution of response times and so on. Just be aware that the more you look at the more likely you’ll find something weird! You need to correct for that if doing math or keep it in mind if eyeballing it. |
|