|
|
|
|
|
by apwheele
605 days ago
|
|
Care to elaborate on this? So this post does not save the resulting weight, so you don't use that in any subsequent calculations. You would just treat the result as a simple random sample. So it is unclear why this critique matters. |
|
In practice, however, when your samples are small, your populations are large, and your populations' weights are not concentrated in a small number of members, you can use Algorithm A and just pretend that you have a sample of i.i.d. draws. In these circumstances, the potential for bias is generally not worth worrying about.
But when you cannot play so fast and loose, you can produce unbiased estimates from an Algorithm A sample by using an ordered estimator, such as Des Raj's ordered estimator [1].
You could alternatively use a different sampling method, one that does produce inclusion probabilities. But these tend to be implemented only in more sophisticated statistical systems (e.g., R's "survey" package [2]) and thus not useful unless you can fit your dataset into those systems.
For very large datasets, then, I end up using Algorithm A to take samples and (when it matters) Des Raj's ordered estimator to make estimates.
(I was planning a follow up post to my blog on this subject.)
[1] The original papers are hard to come by online, so I suggest starting with an introduction like https://home.iitk.ac.in/~shalab/sampling/chapter7-sampling-v..., starting at page 11.
[2] https://cran.r-project.org/web/packages/survey/survey.pdf