|
|
|
|
|
by dan-robertson
407 days ago
|
|
Alias tables are neat and not super well known. We used to have an interview question around sampling from a weighted distribution (typical answer: prefix sum -> binary search) and I don’t think anyone produced this. I like the explanation in that blog. The way it was explained to me was first ‘imagine drawing a bar chart and throwing a dart at it, retrying if you miss. This simulates the distribution but runs in expected linear time’. Then you can describe how to chop up the bars to fit in the rectangle you would get if all weights were equal. Proof that the greedy algorithm works is reasonably straightforward. |
|
Btw, a slightly related question:
Supposed you have a really long text file, how would you randomly sample a line? Such that all lines in the text file have the exactly same probability. Ideally, you want to do this without spending O(size of file) time preprocessing.
(I don't think this is a good interview question, but it is an interesting question.)
One way: sample random characters until you randomly hit a newline. That's the newline at the end of your line.