| Here is an alternative: Use a multiplicative linear congruential generator with a period 2^n , where n is ceil(log2(m)), where m is the size of the list. Seed the generator, start generating values and drop all values that are larger than m. In the worst case you will have to drop half of the values, in the best case you will drop none. This depends on how close to a power of 2 m is. The generator is very short, one line of code, and very fast: one multiplication, one addition, and one bitshift. Space complexity is O(1). If your tweets are generated out from an integer value, then you can use this value to generate them. The tweets will be unique (assuming the tweet generator will generate unique tweets from unique values), and you will have m of them. |
There are "N choose k" combinations, which in the worst case (k = N/2) grows as ~ 2^N. The number of combinations quickly outstrips the number of possible states of the LCG, and most combinations will never be found. For example, we have a list of length 16 and wish to choose 8. We have a choice of 16 distinct seeds, so there are only 16 different combinations (out of 12870 possible) that we can get for any given LCG.