|
|
|
|
|
by pytness
757 days ago
|
|
Is it me or is the description of the algo wrong? > Round 1. Keep going through Hamlet, adding new words as you go. If you come to a word that’s already on your list, flip a coin again. If it’s tails, delete the word;
If i follow this description of "check if exists in list -> delete": if hash_set.contains(word) {
if !keep_a_word(round) {
hash_set.remove(word);
continue;
}
} else {
hash_set.insert(word.to_string());
}
The algorithm runs for ~20 iterations: Total word count: 31955 | limit: 1000
End Round: 20, word count: 737
Unique word count: 7233
Estimated unique word count: 772800512
But if I save the word first and then delete the same word: hash_set.insert(word.to_string());
if !keep_a_word(round) {
hash_set.remove(word);
continue;
}
It gets the correct answer: Total word count: 31955 | 1000
End Round: 3, word count: 905
Unique word count: 7233
Estimated unique word count: 7240
|
|
When implementing the exact method as described in quanta magazine (without looking at the arxiv paper), I always had estimates like 461746372167462146216468796214962164.
Then after reading the arxiv paper, I got the the correct estimate, with this code (very close to mudiadamz's comment solution):
Or equivalently: Now I found the quanta magazine formulation problem. By reading:> Round 1. Keep going through Hamlet, adding new words as you go. If you come to a word that’s already on your list, flip a coin again. If it’s tails, delete the word; heads, and the word stays on the list. Proceed in this fashion until you have 100 words on the whiteboard. Then randomly delete about half again, based on the outcome of 100 coin tosses. That concludes Round 1.
we want to write:
whereas it should be (correct): Just this little "else" made it wrong!