Hacker News new | ask | show | jobs
by ankit_it09 1143 days ago
Thanks for your inputs I have fixed the article.
1 comments

No, you have not fixed the article.

You've made it a different sort of wrong.

"Another probing sequence used in Python dictionaries is quadratic probing" is contradicted by the very source code you link to.

Compare your "but it can also cause some slots to be skipped or repeated" with dictobject.c's "repeating that 2^i times generates each int in range(2^i) exactly once (see any text on random-number generation for proof)."

How are you coming up with all of these wrong things? Start with why you thought Python used MurmurHash3.

I wrote it as a general article on python dictionaries.

>>Regarding Compare your "but it can also cause some slots to be skipped or repeated" - I think its correct - Let me try to explain this means that some slots may never be probed or may be probed more than once, which wastes time and space. For example, if the array size is 8 and the original index is 3, then quadratic probing will try the following indexes: 3, 4, 7, 4 (repeated), 3 (repeated), 4 (repeated), and so on. As you can see, indexes 0, 1, 2, 5, and 6 are never probed.

And yes you are correct - Python does not use the MurmurHash3 algorithm for hashing objects in dictionaries. My bad I mixed up two different things and though it uses this mmh3 for hashing. I have updated it now and made it more like my understanding of python dictionaries.

Thanks, for your review and pointers to fix it. Will keep it in my mind to get it reviewed before publishing any article.

> I wrote it as a general article on python dictionaries.

Yet as written it shows no understanding of how Python dictionaries work, and instead presents falsehoods.

"The hash function takes in the key as input and returns a unique integer value"

This is false. They are not unique.

  >>> hash(9) == hash(2**64+1)
  True
"A third probing sequence used in Python dictionaries is double hashing"

This ie another falsehood. Python does not use double hashing. Python objects only have a single hash.

"For small integers, Python dictionaries use linear probing with a small array size (8 or 16). For larger integers or strings, Python dictionaries use quadratic probing with a larger array size (32 or more). For other types of keys, Python dictionaries use double hashing with a larger array size (32 or more)."

This is another falsehood. This specialization does not exist.

If it does exist, show me in the code where it does this.

> more like my understanding of python dictionaries

Your article shows no understanding of Python dictionaries. It is a patchwork of different approaches to making a hash table, almost of which are not applicable to Python. Nor does it mention the places where Python is unusual compared to most hash table, even when clearly pointed out in the code comments.

It is easier for me to believe you are using ChatGPT (or something similar) than to believe you understand the topic.

Yes I am newbie, don't have any understanding. Thanks, I will not write anything now. I have deleted that story. Peace.