Here's a cute follow-up: what are all possible outputs of the following program?
import random
class C:
def __hash__(self):
# return 0 or 1, with equal probability
return random.randint(0, 1)
x = C()
d = {}
d[x] = "foo"
d[x] = "bar"
d[x] = "baz"
print(len(d), d[x])
On a side node, if I recall my old python knowledge right the <1 - baz> and <1 - None> cases are only possible because -- well for one it uses only one object, so the fast check "key is elem" is always true -- but also the object utilizes the default __eq__ function, which tests on object identity. And since the same instance of C is used for all object invocations, this is always true.
The behavior would be different using:
x = C()
y = C()
d[x] = 'foo'
d[y] = 'bar'
d[x] = 'bar'
d[y] = 'qux'
The hash might collide, but the equality will cause the open addressing to skip the element.
I dislike recalling debugging a hash function that wasn't entirely deterministic in some edge cases.
import random
class C:
def __hash__(self):
# return 0 or 1, with equal probability
return random.randint(0, 1)
def output():
x = C()
d = {}
d[x] = "foo"
d[x] = "bar"
d[x] = "baz"
try:
return len(d), d[x]
except Exception as e:
return len(d), type(e).__name__
from collections import Counter
Counter(output() for i in range(100000))