Hacker News new | ask | show | jobs
by infogulch 2169 days ago

    1 baz
    1 None (?)
    2 foo
    2 bar
    2 baz
If __hash__(x) is recalculated on every indexing (?).
3 comments

I think that's right.

On a side node, if I recall my old python knowledge right the <1 - baz> and <1 - None> cases are only possible because -- well for one it uses only one object, so the fast check "key is elem" is always true -- but also the object utilizes the default __eq__ function, which tests on object identity. And since the same instance of C is used for all object invocations, this is always true.

The behavior would be different using:

        x = C()
        y = C()

        d[x] = 'foo'
        d[y] = 'bar'
        d[x] = 'bar'
        d[y] = 'qux'
The hash might collide, but the equality will cause the open addressing to skip the element.

I dislike recalling debugging a hash function that wasn't entirely deterministic in some edge cases.

Essentially correct! Instead of `1 None` it'd fail with a `KeyError`. You can get a probability distribution too:

    1 baz       1/8
    KeyError    1/8
    2 foo       1/8
    2 bar       1/4
    2 baz       3/8
Indeed, I got

  Counter({
      (1, 'baz'): 12525,
      (1, 'KeyError'): 12552,
      (2, 'foo'): 12255,
      (2, 'bar'): 25282,
      (2, 'baz'): 37386
  })
from running

  import random
  
  class C:
      def __hash__(self): 
          # return 0 or 1, with equal probability
          return random.randint(0, 1)
  
  def output():
      x = C()
      d = {}
      d[x] = "foo"
      d[x] = "bar"
      d[x] = "baz"
      try:
          return len(d), d[x]
      except Exception as e:
          return len(d), type(e).__name__
  
  from collections import Counter
  Counter(output() for i in range(100000))
I think it would be

  KeyError
instead of

  1 None (?)
but otherwise looks like the right answer to me too.