Hacker News new | ask | show | jobs
by Noseshine 3506 days ago

  > What would you expect to happen if the object is modified after being inserted into the dictionary?
I would expect it to continue to use the object. Why would it matter if it was mutated or not?

You can try it right in your browser with Javascript and the Map() object type. You can mutate the object all you want, as long as it's the same object(-reference) you access the same value in the Map object. Just as I would expect. "Mutable" doesn't change which object you have, only what it looks like inside.

2 comments

This is where Pythonistas rely on the idea that "we're all consenting adults here". It's very useful to let objects assert that they have an arbitrary, immutable hash value. I've used that ability in Python and it solved problems that would have been very challenging otherwise. Of course, that ability also leads to undocumented behavior if hash values change. Python coders expect dicts to behave in undocumented ways when object hashes change, but it would be a bug if that led to a segfault or a C assertion failure.
I'm genuinely curious how/why this is a problem for Python yet I've never heard of it being a problem for C#. In C# you can override Object.GetHashCode(), and there's no way in C# to even express the constraint/desire "the result of GetHashCode doesn't change (especially after insertion into a container)". Yet I've never heard of anyone in C# having a problem using mutable objects in sets or dictionaries?
To be honest, most Python coders probably never get deep enough in the language to start overriding hash values, and those who do probably understand the implications before they get that deep.
This fundamentally reduces the usefulness of hash representations though. Instead of thinking of dict keys, think sets.

It's also worth pointing out that user-defined objects operate the way you seem to desire (e.g. default to using the id() property of the object to derive the hash). So in my mind, this strikes the perfect balance?

StdLib types that are immutable bake in more comparative hash properties while making it trivial for a user to wrap them and override this richer behavior with default hash based on id().

Consider:

```

>>> class myList(object):

... def __init__(self):

... self.inner_list = list()

...

>>> x = myList()

>>> x

<myList object at 0x10bdabf10>

>>> x.inner_list

[]

>>> y = {x: 'hi'}

>>> y

{<myList object at 0x10bdabf10>: 'hi'}

>>> x.inner_list.append(1)

>>> x.inner_list

[1]

>>> y

{<myList object at 0x10bdabf10>: 'hi'}

>>> for k in y.keys():

... print k.inner_list

...

[1]

```

  > This fundamentally reduces the usefulness of hash representations though.
If you want something immutable just put something immutable in?

  > Instead of thinking of dict keys, think sets.
It's the same that I already wrote for sets, and also the same I just wrote: If you want something immutable just put something immutable in.

By the way, I'm not talking about any particular implementation.

I think there is a disconnect here, I showed you how you could place a mutable object as a dict key in Python. One which uses the object itself (id() property) for the hash.

> If you want something immutable just put something immutable in.

My point is what do you expect if you do this:

>>> x = (1, 2)

>>> y = (1, 2)

>>> set([x, y])

I would argue that you would expect:

>>> set([(1,2)])

Since x is equal to y. But if you base the hash by default on id, you would get:

>>> set([(1, 2), (1, 2)]) # set([x, y])

Since x and y are different instances (have different id()) properties. I would argue that it's more useful/predictable for built-ins to hash based on content rather than id().

However, you can still hash based on id() but simply creating your own object (as I demonstrated). So you get sensible defaults for built-ins but also the option of hashing mutable objects.

We were originally talking about a dictionary key, and the purpose of an object as key in a dictionary is AFAICS to associate a value with that concrete object. Whenever I use those structures I have to solve an engineering problem, not a math problem. I would claim that's the majority of use cases, a claim that seems to be supported by how it is commonly implemented (see Java below and Javascript for two of the most common languages).

Anyway, it's configurable for example in Java (http://stackoverflow.com/questions/9440380/using-an-instance...) - default is to use the reference but you can override object methods to change that. Similarly for C#(http://stackoverflow.com/questions/634826/using-an-object-as... or http://stackoverflow.com/questions/8952003/how-does-hashset-...).

Ah I see the disconnect, I think the underlying piece I glossed over was the fact that dict objects and set objects rely on the same hashable property of objects. This is in part due to the "one way and only one way" rule of thumb of Python's language design.

For me, talking about how dicts handle hashing keys and sets hashing members are equivalent in the context of the Python language.

Given that context I was saying changing the "hashable" nature of default objects would be counter-intuitive as the new behavior would be in contrast to the expected behavior of sets and/or immutable equal types in dict keys.