Hacker News new | ask | show | jobs
by sametmax 3510 days ago
I'm not particularly interested in Elm, but basically this post is about one problem in the language. One. This doesn't make the language a pile of garbage as the author seems to let you think.

I'm a Python programmer. I fully expect dict keys to be able to be arbitrary objects. But after coding a lot in JS, I realized I could live without it. It's nice, but it's not a show stopper if I don't have it.

Same goes here. Yes, typing is not as good as you wish it was, but Elm is a young language, give it time to evolve. In the meantime, what about the innovative things in it ?

6 comments

> I'm a Python programmer. I fully expect dict keys to be able to be arbitrary objects.

I'm a Python programmer too, and I also fully expect dict keys to be able to be arbitary objects, and I get really frustrated with the fact that they can't be arbitary objects. They have to be hashable objects, and the hash function refuses to hash certain objects that it has decided aren't allowed.

Well, you cannot use a mutable object as a key without some help. What would you expect to happen if the object is modified after being inserted into the dictionary?

If you have a list or dictionary you want to use a key, you can convert it into a tuple which would make it immutable.

If you don't care about performance for larger collections you could just use a list instead of a dictionary, which does not require hashing. Changes to the original mutable object will also be reflected in your list.

  > What would you expect to happen if the object is modified after being inserted into the dictionary?
I would expect it to continue to use the object. Why would it matter if it was mutated or not?

You can try it right in your browser with Javascript and the Map() object type. You can mutate the object all you want, as long as it's the same object(-reference) you access the same value in the Map object. Just as I would expect. "Mutable" doesn't change which object you have, only what it looks like inside.

This is where Pythonistas rely on the idea that "we're all consenting adults here". It's very useful to let objects assert that they have an arbitrary, immutable hash value. I've used that ability in Python and it solved problems that would have been very challenging otherwise. Of course, that ability also leads to undocumented behavior if hash values change. Python coders expect dicts to behave in undocumented ways when object hashes change, but it would be a bug if that led to a segfault or a C assertion failure.
I'm genuinely curious how/why this is a problem for Python yet I've never heard of it being a problem for C#. In C# you can override Object.GetHashCode(), and there's no way in C# to even express the constraint/desire "the result of GetHashCode doesn't change (especially after insertion into a container)". Yet I've never heard of anyone in C# having a problem using mutable objects in sets or dictionaries?
To be honest, most Python coders probably never get deep enough in the language to start overriding hash values, and those who do probably understand the implications before they get that deep.
This fundamentally reduces the usefulness of hash representations though. Instead of thinking of dict keys, think sets.

It's also worth pointing out that user-defined objects operate the way you seem to desire (e.g. default to using the id() property of the object to derive the hash). So in my mind, this strikes the perfect balance?

StdLib types that are immutable bake in more comparative hash properties while making it trivial for a user to wrap them and override this richer behavior with default hash based on id().

Consider:

```

>>> class myList(object):

... def __init__(self):

... self.inner_list = list()

...

>>> x = myList()

>>> x

<myList object at 0x10bdabf10>

>>> x.inner_list

[]

>>> y = {x: 'hi'}

>>> y

{<myList object at 0x10bdabf10>: 'hi'}

>>> x.inner_list.append(1)

>>> x.inner_list

[1]

>>> y

{<myList object at 0x10bdabf10>: 'hi'}

>>> for k in y.keys():

... print k.inner_list

...

[1]

```

  > This fundamentally reduces the usefulness of hash representations though.
If you want something immutable just put something immutable in?

  > Instead of thinking of dict keys, think sets.
It's the same that I already wrote for sets, and also the same I just wrote: If you want something immutable just put something immutable in.

By the way, I'm not talking about any particular implementation.

I think there is a disconnect here, I showed you how you could place a mutable object as a dict key in Python. One which uses the object itself (id() property) for the hash.

> If you want something immutable just put something immutable in.

My point is what do you expect if you do this:

>>> x = (1, 2)

>>> y = (1, 2)

>>> set([x, y])

I would argue that you would expect:

>>> set([(1,2)])

Since x is equal to y. But if you base the hash by default on id, you would get:

>>> set([(1, 2), (1, 2)]) # set([x, y])

Since x and y are different instances (have different id()) properties. I would argue that it's more useful/predictable for built-ins to hash based on content rather than id().

However, you can still hash based on id() but simply creating your own object (as I demonstrated). So you get sensible defaults for built-ins but also the option of hashing mutable objects.

> What would you expect to happen if the object is modified after being inserted into the dictionary?

This question is sometimes not important: When the mutable object is not mutated.

> This question is sometimes not important: When the mutable object is not mutated.

Which you can't guarantee when the object is mutable, so for consistency's sake it's left out.

Though you can always build your own Dict subclass and implement __hash__ on it and be angry with yourself when it fails. :)

I was going to reply to him, but you beat me to it. But since you posed a different question regarding the term hashable, I thought I would link some helpful starting docs (for you and/or others):

https://docs.python.org/3/glossary.html#term-hashable

http://stackoverflow.com/questions/4348232/python-dictionary...

Come on, how many time in your entire life did you need this ?

For set, it's a bit more annoying. But for dicts. twice in 10 years maybe ?

After a few years of Clojure(Script), where immutability means you can use almost anything as keys, I have found many cases where my python code would have been simpler and easier if I could use arbitrary objects (and therefore data structures) as keys.

I mean, its not a huge deal, but it adds up.

> I have found many cases where my python code would have been simpler and easier if I could use arbitrary objects as keys

You can use pretty much anything as a dictionary key

That's really not true[1]. You can only use hashable types, which excludes dicts, lists and sets.

For example (tested on both python 2.7 and 3.4):

    >>> {{'a': 1}: 2}
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'dict'
    >>> {[1,2,3]: 4}
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'list'
    >>> {{1,2,3}: 4}
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unhashable type: 'set'
Objects inheriting from object are hashable by default, but the hash is based on the objects instance such that each instance will return a unique value for that instance:

    >>> class A(object):
    ...     def __init__(self):
    ...         self.x = 1
    ...
    >>> a = A()
    >>> b = A()
    >>> d = {a: 1, b: 2}
    >>> a.x = 10
    >>> b.x = 10
    >>> d[a]
    1
    >>> d[b]
    2
While this makes sense for the instance stored in the dict (because mutability would otherwise mean that they key changes values), I don't think this is particularly useful. I rarely look up dicts by instance, but rather by value. That is, I would construct another object with the same attributes and look up by that.

You can, of course, implement your own __hash__ to make it work, but you have to do it manually for any object you want it and most third-party objects won't have implemented it so you'd have to monkey patch them.

Contrast that with Clojure, where all built-in data structures (maps, sets, lists, dicts etc) work as keys out of the box.

[1] I guess its true in the sense that you can implement __hash__ to make it work. But its not particularly easy or idiomatic.

It has been quite common when wanting to use a dictionary as a cache or counter such as a function cache or to count how many times a function is called with certain arguments. If the values of the arguments aren't hashable it doesn't work.
As an aside, what happens in Python when you try to hash objects that contain circular references?
Not all problems are equal. Most problems can be compensated in some way or another, using abstractions. Problems that limits abstraction are grave, for it puts a glass ceiling that can't be got around.

A language designer may limit the power of abstraction intentionally; maybe the abstraction taxes performance too much; or maybe too much power scares the target users. That's a plausible choice, but that also turns some users away.

>There is no "map" function, but there are "List.map", "Dict.map", "Array.map" among others, and none of them are related to one another. This is not only annoying, but also breaks the abstract data-type guarantee: our call-sites are now tightly coupled to our implementation details. If I decide to change from a "List a" to an "Array a" for efficiency reasons, my whole codebase needs to know about it.

I can accept this in old languages. I can accept this in embedded languages. I cannot accept this in a language that is supposed to be about programmer productivity.

Basically this patient has one bullet hole in their heart. One. This doesn't make the patient dead as the author seems to let you think.

I agree; to me it says equally as much about the expectations of the author of the article. In C++ you can define operator < and in C# you can define IComparable. Therefore I was astonished to discover the datastructures in the dynamic language Io can't be extended to use custom comparators. It's just an implementation limitation: Io's datastructures are all written in C and so there's a fixed set of types they use for keys. It is what it is.

On the other hand C++ is only just about to get its first version of Concepts. (Haskell programmers in this thread: imagine if you waited 10+ years to get typeclasses.)

My point being that most languages are going to be missing something you wish they had.

JavaScript maps can have object keys, you're just using objects as dictionaries. Objects in python don't typically have non-string keys.

```js var m = new Map(); var o = {}; // some object m.set(o, 5); m.get(o); // 5 ```

Map are not supported by most IE version, and even IE 11 doesn't support half of the API. The android browser doesn't support it at all either. Bottom line, you couldn't not have used Map during the last past years, and used objects for dict.
Not too much people write plain JS anymore, and it is not that hard to build transpiling pipeline. Sometimes you have a small project and you don't want to deal with all this nonsense, but then you are writing code in one file and it is very small jquery style enhancements.

So, if you need objects as keys in dictionaries, then there is a big chance that you can use transpiling already (modules, other ES6 features).

Please, don't take it as an offence, just a suggestion – I really don't see a problem here.

> ```js var m = new Map(); var o = {}; // some object m.set(o, 5); m.get(o); // 5 ```

That guy who thought markdown was supported by hacker news. LOL ! I've been through that too...

One problem that makes it sound like a nightmare to work with.