Hacker News new | ask | show | jobs
by ganduG 3570 days ago
Because this is an implementation detail. The language spec doesn't enforce this.

The real reason they did this was because of the performance gains from the approach - the ordering is just a nice side effect. Its an idea originally from PyPy afaik.

`OrderedDict` is now just a thin wrapper around `dict`.

i.e. if you want your code to be portable among different Python implementations then you should still use `OrderedDict`.

3 comments

> Its an idea originally from PyPy afaik.

No, the idea is from Raymond Hettinger on the Python-Dev ML back in 2012: https://mail.python.org/pipermail/python-dev/2012-December/1...

PyPy were the first to bother actually implementing it.

Ah okay, good to know. I knew PyPy did this so I assumed it came from there.
Do you have a link to info about why it's faster? My initial reaction was that the extra indirection would slow it down, but of course that's based on nothing but general knowledge and two seconds' thought.
> My initial reaction was that the extra indirection would slow it down, but of course that's based on nothing but general knowledge and two seconds' thought.

There are already indirections in the current implementation.

The new dict has much better memory locality: the "actual data" (24 bytes on 64b platforms) is stored in a dense array instead of the former sparse array, and the sparse array only stores indices so it can be made much more compact (especially as it can switch the index size depending on the size of the compact array, for small dicts it'll store byte indices). So not only is the new dict smaller, the way the data is laid out is better fetchable.

The original point[0] was actually iteration speed: since the data is stored in a dense array, that can be iterated directly and efficiencly instead of having to iterate a sparse array and skip the random empty entries (leading to awful branch predictability, whether a given entry of the sparse array is empty is essentially random if the hash is any good).

[0] https://mail.python.org/pipermail/python-dev/2012-December/1...

Great, thank you. That's a cool technique and I'll have to remember it.
Thanks for this - I was also wondering about how this would effect cache usage.
> `OrderedDict` is now just a thin wrapper around `dict`.

It is not. odictobject.c has not been touched at all yet. Also, OrderedDict defines several methods that ordinary dict does not (and unlikely ever will).