Users of JSON, probably the most common data interchange format on the planet, frequently have implicit requirements about key ordering. It is highly convenient to be able to parse a JSON string into a native Python data structure, add a field, emit it back, and preserve the ordering.
From what I recall of the JSON standard itself, there's no guarantee about key ordering being significant. If you're diffing serialized output to compare two JSON objects you need to be serializing it in a consistent format, otherwise even whitespace is going to throw you off.
If a human is inspecting serialized JSON using pen and paper, the human is presumably clever enough to match up key for key regardless of ordering.
If the human is using a computer to compare two JSON payloads (as the use of a diffing algorithm suggests), the human and computer should be clever enough as a team to realize that they could just deserialize and reserialize each JSON payload such that the keys were lexicographically sorted and the data was pretty-printed in the exact same way before running it through the diffing algorithm. `jq -Sc` would do the trick.
I tracked down a bug once where someone was hashing state that happened to include Python dicts serialized to JSON, which caused different hash values for the same entity. I know, I know, insert tears of blood emoji, and any engineer worth his salt wouldn't be so sloppy. But you don't always get to choose to work only with top talent that understands why you should design things elegantly as a baseline[1]. In these cases, removing implicit gotchas like "the same code has the same behavior" (ie iteration over the same dict) is valuable.
[1] Well you _can_ choose to do so, and I recently have, but it's not without its costs.
I can't speak for all "other" languages but Ruby made the change years ago from 1.8 to 1.9. Ruby calls them hashes but they're standard maps. Obviously Ruby and Python are very similar but I think the point stands.
Convert a deck of cards ([]Card?) to a map (map[Card]bool?) and back just to shuffle? That's unlikely to be faster or more idiomatic than a straightforward implementation of the Fisher-Yates shuffle[1]. Try writing the code to do it both ways and compare.
I think "idiomatic" would be using rand.Perm, which implements the Fisher%E2%80%93Yates shuffle. But aside from whether it's idiomatic, converting to a map isn't random enough.
With Golang 1.11, I get orderings like this. S is spades, H is hearts, D is diamonds, and C is clubs, because HN eats the Unicode characters I was actually using.
S9 SJ SK HQ D9 S4 S5 S10 HJ S8 H6 DA D2
D4 DQ C6 C8 CJ H3 H9 DK C3 CQ SA S2 S3
S6 SQ H4 H10 D8 D10 C4 CK S7 H7 D5 D6 D7
C7 H2 HK D3 DJ C2 C5 C9 HA H5 H8 CA C10
On average you would expect about one card in a shuffled deck to be
(cyclically) followed in the deck by its succeeding card, and on
about one out of 50 shuffles, that card would be followed by its
successor. Here we have S4 S5, DA D2, SA S2 S3, and D5 D6 D7.
In particular it seems like there ought to be a less verbose way to express the equivalent of the Python list({card: True for card in deck}) in Golang.
IIRC it used to just start iteration at a pseudorandom index and then iterate normally. I looked at it couple years ago, don't know if they changed it.
It seems to do that, but I think it also tweaks the hash function for each newly created map, because in the code linked from https://news.ycombinator.com/item?id=22278753 I'm not getting rotations of the same iteration order when I generate two maps. There doesn't seem to be any randomness in mapiternext() itself.
You could imagine that the hash function itself might do an adequate job of randomizing the order of the cards, though, especially if salted with a per-map salt. SipHash, for example, would probably not have any detectable biases in the distribution of the permutations thus produced. But whatever hash function Golang is using for my structs has an easily visible bias, as described in that comment.
What does it say? That Python is willing to spend more computation time for simplicity, without waiting for the programmer to ask for it? That Python will make semantic changes in minor version bumps of the language?
The ordering property is a side effect of the new and more cpu/memory efficient data structure. It would be very surprising to say no to a 2x performance jump to avoid the (sometimes useful) ordering.
Java added LinkedHashMap a while ago. I tend to default to it except for small temporary use cases where order absolutely won't matter or have an outside impact...
LinkedHashMap is not Java's "standard maps" though. If you tend to use it by default it's really a personal idiosyncrasy, especially as LLHM tend to be larger and slower than regular hashmaps due to having to maintain a doubly linked list.
Python has had one such in the standard library for a decade or so.
LHM is not slower for iteration (it's faster actually).
LHM indeed pays 2 references per node but they are well worth as it has deterministic ordering/iteration and I have witnessed numerous cases with HashMap that show up in production only due to iteration order (esp after rehashing). The code is broken but the cases did not reproduce during testing...
Now if the two extra references are an issue, consider that HashMap is quite space inefficient with having a dedicated node per each entry - that's the main price. The (simple) node memory footprint is 36bytes on heaps less than 32GB, i.e. compact pointers. The extra references add another 8 bytes for having an insert or access order.
If the goal is getting a compact low memory footprint, HashMap is not fit for purpose. Overall it's the jack of all trades and even got reworked (java8) to support tri-based collision resolution for keys that implement Comparable.
Couple years back I wrote CompactHashMap (under CC0) that has an average cost of 10bytes per entry and it's extremely compact for small sizes with having only 2 fields on its right own, so even small/empty maps are tiny. In microbenchmarks (same used in openjdk) it handily (2x) beats java.util.HashMap on "Traverse key or value", get/put have similar performance, and "Search Absent" is worse.
The point is: LHM should be the go-to hashmap for java as being node based hashmap is justified (unlike java.util.HashMap)
Yep. Also, LinkedHashMap maintains ordering based on insertion order. So, iff you insert your data ordered how you want it, it's behaving as an ordered map. If you want true, self-reordering map, what you want is a TreeMap. Beauty with that one is that you have full control over defining the custom ordering function, because we're not always indexing a map by primitives or autoboxed type. I try to use it sparingly though as you're paying log(n) on pretty much all map operations.
While we're here, another tidbit that's often overlooked in the Java collections:
If you reallly care about iteration performance, your data is without nulls, your data is already ordered how you like it or you don't care about ordering, your qty items >= 10, and you don't need random access, then ArrayDeque is gonna be your horse because of how much better it co-locates its contents in memory and how much less overhead is required to maintain it during each operation compared to all the other List implementations, including ArrayList and LinkedList.
Interestingly enough here it was kinda but kinda not the other way around: historically PHP used a closed-addressing hash map and threaded a doubly linked list through it to maintain the insertion order.
But the dict TFA talks about doesn't use a doubly linked list, or closed addressing, its ordering is a side-effect of its implementation but not originally a core goal (memory saving and iteration speed were). It'd probably been proposed by others before but it came to wider attention after Raymond Hettinger (a core dev) proposed it a few years back[0]. PHP actually released it first[1], closely followed by pypy[2]. CPython only got around to it some time later[3]
https://www.stefanjudis.com/today-i-learned/property-order-i...