|
I want to make a distinction between three kinds of data structures. 1. Implementation-specific thing that has no formal equivalent. Structs of structs of structs, with business logic intermingled with the structure. Plenty of examples of this. I’m not suggesting anyone put these in the stdlib; that’d be silly. 2. ADT with no formalism behind it, that implements a particular set of behaviours “for best performance”, with no guarantee about the implementation or the time/space complexity, and a set of exposed operations that only state what they do, not how they do it, such that you can’t guess what algorithm a given operation is going to use. Example of #2: Objective-C’s NSDictionary. It only guarantees that it “acts like” a dictionary; it doesn’t guarantee that it’s O(1), because being O(1) with a high constant can actually be worse for performance in some cases. So instead, it makes no guarantees, and tries to optimize for performance by actually switching between different implementations as the data held reaches different size thresholds. I’m not suggesting anyone put these in the stdlib either, if they currently live in a specific application, because this widens the scope of their stability requirements. Right now only the compiler maintainers can optimize this ADT into doing something entirely different, and then ensure that the compiler (the only consumer) is happy with the result; if the ADT were available in the stdlib, they wouldn’t be able to test all consumers, so they’d have to be much more conservative with their changes, lest they break an edge-case. (Changing how sorting an array ADT works, for example, can break code that assumed that sorting either nearly-sorted or entirely-randomized sets was optimal.) And, of course, sometimes in a compiler the Control Flow Graph ADT will be of this type. If it is, fine, don’t export it. But usually—because of how compiler maintainers interact heavily with compiler theorists—it’s instead the third kind: 3. Formal data structures, where the name comes from a paper which defines the expected behaviour and a base-level implementation; where either that implementation, or another paper’s pure improvement on said implementation (without changing any of the semantics) is what you can expect to find. As well, the names of all algorithms implemented against the ADT are also well-defined in papers, such that you can know what algorithm you’re using by the name of the function. (Usually, in these cases, you’ll have multiple algorithms that do the same thing differently living as sibling functions in the same module.) Examples of #3: the geometric primitives and index search-tree types that PostGIS exposes to C-level code. Any implementation of vector clocks. Or, my favourite: regular expressions (i.e. deterministic and nondeterministic finite automata.) Also, example to make the naming point: for a particular use-case, the use of a segment tree might be an optimization over use of an interval tree. But in code that uses these types of formal data structures, you’d never find an ADT named “IndexTree” that could happen to be either; instead, you’d expect separate SegmentTree and IntervalTree implementations, and then maybe a strategy-pattern ADT wrapping one or the other. But the concrete implementation of the formalism is very likely to exist, because people tend to just sit down and implement these things by translating pseudocode in papers into modules; and then tend to want things to stay that way, because keeping a 1:1 mapping from the module to the paper, and then linking the paper in module-doc, is one of the only good ways to get new maintainers to understand what the heck has been implemented here. It’s #3 that I would suggest is a good candidate for stdlib inclusion. These data structures don’t change in a way that breaks their guarantees, because their existence is predicated on a particular formalism, and nobody cares about improvements to the performance of a formalism that break the formalism. (Fun analogy: nobody would care about improvements to speed running a video game that required changing the code of the game. You’re trying to optimize that game, not some other one!) |
You started arguing that the Rust compiler should expose its internal data-structures, which it does, and somehow ended arguing that the Rust standard library should expose spatial data-structures for geometry processing.
I have no idea how you got from one to the other, but writing a huge wall of text full of disorganized thoughts shows very little appreciation for the time of those participating in this discussion thread, so I won't interact with you anymore.